添加链接
link之家
链接快照平台
  • 输入网页链接,自动生成快照
  • 标签化管理网页链接

DatabaseInfo

Contains database specific information needed to process SQL statement parse result.

SQLParser

Interface for openlineage-sql.

class airflow.providers.openlineage.sqlparser. GetTableSchemasParams [source]

Bases: airflow.typing_compat.TypedDict

get_table_schemas params.

normalize_name : Callable [ [ str ] , str ] [source] class airflow.providers.openlineage.sqlparser. DatabaseInfo [source]

Contains database specific information needed to process SQL statement parse result.

Parameters
  • scheme – Scheme part of URI in OpenLineage namespace.

  • authority – Authority part of URI in OpenLineage namespace. For most cases it should return {host}:{port} part of Airflow connection. See: https://github.com/OpenLineage/OpenLineage/blob/main/spec/Naming.md

  • database – Takes precedence over parsed database name.

  • information_schema_columns – List of columns names from information schema table.

  • information_schema_table_name – Information schema table name.

  • use_flat_cross_db_query – Specifies if single information schema table should be used for cross-database queries (e.g. for Redshift).

  • is_information_schema_cross_db – Specifies if information schema contains cross-database data.

  • is_uppercase_names – Specifies if database accepts only uppercase names (e.g. Snowflake).

  • normalize_name_method – Method to normalize database, schema and table names. Defaults to name.lower() .

  • class airflow.providers.openlineage.sqlparser. SQLParser ( dialect = None , default_schema = None ) [source]

    Bases: airflow.utils.log.logging_mixin.LoggingMixin

    Interface for openlineage-sql.

    Parameters
  • dialect ( str | None ) – dialect specific to the database

  • default_schema ( str | None ) – schema applied to each table with no schema parsed

  • parse_table_schemas ( hook , inputs , outputs , database_info , namespace = DEFAULT_NAMESPACE , database = None , sqlalchemy_engine = None ) [source]

    Parse schemas for input and output tables.

    attach_column_lineage ( datasets , database , parse_result ) [source]

    Attaches column lineage facet to the list of datasets.

    Note that currently each dataset has the same column lineage information set. This would be a matter of change after OpenLineage SQL Parser improvements.

    generate_openlineage_metadata_from_sql ( sql , hook , database_info , database = None , sqlalchemy_engine = None , use_connection = True ) [source]

    Parse SQL statement(s) and generate OpenLineage metadata.

    Generated OpenLineage metadata contains:

  • input tables with schemas parsed

  • output tables with schemas parsed

  • run facets

  • job facets.

  • Parameters
  • sql ( list [ str ] | str ) – a SQL statement or list of SQL statement to be parsed

  • hook ( airflow.hooks.base.BaseHook ) – Airflow Hook used to connect to the database

  • database_info ( DatabaseInfo ) – database specific information

  • database ( str | None ) – when passed it takes precedence over parsed database name

  • sqlalchemy_engine ( sqlalchemy.engine.Engine | None ) – when passed, engine’s dialect is used to compile SQL queries

  • classmethod split_sql_string ( sql ) [source]

    Split SQL string into list of statements.

    Tries to use DbApiHook.split_sql_string if available. Otherwise, uses the same logic.

    create_information_schema_query ( tables , normalize_name , is_cross_db , information_schema_columns , information_schema_table , is_uppercase_names , use_flat_cross_db_query , database = None , sqlalchemy_engine = None ) [source]

    Create SELECT statement to query information schema table.

  • DEFAULT_INFORMATION_SCHEMA_COLUMNS
  • DEFAULT_INFORMATION_SCHEMA_TABLE_NAME
  • default_normalize_name_method()
  • GetTableSchemasParams
  •