Holder

LineageHolder is an abstraction that holds the lineage result produced by LineageAnalyzer at different levels.

At the bottom, we have sqllineage.core.holders.SubQueryLineageHolder to hold lineage at subquery level. This is used internally by sqllineage.core.analyzer.LineageAnalyzer.

LineageAnalyzer generates sqllineage.core.holders.StatementLineageHolder as the lineage result at SQL statement level.

To assemble multiple sqllineage.core.holders.StatementLineageHolder into a DAG-based data structure that serves as the final output, we have sqllineage.core.holders.SQLLineageHolder.

SubQueryLineageHolder

class sqllineage.core.holders.SubQueryLineageHolder

SubQuery/Query Level Lineage Result.

SubQueryLineageHolder will hold attributes like read, write, cte.

Each of them is a Set[sqllineage.core.models.Table].

This is the most atomic representation of lineage result.

property write_columns: List[Column]

Return a list of columns that the write table contains. Columns are either added manually via add_write_column, if specified in DML, or added automatically via add_column_lineage after parsing from SELECT.

add_write_column(*tgt_cols: Column) -> None

In case of DML with columns specified, like:

INSERT INTO tab1 (col1, col2)
SELECT col3, col4

this method is called to make sure tab1 has columns col1 and col2 instead of col3 and col4.
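The behaviour described above can be illustrated with a minimal sketch. It does not use sqllineage itself: the functions resolve_write_columns and column_lineage are hypothetical helpers, and columns are modelled as plain strings rather than Column objects. The only assumption carried over from the text is that an explicit column list in the DML takes precedence over the names produced by the SELECT.

```python
from typing import List, Tuple


def resolve_write_columns(
    select_cols: List[str], insert_cols: Tuple[str, ...] = ()
) -> List[str]:
    """Return the column names the target table ends up with (toy model)."""
    # An explicit INSERT column list wins; otherwise fall back to the
    # output names of the SELECT.
    return list(insert_cols) if insert_cols else list(select_cols)


def column_lineage(
    select_cols: List[str], insert_cols: Tuple[str, ...] = ()
) -> List[Tuple[str, str]]:
    """Pair each source column with its target column by position."""
    targets = resolve_write_columns(select_cols, insert_cols)
    return list(zip(select_cols, targets))


# INSERT INTO tab1 (col1, col2) SELECT col3, col4
pairs = column_lineage(["col3", "col4"], ("col1", "col2"))
print(pairs)  # [('col3', 'col1'), ('col4', 'col2')]
```

Without the explicit column list, the same helper would pair col3 with col3 and col4 with col4, which is the case the method is meant to avoid.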

add_column_lineage(src: Column, tgt: Column) -> None

Link the source column to the target column.

get_alias_mapping_from_table_group(table_group: List[Path | Table | SubQuery]) -> Dict[str, Path | Table | SubQuery]

A table can be referred to by its alias, by its table name, or by database_name.table_name; this method creates the mapping from each of those names to the table. A SubQuery can only be referred to by its alias.
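As a rough illustration of such a mapping, the sketch below builds one with plain dataclasses. ToyTable, ToySubQuery and build_alias_mapping are hypothetical stand-ins, not sqllineage classes, and storing the alias as a field on the table is an assumption made for brevity; the library may track aliases differently.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass(frozen=True)
class ToyTable:
    schema: str
    name: str
    alias: Optional[str] = None


@dataclass(frozen=True)
class ToySubQuery:
    sql: str
    alias: str


Node = Union[ToyTable, ToySubQuery]


def build_alias_mapping(group: List[Node]) -> Dict[str, Node]:
    """Map every name a node can be referred to by onto the node itself."""
    mapping: Dict[str, Node] = {}
    for item in group:
        if item.alias:
            mapping[item.alias] = item
        if isinstance(item, ToyTable):
            # Tables are also reachable by bare and qualified name.
            mapping[item.name] = item
            mapping[f"{item.schema}.{item.name}"] = item
    return mapping


orders = ToyTable("sales", "orders", alias="o")
recent = ToySubQuery("SELECT * FROM sales.orders", alias="r")
mapping = build_alias_mapping([orders, recent])
print(sorted(mapping))  # ['o', 'orders', 'r', 'sales.orders']
```

The table is reachable under three keys, while the subquery is reachable only under its alias.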

StatementLineageHolder

class sqllineage.core.holders.StatementLineageHolder

Statement Level Lineage Result.

Based on SubQueryLineageHolder, StatementLineageHolder holds extra attributes like drop and rename.

For drop, it is a Set[sqllineage.core.models.Table].

For rename, it is a Set[Tuple[sqllineage.core.models.Table, sqllineage.core.models.Table]], with the first table being the original table before renaming and the second the table after renaming.

SQLLineageHolder

class sqllineage.core.holders.SQLLineageHolder(graph: DiGraph)

The combined lineage result, represented as a Directed Acyclic Graph.

Parameters:

graph – the Directed Acyclic Graph holding all the combined lineage result.

property table_lineage_graph: DiGraph

The table level DiGraph held by SQLLineageHolder

property column_lineage_graph: DiGraph

The column level DiGraph held by SQLLineageHolder

property source_tables: Set[Table]

A set of source sqllineage.core.models.Table.

property target_tables: Set[Table]

A set of target sqllineage.core.models.Table.

property intermediate_tables: Set[Table]

A set of intermediate sqllineage.core.models.Table.
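One plausible way to read the three properties above is in terms of edge direction in the table-level graph: sources only feed other tables, targets are only written to, and intermediates do both. The sketch below works on a plain list of (source, target) edges rather than a DiGraph; classify_tables and the table names are hypothetical, and the library may treat edge cases (self-references, isolated tables) differently.

```python
from typing import Iterable, Set, Tuple


def classify_tables(
    edges: Iterable[Tuple[str, str]]
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Split tables into (sources, targets, intermediates) by edge direction."""
    edge_list = list(edges)
    producers = {src for src, _ in edge_list}  # have an outgoing edge
    consumers = {tgt for _, tgt in edge_list}  # have an incoming edge
    sources = producers - consumers            # read from, never written
    targets = consumers - producers            # written, never read from
    intermediates = producers & consumers      # both written and read
    return sources, targets, intermediates


edges = [
    ("raw.events", "stg.events"),
    ("stg.events", "mart.daily"),
    ("raw.users", "mart.daily"),
]
sources, targets, intermediates = classify_tables(edges)
print(sorted(sources))        # ['raw.events', 'raw.users']
print(sorted(targets))        # ['mart.daily']
print(sorted(intermediates))  # ['stg.events']
```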

static of(metadata_provider, *args: StatementLineageHolder) -> SQLLineageHolder

Assemble multiple sqllineage.core.holders.StatementLineageHolder into a single sqllineage.core.holders.SQLLineageHolder.
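The idea of assembling statement-level results into one graph can be sketched without the library. ToyStatementLineage and combine are hypothetical names; the sketch keeps only read and write sets and deliberately ignores drop, rename, column-level lineage and the metadata_provider argument, all of which the real method has to account for.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple


@dataclass
class ToyStatementLineage:
    """Toy stand-in for a statement-level result: tables read and written."""
    read: Set[str] = field(default_factory=set)
    write: Set[str] = field(default_factory=set)


def combine(*statements: ToyStatementLineage) -> Set[Tuple[str, str]]:
    """Merge per-statement lineage into one set of (source, target) edges."""
    edges: Set[Tuple[str, str]] = set()
    for stmt in statements:
        # Every table read by a statement feeds every table it writes.
        for src in stmt.read:
            for tgt in stmt.write:
                edges.add((src, tgt))
    return edges


stmt1 = ToyStatementLineage(read={"raw.events"}, write={"stg.events"})
stmt2 = ToyStatementLineage(read={"stg.events"}, write={"mart.daily"})
combined_edges = combine(stmt1, stmt2)
print(sorted(combined_edges))
# [('raw.events', 'stg.events'), ('stg.events', 'mart.daily')]
```

Because stg.events is written by the first statement and read by the second, the combined edges chain the two statements into a single path.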