Skip to main content

Data Lineage

Overview

Lineage plugin allows you to extend Querybook to fetch lineage information from a custom data lineage backend. Querybook can also update your custom lineage backend with lineage connections created through Querybook.

Adding a new Lineage backend

To add support for your custom data lineage backend, create a new python file in the plugins directory and implement the following functions:

  • create_table_lineage_from_metadata(job_metadata_id, query_language=None, session=None) -> List[int]: Used to update the custom lineage backend with lineage connections created from Querybook. Persists all lineage connections, created by the job with the input id, in the lineage backend.
  • add_table_lineage(table_id, parent_table_id, job_metadata_id, commit=True, session=None) -> TableLineage: Used to update the custom lineage backend with lineage connections created from Querybook. Persists the lineage connection, between the input table and the input parent table, in the lineage backend. Returns a TableLineage object corresponding the new lineage connection.
  • get_table_parent_lineages(table_id, session=None) -> List[TableLineage]: Used to get the lineage information from the custom lineage backend. Returns a list of TableLineage objects corresponding to parent tables of the input table.
  • get_table_child_lineages(table_id, session=None) -> List[TableLineage]: Used to get the lineage information from the custom lineage backend. Returns a list of TableLineage objects corresponding to child tables of the input table.

Put the plugin in the plugins directory (see Plugins Guide for more details) and set this env variable to tell Querybook to use your plugin for data lineage purposes:

DATA_LINEAGE_BACKEND: 'path.to.plugin' # path to the plugin relative to the plugins directory