Infra Installation
By default, the Querybook image build includes only the core packages it needs to run.
To add more integrations such as query engines, metastores, or authentication methods, you need to install the corresponding Python packages yourself.
Querybook comes with a set of integrations that are enabled automatically once their Python packages are installed. It also supports plugins, so you can add integrations that are not yet included.
Out-of-box support
Here are all supported integrations included by default:
- Query Engines:
  - Firebird
  - Mysql
  - Sqlite
  - Postgresql
  - Oracle
  - Mssql
- Metastore:
  - MysqlMetastore
  - SqlalchemyMetastore
- Authentication:
  - Username/Password
- Exporter:
  - python exporter
- Result Store (Persisting query result):
  - db
  - file
- Elasticsearch:
  - custom hosted
Integrations that can be supported via package install
The public Docker image includes all of the custom dependencies listed below.
You can also include all of them by adding -r extra.txt to requirements/local.txt or by setting EXTRA_PIP_INSTALLS=extra.txt as a Docker build argument.
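As a minimal sketch of the two options above, assuming the commands run from the root of a Querybook checkout: the first appends the extra.txt reference to requirements/local.txt, and the second (left as a comment because it needs Docker) passes the build argument. The image tag querybook-extra and the build context "." are illustrative assumptions, not names taken from the Querybook docs.

```shell
# Option 1: reference extra.txt from requirements/local.txt.
# mkdir -p is a no-op in a real checkout, where requirements/ already exists.
mkdir -p requirements
printf '%s\n' '-r extra.txt' >> requirements/local.txt

# Option 2: pass the build argument instead (assumed tag and build context).
# docker build --build-arg EXTRA_PIP_INSTALLS=extra.txt -t querybook-extra .
```

Only one of the two options is needed for a given build.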
If you install the required packages, these integrations will be automatically supported:
- Query Engines:
  - BigQuery (via -r engine/bigquery.txt)
  - Druid (via -r engine/druid.txt)
  - Hive (via -r engine/hive.txt)
  - Presto (via -r engine/presto.txt)
  - Redshift (via -r engine/redshift.txt)
  - Snowflake (via -r engine/snowflake.txt)
  - Trino (via -r engine/trino.txt)
  - And any sqlalchemy supported engines
- Metastore:
  - Hive Metastore (via -r metastore/hms.txt)
  - Hive Metastore with Thrift (install Hive Metastore and Hive)
  - Glue (via -r metastore/glue.txt)
- Authentication:
  - Install -r auth/oauth.txt to use the following:
    - Azure oauth
    - Github oauth
    - Google oauth
    - Okta oauth
    - Generic oauth
  - LDAP (via -r auth/ldap.txt)
- Exporter:
  - Google Sheet Exporter (via -r exporter/gspread.txt)
- (Experimental) Table Upload:
  - Parquet (via -r exporter/parquet.txt)
- Result Store:
  - AWS S3 (via -r platform/aws.txt)
  - Google GCS (via -r platform/gcp.txt)
- Elasticsearch:
  - AWS hosted (via -r platform/aws.txt)
- Parsing (transpilation):
  - SQLGlot (supported by default)
How to install packages for integration
There are two ways to install additional Python packages in Querybook:
- Install via requirements/local.txt.
- Install by extending the prod image.
We will go over a simple example that installs Presto and OAuth for Querybook.
Install via requirements/local.txt
Create a local.txt file under the requirements/ folder in the Querybook project's root directory:
touch requirements/local.txt
Add the following lines to local.txt:
-r engine/presto.txt
-r auth/oauth.txt
Check out requirements/engine/presto.txt to see which Python packages are installed to enable Presto support.
Alternatively, you can list the packages directly and pin a different version based on your needs:
PyHive[presto]==0.6.3
requests-oauthlib==1.0.0
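Before building, it can help to confirm that every -r line in local.txt points at a file that exists, since pip resolves those paths relative to the file that contains them (here, the requirements/ folder). The following is a minimal sketch; check_refs is a hypothetical helper, not part of Querybook, and it only understands lines of the exact form "-r <path>".

```shell
# check_refs DIR: report every "-r <path>" line in DIR/local.txt whose target
# file is missing under DIR. Returns non-zero if anything is missing.
check_refs() {
    req_dir="$1"
    status=0
    # The "|| [ -n ... ]" part also handles a last line with no trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            "-r "*)
                ref="${line#-r }"
                if [ ! -f "$req_dir/$ref" ]; then
                    echo "missing: $ref"
                    status=1
                fi
                ;;
        esac
    done < "$req_dir/local.txt"
    return "$status"
}
```

From the project root it would be called as check_refs requirements. Lines that pin packages directly, such as PyHive[presto]==0.6.3, are ignored by this check.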
Now you can build the Docker image and publish it to Docker Hub or AWS ECR.
make dev_image # for dev
make prod_image # for prod
docker tag ...
docker push ...
Install by extending the prod image
This part is very similar to plugin installation. Follow the steps to create a custom repo. Next, add a requirements file to that repo. In this example, the file is called custom_deps.txt and sits at the plugins project root. Put the following in it:
-r engine/presto.txt
-r auth/oauth.txt
As mentioned in the previous example, you can also reference packages such as PyHive directly.
Add the following steps to the Dockerfile of the plugins project:
COPY custom_deps.txt /opt/querybook/requirements/custom_deps.txt
RUN pip install -r /opt/querybook/requirements/custom_deps.txt
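The steps above can be sketched as one shell helper that writes custom_deps.txt and appends the two Dockerfile steps only when they are not already present. This is an illustration under assumptions: add_custom_deps is a hypothetical function name, the plugins project directory is passed as an argument, and the OAuth path follows the auth/oauth.txt entry in the integrations list.

```shell
# add_custom_deps DIR: write DIR/custom_deps.txt and add the install steps to
# DIR/Dockerfile. Fails if the plugins project has no Dockerfile yet.
add_custom_deps() {
    project_dir="$1"
    dockerfile="$project_dir/Dockerfile"
    if [ ! -f "$dockerfile" ]; then
        echo "no Dockerfile in $project_dir" >&2
        return 1
    fi

    # Requirements file at the plugins project root.
    cat > "$project_dir/custom_deps.txt" <<'EOF'
-r engine/presto.txt
-r auth/oauth.txt
EOF

    # Append the COPY and RUN steps once; a second call leaves the file alone.
    if ! grep -q 'custom_deps.txt' "$dockerfile"; then
        cat >> "$dockerfile" <<'EOF'
COPY custom_deps.txt /opt/querybook/requirements/custom_deps.txt
RUN pip install -r /opt/querybook/requirements/custom_deps.txt
EOF
    fi
}
```

Copying the file into /opt/querybook/requirements/ matters because the relative -r paths inside it are resolved against that folder.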
Additional integrations
If you need to integrate with anything that is not listed above, please refer to the plugins guide to learn how to add them.