Connect to a Query Engine
Prerequisites
- Have the Querybook repository cloned. See Quick Setup.
- Have a PostgreSQL database ready to connect. It could be either on your localhost or on a remote server.
General Process
- Create a query engine for query execution.
- Add the query engine to an environment. Create one first if needed.
- [Optional but highly recommended] Create a new metastore to associate with the query engine.
If you dont have an idea of above concepts of query engine, environment and metastore, please refer to here
Step by Step
Here we'll guide you through the process of adding a query engine for PostgreSQL as an example.
- Create a
local.txt
file under therequirements/
folder in the project's root directory.
touch requirements/local.txt
- Check the engine list and find the package it depends on.
- If the required package is not included by default, add it to the
local.txt
file. ForPostgreSQL
, no additional package is needed. Here is an example forAmazon Redshift
:
echo -e "sqlalchemy-redshift\nredshift_connector" > requirements/local.txt
- Start the container:
make
-
Sign up as a new user and use the demo setup. The first signed up user will be added as the admin.
-
Open the admin tool http://localhost:10001/admin
-
Click
Query Engine
to add a new query engine- Provide a name for the query engine.
- Select
Postgresql
as the language. - Select
sqlalchemy
as the executor. - Input the connection string, which should look like
Please refer to the SqlAlchemy documentation for the connection string format.
postgresql://<username>:<password>@<server-host>:<port>/<database>
- Select
SelectOneChecker
as the status checker
About localhostIf Querybook and PostgresSQL are both running on the same machine, you'll need some extra change.
Mac
Please use
host.docker.internal
instead oflocalhost
as the server address. e.g.postgresql://<username>:<password>@host.docker.internal:5432/<database>
Linux
Before step 4
make
-
update
docker-compose.yml
to addnetwork_mode=host
for below services- web
- worker
- scheduler
-
update
containers/bundled_querybook_config.yaml
to uselocalhost
instead of service names
DATABASE_CONN: mysql+pymysql://test:passw0rd@localhost:3306/querybook2?charset=utf8mb4
REDIS_URL: redis://localhost:6379/0
ELASTICSEARCH_HOST: localhost:9200Then keep using
localhost
as the server host in the connection string -
Click
Test Connection
to see if it can connect to the database correctly. If it fails, go back and check the settings above and ensure that the database server is ready for connection. You can use command-line tools likepsql
to try to connect with the same connection settings. -
Click
Save
to create the engine. -
Go to the
Environment
tab and selectdemo_environment
. You can also create a new environment if you like. -
For
Query Engines
, selectpostgresql
from the dropdown list, and clickAdd Query Engine
. -
Open http://localhost:10001/demo_environment/adhoc/. Switch to the new environment if you created a new one in step 11.
-
Try to write a test query, select
postgresql
, and run it.
That's it 🎉. Keep reading if you'd like to know how to add a metastore.
- Open http://localhost:10001/admin/metastore/
- Create a new metastore.
- Provide a name for the metastore.
- Select
SqlAlchemyMetastoreLoader
as the loader. - Input the same connection string as the query engine.
Connection String
For PostgreSQL, the metastore is the same as the database, so we're using the same connection string for both the metastore and query engine. However, this may not be the case for other engines, such as Hive Metastore + Presto.
- Click
Create
to create the metastore. - On the same page, you will see a section called
Update Schedule
. Click the buttonCreate Task
. This scheduled task is used for syncing the metadata from the metastore to Querybook periodically. - Click
Run Task
and wait until it completes. - Go to the
Query Engine
tab and select the new query enginepostgresql
. - Select
postgres_metastore
from the dropdown list for theMetastore
field and clickSave
. - Go to the
Tables
tab on page http://localhost:10001/demo_environment/. Selectpostgres_metastore
from the dropdown list. You'll see the tables synced from the metastore. If you don't see any tables, go back to step 17 and check if the connection string is correct.
Congratulations! You have successfully set up Querybook with a query engine and a metastore. You can now start exploring and analyzing your data with ease.