Connecting To Streambased From Jupyter

Pre-requisites

You must have a running Streambased server before following this guide.

For details on how to run Streambased, see the documentation at https://streambased-io.github.io/streambased/index.html or run one of the demos at https://github.com/streambased-io/streambased-demos

Step 1: Install dependencies

This guide requires the following Python packages:

pip install jupyterlab
pip install jupysql
pip install sqlalchemy-trino
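
To confirm the installation before starting the notebook, a quick check along these lines may help. This is a minimal sketch: `missing_modules` is a hypothetical helper, and the import names in `REQUIRED` are assumptions about how the packages above expose themselves (they can differ between versions).

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names for the packages above; adjust if your versions differ.
REQUIRED = ["jupyterlab", "sql", "sqlalchemy_trino"]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("Missing:", ", ".join(missing) if missing else "none")
```

If anything is reported as missing, rerun the matching pip install command in the same environment that Jupyter uses.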

Step 2: Start the notebook

Launch JupyterLab from a terminal, then create a new notebook:

jupyter lab

Step 3: Load the SQL extension

From your notebook, load the SQL extension (provided by jupysql):

%load_ext sql

Step 4: Connect to Streambased

Next, connect to your Streambased server:

%sql trino://localhost:8080/kafka

Note: This assumes that your Streambased server is running locally on port 8080. Adjust the host and port to match your deployment.
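
For deployments that are not on localhost, it can be convenient to build the connection URL from its parts. The sketch below is illustrative only: `trino_url` is a hypothetical helper, the remote hostname and port are made up, and the last path segment is assumed to be the catalog name (`kafka` in this guide).

```python
def trino_url(host="localhost", port=8080, catalog="kafka"):
    """Build a connection URL of the form used in Step 4."""
    return f"trino://{host}:{port}/{catalog}"


# Defaults reproduce the URL from Step 4.
url = trino_url()

# Hypothetical remote deployment; replace with your own host and port.
remote_url = trino_url(host="streambased.example.internal", port=8443)
```

Here `url` is the same string passed to `%sql` in Step 4, so only the arguments need to change for a different deployment.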

Step 5: Run a query

Now we can run a query:

%sql SELECT * FROM kafka.streambased.customers WHERE name = 'TOM SCOTT'
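
If the name to filter on comes from a Python variable, the query string can be assembled in Python first. The following is a rough sketch rather than a recommended pattern: `sql_string_literal` and `customers_by_name` are hypothetical helpers, and the quoting simply doubles single quotes as standard SQL string literals require.

```python
def sql_string_literal(value):
    """Quote a Python string as a SQL string literal, doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def customers_by_name(name):
    """Build the Step 5 filter query for a given customer name."""
    return (
        "SELECT * FROM kafka.streambased.customers "
        f"WHERE name = {sql_string_literal(name)}"
    )


# Reproduces the query text used in Step 5.
query = customers_by_name("TOM SCOTT")
```

How a Python variable is passed into `%sql` depends on the jupysql version, so check the documentation for your installed release before relying on any particular syntax.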

Step 6: (Optional) Use pandas

To use Streambased with pandas, first return to Step 1 and make sure the pandas library is installed:

pip install pandas

Then convert your query result into a DataFrame and work with it as usual:

result = %sql SELECT * FROM kafka.streambased.customers WHERE name = 'TOM SCOTT'
df = result.DataFrame()
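
Once the result is a DataFrame, ordinary pandas operations apply. The sketch below uses a stand-in DataFrame so it runs without a server; the rows and the `city` column are invented for illustration, and only the `name` column appears in this guide's query.

```python
import pandas as pd

# Stand-in for `result.DataFrame()`; the rows and `city` column are made up.
df = pd.DataFrame(
    {
        "name": ["TOM SCOTT", "TOM SCOTT", "ANN LEE"],
        "city": ["Leeds", "York", "Bath"],
    }
)

# Filter rows in pandas instead of in SQL.
tom = df[df["name"] == "TOM SCOTT"]

# Count rows per customer name.
rows_per_name = df.groupby("name").size()
```

Filtering in the SQL query, as in Step 5, is usually preferable for large topics, since only the matching rows are returned to the notebook.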
