
Connecting to Streambased from Jupyter

A step-by-step guide to integrating Jupyter with Streambased, unlocking powerful capabilities for interactive data exploration and analysis on streaming data.
Pradeep Sekar


You must have a running Streambased server before following this guide.

For details on how to run Streambased, see the documentation at https://streambased-io.github.io/streambased/index.html or run one of the demos at https://github.com/streambased-io/streambased-demos
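
Before continuing, it can help to confirm that something is actually listening where you expect the server to be. The following is a minimal sketch, not part of Streambased itself: the function name server_reachable is hypothetical, and the defaults (localhost, port 8080) are assumptions taken from the connection string used in Step 4. It only checks that a TCP connection can be opened, not that the service behind it is healthy.

```python
import socket


def server_reachable(host: str = "localhost", port: int = 8080, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    The defaults are assumptions based on the Step 4 connection string;
    change them to match your own deployment.
    """
    try:
        # create_connection raises OSError (refused, timed out, unresolved host)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("reachable" if server_reachable() else "not reachable")
```

If this reports "not reachable", check that the server is running and that the host and port match your deployment before moving on.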

Step 1: Install dependencies

Connecting from Jupyter requires the following Python packages:

    pip install jupyterlab
    pip install jupysql
    pip install sqlalchemy-trino
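
To confirm that the installs succeeded, you can look up each package's installed version from Python. This is a small sketch using the standard library's importlib.metadata; the helper name installed_versions is hypothetical, and the package names are the ones passed to pip above.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names exactly as passed to pip in this step.
REQUIRED = ["jupyterlab", "jupysql", "sqlalchemy-trino"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(REQUIRED).items():
        print(f"{name}: {ver if ver is not None else 'NOT INSTALLED'}")
```

Any package reported as NOT INSTALLED needs its pip install command run again in the same environment that Jupyter uses.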


Step 2: Start the notebook

Launch a notebook directly with:

    jupyter lab

Step 3: Load the SQL extension

From your notebook, load the SQL extension:

    %load_ext sql

Step 4: Connect to Streambased

Next, connect to your Streambased server:

    %sql trino://localhost:8080/kafka

Note: This assumes that your Streambased server is running locally. Adjust the host and port to match your deployment.
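
If your server is not on localhost, the connection string follows the same trino://host:port/catalog shape as the one above. The sketch below assembles it from its parts; the helper name trino_url and the example hostname are hypothetical, and the defaults simply reproduce the values used in this step.

```python
def trino_url(host: str = "localhost", port: int = 8080, catalog: str = "kafka") -> str:
    """Build a connection string of the form trino://host:port/catalog.

    Defaults mirror the connection string used in this step.
    """
    return f"trino://{host}:{port}/{catalog}"


if __name__ == "__main__":
    # Local server, as assumed in this guide.
    print(trino_url())
    # Hypothetical remote deployment; replace with your own host and port.
    print(trino_url("streambased.example.com", 8443))
```

The printed string can be pasted after %sql in place of the localhost connection string.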

Step 5: Run a query

Now we can run a query:

    %sql SELECT * FROM kafka.streambased.customers WHERE name = 'TOM SCOTT'

Step 6 (optional): Use pandas

To use Streambased with pandas, first make sure the pandas library is installed, as with the packages in Step 1:

    pip install pandas

Then convert your query result into a DataFrame and continue working in pandas:

    result = %sql SELECT * FROM kafka.streambased.customers WHERE name = 'TOM SCOTT'
    df = result.DataFrame()
