
Connecting To Streambased From Jupyter

A step-by-step guide to integrating Jupyter with Streambased, unlocking powerful capabilities for interactive data exploration and analysis on streaming data.

June 10, 2024


Prerequisites

You must have a running Streambased server before following this guide.

For details on how to run Streambased, see the documentation at https://streambased-io.github.io/streambased/index.html or run one of the demos at https://github.com/streambased-io/streambased-demos

Step 1: Install dependencies

This guide requires the following Python packages:

pip install jupyterlab
pip install jupysql
pip install sqlalchemy-trino
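
After installing, it can help to confirm that the packages are visible to the Python environment Jupyter will use. The sketch below is one way to check, using only the standard library. The helper name `installed_versions` is ours for illustration and is not part of Streambased or any of these packages.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Distribution names exactly as passed to pip in this step.
required = ["jupyterlab", "jupysql", "sqlalchemy-trino"]

for name, version in installed_versions(required).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as NOT INSTALLED was most likely installed into a different environment from the one running this check.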


Step 2: Start the notebook

Launch JupyterLab from a terminal, then open a new notebook:

jupyter lab

Step 3: Load the SQL extension

From your notebook, load the SQL extension:

%load_ext sql

Step 4: Connect to Streambased

Next, connect to your Streambased server:

%sql trino://localhost:8080/kafka

Note: This assumes that your Streambased server is running locally. Adjust the host and port to match your deployment.
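
If you connect to more than one deployment, a small helper can assemble the connection string from its parts. This is a minimal sketch: the function name `trino_url`, the optional `user` argument, and the example hostname are assumptions for illustration. Only the `trino://host:port/catalog` shape and the default values come from the command above.

```python
def trino_url(host="localhost", port=8080, catalog="kafka", user=None):
    """Build a connection string of the form trino://[user@]host:port/catalog."""
    # The user part is optional here; whether your server needs it is deployment-specific.
    auth = f"{user}@" if user else ""
    return f"trino://{auth}{host}:{port}/{catalog}"


# Defaults reproduce the connection string used in this step.
print(trino_url())  # trino://localhost:8080/kafka

# Hypothetical remote deployment.
print(trino_url(host="streambased.example.internal", port=8443, user="analyst"))
```

The printed string can then be used in place of the one shown after `%sql` above.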


Step 5: Run a query

Now you can run a query:

%sql SELECT * FROM kafka.streambased.customers WHERE name = 'TOM SCOTT'
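
The query above hard-codes the customer name. If you want to filter by different names, one option is to build the query text in Python first. The sketch below is illustrative only: `customers_by_name_sql` is a hypothetical helper, and it assumes standard SQL quoting, where a single quote inside a string literal is escaped by doubling it.

```python
def customers_by_name_sql(name, table="kafka.streambased.customers"):
    """Return a SELECT that filters the customers table by exact name."""
    # Double any single quotes so names like O'BRIEN stay inside the literal.
    escaped = name.replace("'", "''")
    return f"SELECT * FROM {table} WHERE name = '{escaped}'"


query = customers_by_name_sql("TOM SCOTT")
print(query)
```

With the name `TOM SCOTT`, this produces the same query text as the one shown above. jupysql also documents its own variable-expansion syntax for queries; check the documentation for the version you have installed before relying on it.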


Step 6 (optional): Use pandas

To use Streambased with pandas, first make sure the pandas library is installed alongside the packages from Step 1:

pip install pandas

Then convert your query result into a DataFrame and work with it as usual:

result = %sql SELECT * FROM kafka.streambased.customers WHERE name = 'TOM SCOTT'
df = result.DataFrame()
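
Once the result is a DataFrame, ordinary pandas operations apply. The sketch below uses a small hand-made frame as a stand-in for `result.DataFrame()` so that it runs without a server. The rows are made up for illustration, `name` is the only column taken from the query above, and `rows_per_name` is a hypothetical helper.

```python
import pandas as pd

# Stand-in for `df = result.DataFrame()`; these rows are illustrative, not real data.
df = pd.DataFrame({"name": ["TOM SCOTT", "ANN LEE", "TOM SCOTT"]})


def rows_per_name(frame):
    """Count rows per customer name, most frequent first."""
    return frame.groupby("name").size().sort_values(ascending=False)


counts = rows_per_name(df)
print(counts)
```

In a real notebook, drop the stand-in frame and pass the DataFrame obtained from your query result instead.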



Copyright 2024 Streambased Platform Limited. Company Number 14709247.