Connecting To Streambased From DBT

Explore how to efficiently link DBT with Streambased, enhancing your data transformation workflows with real-time streaming capabilities for more dynamic and responsive data operations.

Integrate With Your Analytical Tools
June 10, 2024
Read time: 6 minutes

Pre-requisites

You must have a running Streambased server before following this guide. For details on how to run Streambased, see the documentation at https://streambased-io.github.io/streambased/index.html or run one of the demos at https://github.com/streambased-io/streambased-demos
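Before continuing, it can help to confirm that something is actually listening on the server's port. The sketch below is a plain TCP check in Python, not a Streambased API. The function name port_is_open is hypothetical, and localhost and 8080 are simply the values used in the connection profile in Step 3.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Calling port_is_open("localhost", 8080) returns True when a process accepts connections on that port. It does not prove that the process is Streambased, so treat it as a quick first check only.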

Step 1: Install dependencies

To work with DBT we must first install dbt-trino, the Trino adapter for DBT:

pip install dbt-trino
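To confirm the install worked, one option is to look up the installed package versions from Python. This is a minimal sketch using only the standard library. The helper name installed_version is hypothetical, and dbt-core is included on the assumption that pip pulled it in as a dependency of dbt-trino.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Report both the adapter and the DBT core package it depends on.
for name in ("dbt-core", "dbt-trino"):
    print(name, installed_version(name) or "not installed")
```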

Step 2: Create a project

Initialize a new DBT project:

dbt init

When prompted, give the project a meaningful name and select trino as the database.

Step 3: Configure the database connection

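In the newly created project directory, create a new file named “profiles.yml”. Replace <project name> with the profile name set in dbt_project.yml (by default, the project name you chose in Step 2). The file should contain the following: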

<project name>:
  target: dev
  outputs:
    dev:
      type: trino
      method: none
      user: e30=
      database: kafka        # the Trino catalog
      host: localhost
      port: 8080
      schema: streambased
      threads: 1

Note: The above assumes a local Streambased server. Modify the host and port to match your deployment.
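If you deploy to several environments, it may be convenient to generate profiles.yml rather than edit it by hand. The following is a sketch under stated assumptions, not part of DBT or Streambased: render_profile and write_profile are hypothetical helper names, and only the profile name, host and port are parameterised. Every other value is copied from the profile above.

```python
from pathlib import Path

# Template mirroring the profile above; only name, host and port vary.
PROFILE_TEMPLATE = """\
{profile_name}:
  target: dev
  outputs:
    dev:
      type: trino
      method: none
      user: e30=
      database: kafka
      host: {host}
      port: {port}
      schema: streambased
      threads: 1
"""


def render_profile(profile_name: str, host: str = "localhost", port: int = 8080) -> str:
    """Return the profiles.yml text for the given profile name and server."""
    return PROFILE_TEMPLATE.format(profile_name=profile_name, host=host, port=port)


def write_profile(project_dir: str, profile_name: str,
                  host: str = "localhost", port: int = 8080) -> Path:
    """Write profiles.yml into project_dir and return its path."""
    path = Path(project_dir) / "profiles.yml"
    path.write_text(render_profile(profile_name, host, port), encoding="utf-8")
    return path
```

For example, write_profile(".", "my_project", host="streambased.internal", port=9090) would write a profile pointing at a remote server. The project name and host name here are placeholders.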

Step 4: Create a model

In the “models” directory create a new file with the extension “.sql” (in this example src_customers.sql). This should contain the SQL you wish to run:

SELECT * FROM kafka.streambased.customers WHERE name = 'TOM SCOTT'
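To sanity-check the model's SQL outside DBT, the same statement can be sent through the trino Python client, which is typically installed alongside dbt-trino. This sketch assumes the server accepts the same unauthenticated connection settings as the profile in Step 3. The helper name fetch_rows is hypothetical, and the connection values are copied from that profile.

```python
# The SQL from the model file, unchanged.
MODEL_SQL = "SELECT * FROM kafka.streambased.customers WHERE name = 'TOM SCOTT'"


def fetch_rows(sql: str = MODEL_SQL, host: str = "localhost", port: int = 8080):
    """Run sql against the server and return all result rows."""
    # Imported lazily so this module still loads if the client is missing.
    from trino.dbapi import connect

    conn = connect(
        host=host,
        port=port,
        user="e30=",           # same user value as in profiles.yml
        catalog="kafka",       # "database" in the DBT profile
        schema="streambased",
    )
    try:
        cur = conn.cursor()
        cur.execute(sql)
        return cur.fetchall()
    finally:
        conn.close()
```

Calling fetch_rows() with a running server returns the matching customer rows as a list. If the call fails here, the problem is likely in the connection settings rather than in the DBT project.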

Step 5: Run the query

We can test the query directly with DBT. The show command compiles the model, runs it, and prints a preview of the results:

dbt show --select models/src_customers.sql
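If you want to trigger the same command from a script, for example in a scheduled job, one simple approach is to shell out to the dbt CLI. This is a minimal sketch. The helper names dbt_show_command and run_dbt_show are hypothetical, and it assumes dbt is on the PATH.

```python
import subprocess
from typing import List


def dbt_show_command(model_path: str) -> List[str]:
    """Build the argument list for the dbt show command used above."""
    return ["dbt", "show", "--select", model_path]


def run_dbt_show(model_path: str, project_dir: str = ".") -> int:
    """Run dbt show from project_dir and return the process exit code."""
    # dbt must run from the project directory to find dbt_project.yml
    # and the profiles.yml created in Step 3.
    completed = subprocess.run(dbt_show_command(model_path), cwd=project_dir)
    return completed.returncode
```

For example, run_dbt_show("models/src_customers.sql") returns 0 when the command succeeds and a non-zero code otherwise.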

Copyright 2024 Streambased Platform Limited. Company Number 14709247.