Streambased for Retail
Bring context to what your shoppers are doing right now
Modern retailers already use Kafka to capture real-time events such as clickstream activity, basket updates, payments and inventory changes, while historical customer behaviour and product demand trends are stored in Iceberg and other analytical systems. The problem is that these systems are architecturally separate, forcing teams to analyse live shopper activity without historical context, or historical behaviour without the latest signals.
Streambased removes that separation. It makes real-time events directly queryable alongside historical data so browsing behaviour, in-store transactions and demand signals can be analysed together in a single view – without copying data or operating ingestion pipelines.
Decisions that previously relied on partial signals can now be made using real-time and historical information together.

The retail challenge:
When shopper behaviour moves faster than your data
Retail generates high-volume, high-velocity streams of operational data continuously. Clickstream events, basket updates, payments, store transactions and inventory movements flow through Kafka in real time.
At the same time, years of customer purchase history, product demand trends, pricing performance and supply chain data live in Iceberg and downstream analytics platforms.
But these two worlds remain disconnected, linked only by slow, expensive ETL pipelines that create critical gaps between customer behaviour and business response.
This raises multiple challenges:
Fraud detection models built on historical purchase patterns struggle to recognise suspicious checkout activity without visibility of the shopper’s current session.
Demand spikes emerge during a promotion, but historical sales baselines cannot explain whether the surge reflects a real trend without access to live basket activity.
Personalisation engines trained on historical behaviour fail to adapt when a customer suddenly changes browsing patterns during a session.
Inventory planning models built on weeks of demand history cannot react to sudden store-level sales spikes without visibility of live transactions.
Dynamic pricing strategies derived from historical elasticity risk misfiring when they ignore the latest signals from shoppers in the moment.
Supply chain optimisation based on historical demand patterns loses meaning without understanding current store sell-through rates.
The problem isn’t that retailers lack data. It’s that the systems holding live shopper signals and historical context were never designed to work together in the moment decisions are made. Connected only through batch pipelines, the context needed to interpret real-time activity often arrives long after the opportunity to respond has passed.
Streambased removes the trade-off between speed and context by making real-time and historical data accessible together in a single, queryable view. Decisions across inventory, pricing and customer experience are made against complete and consistent data.
The Streambased solution:
Certainty, control, visibility
When operational decisions rely on partial data, even well-designed models can misfire. With Streambased, systems evaluating checkout transactions, demand spikes or service disruptions can analyse live signals alongside years of behavioural and operational history. Fraud detection becomes more reliable. Demand signals can be interpreted accurately. Customer behaviour shifts can be recognised immediately. Decisions that previously relied on incomplete information can now be made with full context.
What becomes possible:
Customer behaviour with full context: Analyse live shopper activity alongside years of purchase history rather than relying only on warehouse snapshots.
Demand signals with historical baselines: Detect emerging product demand by combining real-time basket activity with long-term sales patterns.
More reliable analytics models: Train and evaluate ML models using datasets that include both live operational signals and historical behaviour.
Faster insight cycles: Analysts can investigate emerging trends immediately instead of waiting for data pipelines to refresh the warehouse.
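As a rough illustration of reading a live demand signal against a historical baseline, the sketch below scores the current hour's unit sales against past hours for the same product. It is not Streambased functionality: the function name, the sample figures and the threshold of 3 are all assumptions chosen for the example, and a real model would account for seasonality and promotions.

```python
from statistics import mean, stdev


def demand_z_score(historical_hourly_units: list[int], live_hourly_units: int) -> float:
    """Return how many standard deviations the live figure sits from the historical mean."""
    baseline = mean(historical_hourly_units)
    spread = stdev(historical_hourly_units)  # sample standard deviation
    if spread == 0:
        # Flat history: any deviation is unbounded in z-score terms.
        if live_hourly_units == baseline:
            return 0.0
        return float("inf") if live_hourly_units > baseline else float("-inf")
    return (live_hourly_units - baseline) / spread


# Hypothetical figures: units sold in the same hour on previous days (historical),
# and units sold so far in the current hour (live).
history = [40, 44, 38, 42, 36]
z = demand_z_score(history, live_hourly_units=60)

# Assumed threshold: treat anything more than 3 standard deviations above baseline as a spike.
is_spike = z > 3.0
print(f"z-score: {z:.2f}, spike: {is_spike}")
```

Neither input is useful alone: the live count has no meaning without the baseline, and the baseline cannot flag anything without the live count.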
Retail environments change quickly: promotions trigger unexpected demand, inventory moves across stores, supply chains react to disruption.
By exposing live operational events and historical performance together, Streambased allows pricing engines, inventory systems and operational dashboards to react to changing conditions in real time.
Retailers gain the ability to adjust pricing strategies, rebalance inventory or intervene in customer journeys while the events are still unfolding.
What becomes possible:
Demand-aware pricing strategies: Pricing models can incorporate live purchase signals alongside historical demand elasticity.
Promotion optimisation: Marketing and merchandising teams can evaluate campaign performance using real-time shopper behaviour combined with historical outcomes.
Inventory response to live demand: Supply-chain systems can react to emerging sales patterns using both current transactions and historical demand forecasts.
Operational decision loops: Retail teams can adjust merchandising, promotions or fulfilment strategies as demand patterns evolve.
Real-time customer engagement: Trigger offers and journeys based on live shopper behaviour combined with historical purchase and loyalty data.
Retail operations span multiple systems: ecommerce platforms, point-of-sale systems, logistics networks and customer engagement tools.
Streambased provides a unified analytical view across these systems by allowing queries to span real-time events in Kafka and historical data in Iceberg.
Teams gain a complete timeline of customer behaviour, product demand and operational activity, from the most recent signal back through years of historical context.
What becomes possible:
Continuous retail analytics: Query the latest shopper signals together with years of historical data as a single analytical dataset.
Unified demand timelines: Understand how current demand compares with historical patterns across seasons, promotions and regions.
Omnichannel customer view: Analyse shopper journeys across web, mobile and in-store interactions alongside years of purchase and loyalty history.
Faster analytical exploration: Data teams can investigate trends and anomalies immediately instead of waiting for scheduled ETL refresh cycles.
Zero-copy architecture
for unified access to Kafka and Iceberg
Streambased turns Kafka from a write-only streaming backbone into a directly queryable analytical data source. By exposing Kafka topics as Iceberg-compatible tables and stitching them with existing Iceberg history, Streambased gives query engines a single logical view across real-time and historical data, without continuously copying data or running ingestion pipelines.
Streambased sits alongside your existing warehouse, complementing current ETL processes. The boundary between live shopper signals and historical retail context disappears at query time: a single SQL statement can analyse the latest browsing behaviour together with years of customer and product history, giving analysts, models and operational systems a unified view of retail activity across time.
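To make the idea of a single logical view concrete, here is a minimal conceptual sketch in plain Python. It does not show how Streambased is implemented: the `Event` type, the `unified_view` function and the `handover_ts` boundary are hypothetical names, and in-memory lists stand in for an Iceberg table and a Kafka topic. The point it illustrates is that historical and live records can be read as one timeline at query time, with a boundary that prevents a record present in both sources from being counted twice.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Event:
    ts: int           # event time, epoch seconds
    customer_id: str
    sku: str
    quantity: int


def unified_view(cold: Iterable[Event], hot: Iterable[Event], handover_ts: int) -> Iterator[Event]:
    """Yield one logical timeline: cold rows before handover_ts, hot rows from handover_ts onward."""
    for event in cold:
        if event.ts < handover_ts:
            yield event
    for event in hot:
        if event.ts >= handover_ts:
            yield event


def units_by_sku(events: Iterable[Event]) -> dict[str, int]:
    """Aggregate purchased quantity per SKU across whatever timeline is passed in."""
    totals: dict[str, int] = {}
    for event in events:
        totals[event.sku] = totals.get(event.sku, 0) + event.quantity
    return totals


# Hypothetical sample data. The event at ts=300 exists in both sources,
# as it would if it had been archived while still retained in the stream.
history = [Event(100, "c1", "sku-a", 2), Event(200, "c2", "sku-b", 1), Event(300, "c1", "sku-a", 1)]
live = [Event(300, "c1", "sku-a", 1), Event(400, "c3", "sku-a", 4), Event(500, "c2", "sku-b", 2)]

totals = units_by_sku(unified_view(history, live, handover_ts=300))
print(totals)  # {'sku-a': 7, 'sku-b': 3}
```

The aggregation function never needs to know which source a record came from, which is the property a unified view is meant to give query engines.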
What this architecture enables:
Instant data availability
Newly created events in Kafka – clickstream activity, basket updates, checkout transactions, inventory movements and fulfilment events – become instantly queryable in BI tools, demand forecasting models and customer analytics platforms.
Match storage costs to business value
Keep the most recent operational events in Kafka for real-time decisioning, while storing years of customer behaviour, product demand history and inventory performance in cost-efficient object storage. Optimise storage for both speed and long-term analytics.
Unified governance
Your existing Kafka security model extends naturally across the analytical layer. The same access controls protecting operational data govern analytical queries, ensuring consistent governance across real-time and historical retail workloads.
Standard tool compatibility
Works natively with your existing analytics stack – merchandising dashboards, customer analytics platforms and demand forecasting models – and integrates with tools such as Snowflake, Databricks, Spark and Trino.
Talk to us
about your data stack
We'd love to learn about your operation and show you how a unified, instantly queryable view of your hot and cold data can drive measurable outcomes.