Materialized Views (Derived Topics)
BoilStream supports creating materialized views (also known as derived topics) that provide streaming SQL transformations of your data in real-time. These views automatically process and transform incoming data as it streams through your base topics, creating new output streams based on your SQL transformations.
NOTE! Currently the SQL is run over batches of messages in Arrow Table format with zero-copy. There is no tumbling/sliding window coordination or Flink style sessions. Support for these will be added later on.
What are Materialized Views?
Materialized views in BoilStream are SQL-based transformations that:
- Process data in real-time as it arrives in your base topics
- Create derived output streams based on your SQL logic
- Automatically update as new data flows through the system
- Support complex SQL operations including filtering, aggregations
- NOTE: window functions coming based on cached on-disk DuckDB database data
Unlike traditional views that are computed on-demand, materialized views in BoilStream continuously process streaming data and maintain their results as separate topics.
Creating Materialized Views
You can create materialized views using standard SQL CREATE VIEW
syntax through any DuckDB connection to BoilStream:
-- Connect to BoilStream
ATTACH 'boilstream' (TYPE AIRPORT, location 'grpc+tls://localhost:50051/');
-- Create a base topic (table)
CREATE TABLE boilstream.s3.people (
name VARCHAR,
age INT,
tags VARCHAR[]
);
-- Create materialized views for different transformations
CREATE VIEW boilstream.s3.filtered_adults AS
SELECT * FROM boilstream.s3.people WHERE age > 50;
CREATE VIEW boilstream.s3.filtered_names_starting_with_b AS
SELECT * FROM boilstream.s3.people WHERE name LIKE 'b%';
CREATE VIEW boilstream.s3.filtered_names_starting_with_a AS
SELECT * FROM boilstream.s3.people WHERE name LIKE 'a%';
View Naming Convention
When you create a materialized view, BoilStream uses a special naming convention to indicate the relationship between base topics and their derived views:
- Base topic:
people
- Derived topics:
people→filtered_adults
,people→filtered_names_starting_with_b
, etc.
The →
arrow symbol clearly shows which views are derived from which base topics.
Viewing Your Topics and Views
Use SHOW ALL TABLES
to see both your base topics and materialized views:
SHOW ALL TABLES;
Example output:
┌────────────┬─────────┬────────────────────────┬───────────────────┬─────────────────────────────────┬───────────┐
│ database │ schema │ name │ column_names │ column_types │ temporary │
├────────────┼─────────┼────────────────────────┼───────────────────┼─────────────────────────────────┼───────────┤
│ boilstream │ s3 │ people │ [name, age, tags] │ [VARCHAR, INTEGER, 'VARCHAR[]'] │ false │
│ boilstream │ s3 │ people→filtered_adults │ [result] │ [VARCHAR] │ false │
│ boilstream │ s3 │ people→filtered_b │ [result] │ [VARCHAR] │ false │
└────────────┴─────────┴────────────────────────┴───────────────────┴─────────────────────────────────┴───────────┘
Dropping Materialized Views
You can remove materialized views using the DROP TABLE
command (note: use DROP TABLE
, not DROP VIEW
):
-- Drop a specific materialized view
DROP TABLE boilstream.s3."people→filtered_b";
-- Verify it's removed
SHOW ALL TABLES;
Important
When dropping materialized views, you must quote the table name if it contains special characters like the →
symbol.
Supported SQL Operations
Materialized views support a wide range of SQL operations including:
Filtering
CREATE VIEW boilstream.s3.active_users AS
SELECT * FROM boilstream.s3.users WHERE status = 'active';
Projections
CREATE VIEW boilstream.s3.user_summary AS
SELECT name, email, created_at FROM boilstream.s3.users;
Transformations
CREATE VIEW boilstream.s3.user_profiles AS
SELECT
name,
age,
CASE
WHEN age >= 65 THEN 'Senior'
WHEN age >= 18 THEN 'Adult'
ELSE 'Minor'
END as age_category
FROM boilstream.s3.users;
Aggregations
CREATE VIEW boilstream.s3.daily_user_counts AS
SELECT
DATE_TRUNC('day', created_at) as date,
COUNT(*) as user_count
FROM boilstream.s3.users
GROUP BY DATE_TRUNC('day', created_at);
Real-time Processing
Materialized views process data in real-time as it arrives:
- Data arrives in your base topic (
people
) - Views are automatically triggered to process the new data
- Transformed results are immediately available in the derived topics
- No polling or batch processing required - everything happens as data streams through
Use Cases
Materialized views are perfect for:
- Data Filtering: Create focused streams for specific user segments
- Data Enrichment: Add calculated fields or lookup information
- Format Transformation: Convert data formats or structures
- Real-time Analytics: Maintain running aggregations and metrics
- Multi-tenant Data: Separate data streams by customer or organization
- Event Processing: Transform raw events into business-meaningful streams
Performance Considerations
- Views process data as it streams, so complex transformations should be designed with throughput in mind
- Each view creates a separate output topic, consuming additional storage
- Views are processed independently, allowing for different processing speeds
- Consider the cardinality of your transformations when designing aggregation views
Next Steps
- Learn about Authentication to secure your BoilStream deployment
- Explore the API Reference for programmatic access
- Check out Getting Started for a complete setup walkthrough