Skip to content

Materialized Views (Derived Topics)

BoilStream supports creating materialized views (also known as derived topics) that provide streaming SQL transformations of your data in real-time. These views automatically process and transform incoming data as it streams through your base topics, creating new output streams based on your SQL transformations.

NOTE! Currently the SQL is run over batches of messages in Arrow Table format with zero-copy. There is no tumbling/sliding window coordination or Flink style sessions. Support for these will be added later on.

What are Materialized Views?

Materialized views in BoilStream are SQL-based transformations that:

  • Process data in real-time as it arrives in your base topics
  • Create derived output streams based on your SQL logic
  • Automatically update as new data flows through the system
  • Support complex SQL operations including filtering, aggregations
    • NOTE: window functions coming based on cached on-disk DuckDB database data

Unlike traditional views that are computed on-demand, materialized views in BoilStream continuously process streaming data and maintain their results as separate topics.

Creating Materialized Views

You can create materialized views using standard SQL CREATE VIEW syntax through any DuckDB connection to BoilStream:

sql
-- Connect to BoilStream
ATTACH 'boilstream' (TYPE AIRPORT, location 'grpc+tls://localhost:50051/');

-- Create a base topic (table)
CREATE TABLE boilstream.s3.people (
    name VARCHAR,
    age INT,
    tags VARCHAR[]
);

-- Create materialized views for different transformations
CREATE VIEW boilstream.s3.filtered_adults AS
SELECT * FROM boilstream.s3.people WHERE age > 50;

CREATE VIEW boilstream.s3.filtered_names_starting_with_b AS
SELECT * FROM boilstream.s3.people WHERE name LIKE 'b%';

CREATE VIEW boilstream.s3.filtered_names_starting_with_a AS
SELECT * FROM boilstream.s3.people WHERE name LIKE 'a%';

View Naming Convention

When you create a materialized view, BoilStream uses a special naming convention to indicate the relationship between base topics and their derived views:

  • Base topic: people
  • Derived topics: people→filtered_adults, people→filtered_names_starting_with_b, etc.

The arrow symbol clearly shows which views are derived from which base topics.

Viewing Your Topics and Views

Use SHOW ALL TABLES to see both your base topics and materialized views:

sql
SHOW ALL TABLES;

Example output:

┌────────────┬─────────┬────────────────────────┬───────────────────┬─────────────────────────────────┬───────────┐
│  database  │ schema  │          name          │   column_names    │          column_types           │ temporary │
├────────────┼─────────┼────────────────────────┼───────────────────┼─────────────────────────────────┼───────────┤
│ boilstream │ s3      │ people                 │ [name, age, tags] │ [VARCHAR, INTEGER, 'VARCHAR[]'] │ false     │
│ boilstream │ s3      │ people→filtered_adults │ [result]          │ [VARCHAR]                       │ false     │
│ boilstream │ s3      │ people→filtered_b      │ [result]          │ [VARCHAR]                       │ false     │
└────────────┴─────────┴────────────────────────┴───────────────────┴─────────────────────────────────┴───────────┘

Dropping Materialized Views

You can remove materialized views using the DROP TABLE command (note: use DROP TABLE, not DROP VIEW):

sql
-- Drop a specific materialized view
DROP TABLE boilstream.s3."people→filtered_b";

-- Verify it's removed
SHOW ALL TABLES;

Important

When dropping materialized views, you must quote the table name if it contains special characters like the symbol.

Supported SQL Operations

Materialized views support a wide range of SQL operations including:

Filtering

sql
CREATE VIEW boilstream.s3.active_users AS
SELECT * FROM boilstream.s3.users WHERE status = 'active';

Projections

sql
CREATE VIEW boilstream.s3.user_summary AS
SELECT name, email, created_at FROM boilstream.s3.users;

Transformations

sql
CREATE VIEW boilstream.s3.user_profiles AS
SELECT
    name,
    age,
    CASE
        WHEN age >= 65 THEN 'Senior'
        WHEN age >= 18 THEN 'Adult'
        ELSE 'Minor'
    END as age_category
FROM boilstream.s3.users;

Aggregations

sql
CREATE VIEW boilstream.s3.daily_user_counts AS
SELECT
    DATE_TRUNC('day', created_at) as date,
    COUNT(*) as user_count
FROM boilstream.s3.users
GROUP BY DATE_TRUNC('day', created_at);

Real-time Processing

Materialized views process data in real-time as it arrives:

  1. Data arrives in your base topic (people)
  2. Views are automatically triggered to process the new data
  3. Transformed results are immediately available in the derived topics
  4. No polling or batch processing required - everything happens as data streams through

Use Cases

Materialized views are perfect for:

  • Data Filtering: Create focused streams for specific user segments
  • Data Enrichment: Add calculated fields or lookup information
  • Format Transformation: Convert data formats or structures
  • Real-time Analytics: Maintain running aggregations and metrics
  • Multi-tenant Data: Separate data streams by customer or organization
  • Event Processing: Transform raw events into business-meaningful streams

Performance Considerations

  • Views process data as it streams, so complex transformations should be designed with throughput in mind
  • Each view creates a separate output topic, consuming additional storage
  • Views are processed independently, allowing for different processing speeds
  • Consider the cardinality of your transformations when designing aggregation views

Next Steps