
Quick Start

Get up and running with BoilStream in under 5 minutes.

Choose Your Version

  • Free Version: Basic streaming without authentication or TLS
  • Pro Version: Enterprise SSO, TLS encryption, and advanced features

Prerequisites

BoilStream

  • Cloud storage backend (AWS S3, Azure Blob Storage, Google Cloud Storage, MinIO, or local filesystem)
  • Linux or macOS
  • arm64 or x86_64

DuckDB Client

Pro Version Additional Requirements

  • TLS certificates (self-signed or CA-issued)
  • JWT identity provider (AWS Cognito, Azure AD, Google Cloud, Auth0, or Okta)

Installation

Download Binary

Download the latest pre-built binaries from the BoilStream GitHub Releases page.

Available for:

  • Linux (x86_64 and arm64)
  • macOS (Intel and Apple Silicon)
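
To pick the right binary in a script, the platform can be derived from `uname`. The sketch below is only an illustration: the `platform_tag` function and the tag format (for example `linux-x86_64`) are assumptions, so match them to the asset names actually listed on the releases page.

```bash
# Map `uname` output to a platform tag such as "linux-x86_64".
# The tag format is hypothetical; adjust it to the real release asset names.
platform_tag() {
    os=${1:-$(uname -s)}     # defaults to the current OS
    arch=${2:-$(uname -m)}   # defaults to the current CPU architecture
    case "$os" in
        Linux)  os=linux ;;
        Darwin) os=macos ;;
        *) echo "unsupported OS: $os" >&2; return 1 ;;
    esac
    case "$arch" in
        x86_64|amd64)  arch=x86_64 ;;
        arm64|aarch64) arch=arm64 ;;
        *) echo "unsupported architecture: $arch" >&2; return 1 ;;
    esac
    echo "$os-$arch"
}
```

Calling `platform_tag` with no arguments reports the tag for the current machine; passing an OS and architecture explicitly is useful when preparing downloads for another host.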

Getting Started

Zero-Configuration Startup

BoilStream needs no configuration to start. Run the binary and it generates a config.yaml file with sensible defaults:

bash
# First run - automatically generates config.yaml
./boilstream

On first run, BoilStream will:

  1. Generate a config.yaml file in the current directory
  2. Configure default settings for local development
  3. Start the server with these defaults

Customizing Configuration

After the initial run, you can edit the generated config.yaml file to customize your setup:

bash
# Edit the auto-generated configuration
vi config.yaml

# Restart BoilStream with your changes
./boilstream
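
Keeping a copy of the generated file before editing makes it easy to return to the defaults. This is a minimal sketch, not part of BoilStream; the `backup_config` helper and the backup naming scheme are assumptions made for illustration.

```bash
# Copy a config file to a timestamped backup next to it and print the backup path.
# Hypothetical helper; the ".bak" naming is an arbitrary choice.
backup_config() {
    cfg=${1:-config.yaml}
    [ -f "$cfg" ] || { echo "no such file: $cfg" >&2; return 1; }
    backup="$cfg.$(date +%Y%m%d-%H%M%S).bak"
    cp -p "$cfg" "$backup" && echo "$backup"
}
```

Run `backup_config` in the directory that holds config.yaml, or pass a path such as `backup_config production.yaml`.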

You can also specify a different configuration file:

bash
./boilstream --config production.yaml

Connect from DuckDB

Open DuckDB and connect to BoilStream:

sql
-- Install and load the airport extension (if not already done)
INSTALL airport FROM community;
LOAD airport;

-- Connect to BoilStream (no authentication)
ATTACH 'boilstream' (TYPE AIRPORT, location 'grpc://localhost:50051/');

-- List topics
SHOW ALL TABLES;

Pro Version Setup

Configure Authentication

For the Pro version, edit the generated config.yaml to enable authentication and TLS:

yaml
# config.yaml
auth:
  providers: ["cognito"]  # Enable AWS Cognito
  authorization_enabled: true
  admin_groups: ["admin"]
  read_only_groups: ["readonly"]
  
  cognito:
    user_pool_id: "eu-west-1_gti5vAfvC"
    region: "eu-west-1"
    audience: "7ml47mngu8lrcb198epbkkfi5s"

tls:
  disabled: false  # Enable TLS for production
  cert_path: "/path/to/cert.pem"
  key_path: "/path/to/key.pem"

Then start BoilStream:

bash
./boilstream
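
A wrong certificate path is an easy mistake to make, so a pre-flight check before starting the server can save a restart. The sketch below assumes the flat `cert_path` and `key_path` keys shown in the configuration above; `yaml_value` and `check_tls_config` are hypothetical helpers, and the parsing is deliberately naive (it ignores YAML nesting and takes the first matching key).

```bash
# Print the value of the first "key: value" line in a YAML file.
# Naive on purpose: no nesting, no multi-line values, no '#' inside quotes.
yaml_value() {
    sed -n "s/^[[:space:]]*$1:[[:space:]]*//p" "$2" | head -n 1 |
        sed -e 's/[[:space:]]*#.*$//' -e 's/^"//' -e 's/"$//'
}

# Fail if the certificate or key file named in the config is not readable.
check_tls_config() {
    for key in cert_path key_path; do
        path=$(yaml_value "$key" "$1")
        if [ ! -r "$path" ]; then
            echo "missing or unreadable $key: $path" >&2
            return 1
        fi
    done
}
```

A typical use is `check_tls_config config.yaml && ./boilstream`, which only starts the server when both files can be read.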

Connect from DuckDB (Pro Version)

The Pro version requires TLS and authentication:

sql
-- Install and load the airport extension
INSTALL airport FROM community;
LOAD airport;

-- Create authentication secret with your JWT token
CREATE SECRET boilstream_auth (
    type airport,
    auth_token 'eyJraWQiOiJ6YWZsU0RYQnorRWFyQUgyc1Nwa2pBZE5ja0JoZjVwQUtPTnNjZzlpSW04PSIsImFsZyI6IlJTMjU2In0.eyJzdWIiOiIyMmU1MTQ5NC05MGYxLTcwNGUtYTRkNS0yNTQwNWM5Njk3ODQiLCJjb2duaXRvOmdyb3VwcyI6WyJhZG1pbiJdLCJpc3MiOiJodHRwczpcL1wvY29nbml0by1pZHAuZXUtd2VzdC0xLmFtYXpvbmF3cy5jb21cL2V1LXdlc3QtMV9ndGk1dkFmdkMiLCJjbGllbnRfaWQiOiI3bWw0N21uZ3U4bHJjYjE5OGVwYmtrZmk1cyIsIm9yaWdpbl9qdGkiOiIzMGM2MWZiYy05ZWNlLTQ5YzEtYjdhYy0wOGM3ZTM2YmNiYjQiLCJldmVudF9pZCI6IjZlMDc5NmQ4LWU1ZDktNGNkNS04MDQ3LTYyMTdmNzczMWViZiIsInRva2VuX3VzZSI6ImFjY2VzcyIsInNjb3BlIjoiYXdzLmNvZ25pdG8uc2lnbmluLnVzZXIuYWRtaW4iLCJhdXRoX3RpbWUiOjE3NDk4MTcwMTIsImV4cCI6MTc0OTgyMDYxMiwiaWF0IjoxNzQ5ODE3MDEyLCJqdGkiOiJlMzMwYzQyZC0xZmQ3LTQ2ZWMtYTMyNy0yNGNmMWE3MThlZGQiLCJ1c2VybmFtZSI6InRlc3R1c2VyIn0.QUyp--JBCcmqk787oRoeJYP9b35kmInEdfqOpj_lCh7-oqr7lzMrt_xCxhicxGwElwkoUxEzvlRVHNegwwFIwJXepM8TuMNMbQV0NPZxUnM5r8pGeDWjgqHQKrJMnTPUXJZOoIUtJuQUDqlZHRoCzZNaPgj54qKSAQpHl8XXsghGPtzfxMpIvSfe19ojRunI77O0CYm_MD9snu3bU1FyoteRMkpDReL4ZC7b_mSPM6Bw3Pa0QdUnL1lyEIWUCjm2cS13ToMR3A86qo-lf8IazG5FqnYqvg2CzSJBe9fJEGRl7g2bDzsAqH67ImIS9of1vnYHDYWFAZhp7wPPUuk1fQ',
    scope 'grpc+tls://localhost:50051/'
);

-- Connect to BoilStream with TLS and authentication
ATTACH 'boilstream' (TYPE AIRPORT, location 'grpc+tls://localhost:50051/');

-- List topics (requires read permissions)
SHOW ALL TABLES;
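
If the connection is rejected, it can help to look at what the token actually claims, for example its `exp` timestamp or the groups your provider emits. The following sketch decodes the payload segment of a JWT locally; `jwt_payload` is a hypothetical helper, it does not verify the signature, and it assumes a `base64` command that accepts `-d` (older macOS releases use `-D` instead).

```bash
# Print the JSON payload (second segment) of a JWT.
# This only decodes; it does NOT verify the token's signature.
jwt_payload() {
    # Take the second dot-separated segment and convert base64url to base64.
    jwt_seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
    # Restore the '=' padding that JWTs strip.
    case $(( ${#jwt_seg} % 4 )) in
        2) jwt_seg="${jwt_seg}==" ;;
        3) jwt_seg="${jwt_seg}=" ;;
    esac
    printf '%s' "$jwt_seg" | base64 -d
}
```

Passing the same string used for `auth_token` to `jwt_payload` prints the claims as JSON, so an expired `exp` value or a missing group is visible before you create the secret.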

TLS Certificate Setup

The Pro version with TLS needs certificates. For development, you can generate a self-signed certificate:

bash
# Generate self-signed certificate (development only)
mkdir -p certs
openssl req -x509 -newkey rsa:4096 -keyout certs/server.key -out certs/server.crt -days 365 -nodes -subj "/CN=localhost"

# Set environment variable for DuckDB client
export GRPC_DEFAULT_SSL_ROOTS_FILE_PATH=certs/server.crt
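
To confirm that the certificate was written and is not about to expire, `openssl x509` can inspect it. This is a minimal sketch; the `cert_ok` function and its 30-day default are assumptions for illustration, not BoilStream requirements.

```bash
# Print the subject and expiry date of a certificate, and fail if it
# expires within the given number of days (default 30).
cert_ok() {
    cert_file=$1
    min_days=${2:-30}
    [ -r "$cert_file" ] || { echo "cannot read $cert_file" >&2; return 1; }
    openssl x509 -in "$cert_file" -noout -subject -enddate
    # -checkend exits non-zero if the certificate expires within N seconds.
    openssl x509 -in "$cert_file" -noout -checkend $(( min_days * 86400 )) >/dev/null
}
```

For the certificate generated above, `cert_ok certs/server.crt` should report `CN=localhost` and an end date about a year away. Some gRPC clients validate the Subject Alternative Name rather than the CN; if the client rejects the certificate, regenerating it with `-addext "subjectAltName=DNS:localhost"` (OpenSSL 1.1.1 or later) is worth trying.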

Getting JWT Tokens

To get JWT tokens for authentication, see the provider-specific guides in the BoilStream documentation.

PostgreSQL Access Setup

BoilStream provides a web-based authentication system for PostgreSQL connections, enabling BI tools and SQL clients to connect with temporary credentials.

Enable PostgreSQL Web Authentication

Edit your config.yaml:

yaml
# config.yaml
auth_server:
  enabled: true
  port: 443
  session_ttl_hours: 8
  users_db_path: "data/users.duckdb"
  encryption_key_path: "encryption.key"
  webauthn_rp_id: "localhost"
  webauthn_rp_origin: "https://localhost"

oauth_providers:
  github:
    client_id: "your-github-client-id"
    client_secret: "your-github-client-secret"
    redirect_uri: "https://localhost/auth/callback"
    allowed_orgs: ["your-org"]

pgwire:
  enabled: true
  port: 5432

First Startup

When you start BoilStream for the first time with auth_server.enabled: true, it prompts for an encryption key and a superadmin password:

bash
./boilstream
# Prompt: "Enter encryption key (press Enter to generate random): "
# Press Enter to auto-generate
# Key saved to encryption.key

# Prompt: "Set superadmin password (min 12 characters): "
# Enter password for superadmin account

Subsequent runs will load the encryption key automatically and start without prompts.
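
Because the key file is loaded automatically on later runs, it is worth checking that only your user can read it. Whether BoilStream sets restrictive permissions itself is not stated here, so treat the following as a precaution; `is_private` is a hypothetical helper.

```bash
# Succeed only if the file grants no permissions to group or others.
is_private() {
    # Characters 5-10 of the `ls -l` mode string are the group/other bits.
    perms=$(ls -ld "$1" 2>/dev/null | cut -c5-10)
    [ "$perms" = "------" ]
}
```

For example, `is_private encryption.key || chmod 600 encryption.key` tightens the permissions only when they are too open.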

Login via Web UI

  1. Navigate to the authentication portal:

    https://localhost/auth
  2. Choose login method:

    • Sign in with GitHub (if configured)
    • Sign in with Google (if configured)
    • Email + Password (sign up first)

    [Screenshot: login screen]

  3. After login, the dashboard displays PostgreSQL connection details:

    Host: localhost
    Port: 5432
    Database: boilstream
    Username: your.email@company.com
    Password: [Auto-generated session password]

    [Screenshot: dashboard with credentials]

Connect with PostgreSQL Clients

psql:

bash
psql -h localhost -p 5432 -U your.email@company.com -d boilstream
# Enter password from web dashboard

[Screenshot: credentials display]

DBeaver:

  1. New Connection → PostgreSQL
  2. Host: localhost, Port: 5432, Database: boilstream
  3. Username: your.email@company.com
  4. Password: From web dashboard

Power BI:

  1. Get Data → PostgreSQL database
  2. Server: localhost:5432, Database: boilstream
  3. Username and password from dashboard
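
Some clients take a single connection URI instead of separate fields. Because the username here is an email address, its `@` must be percent-encoded inside the URI. The sketch below builds such a URI; `uri_encode` and `pg_uri` are hypothetical helpers, and only the characters most likely to appear in a username are encoded.

```bash
# Percent-encode the characters that are special inside the user part of a URI.
# Covers only '%', '@', ':' and '/'; extend the list if your usernames need it.
uri_encode() {
    printf '%s' "$1" | sed -e 's/%/%25/g' -e 's/@/%40/g' -e 's/:/%3A/g' -e 's,/,%2F,g'
}

# Build a postgresql:// URI without a password, so the client prompts for it.
pg_uri() {
    printf 'postgresql://%s@%s:%s/%s\n' "$(uri_encode "$1")" "$2" "$3" "$4"
}
```

With the dashboard values above, `psql "$(pg_uri your.email@company.com localhost 5432 boilstream)"` connects and prompts for the session password.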

Superadmin Access

For direct user management, use the superadmin account:

bash
psql -h localhost -p 5432 -U boilstream -d boilstream
# Enter superadmin password (set during first run)

See PostgreSQL Web Authentication for admin operations and the complete authentication setup.

Connect via PostgreSQL Protocol

BoilStream includes a built-in PostgreSQL wire protocol server, so you can connect with any PostgreSQL-compatible client, including BI tools such as DBeaver and Tableau and command-line tools such as psql.

Quick PostgreSQL Connection

bash
# Connect with psql
psql -h localhost -p 5432 -U boilstream -d boilstream

# Connection string format
psql "postgresql://boilstream:boilstream@localhost:5432/boilstream"

DBeaver Connection

  1. Create a new PostgreSQL connection
  2. Host: localhost, Port: 5432
  3. Database: boilstream
  4. Username: boilstream, Password: boilstream

Query Your Streaming Data

sql
-- List all available topics
SELECT table_name FROM information_schema.tables
WHERE table_schema = 'public';

-- Query recent data
SELECT * FROM your_topic_name
ORDER BY event_time DESC
LIMIT 100;

-- Real-time aggregation
SELECT
    date_trunc('hour', event_time) as hour,
    count(*) as events,
    avg(value) as avg_value
FROM sensor_data
WHERE event_time >= now() - interval '24 hours'
GROUP BY hour
ORDER BY hour;

PostgreSQL Protocol Benefits

  • Universal Compatibility: Works with any PostgreSQL client
  • BI Tool Integration: Connect Tableau, Power BI, Grafana directly
  • Cursor Support: Efficient handling of large result sets
  • Prepared Statements: Full parameter binding support
  • Real-time Analytics: Query streaming data as it arrives
  • TLS Encryption: Available in Pro tier for secure connections

See the PostgreSQL Interface Guide for comprehensive setup instructions and advanced features.

Your First Stream

Create Topic and Stream Data

The SQL commands are identical for the Free and Pro versions; only the connection setup differs.

Free version:

sql
-- Already connected with: ATTACH 'boilstream' (TYPE AIRPORT, location 'grpc://localhost:50051/');

-- Create sample topic (if not already exists)
CREATE TABLE boilstream.s3.people (
    name VARCHAR,
    age INT,
    tags VARCHAR[]
);

-- Stream to BoilStream
INSERT INTO boilstream.s3.people
SELECT
    'boilstream_' || i::VARCHAR as name,
    (i % 100) + 1 as age,
    ['airport', 'ducklake'] as tags
FROM generate_series(1, 1000) as t(i);

Pro version:

sql
-- Already connected with: ATTACH 'boilstream' (TYPE AIRPORT, location 'grpc+tls://localhost:50051/');
-- Using authentication secret for secure access

-- Create sample topic (requires write permissions)
CREATE TABLE boilstream.s3.people (
    name VARCHAR,
    age INT,
    tags VARCHAR[]
);

-- Stream to BoilStream (requires write permissions)
INSERT INTO boilstream.s3.people
SELECT
    'secure_' || i::VARCHAR as name,
    (i % 100) + 1 as age,
    ['airport', 'enterprise'] as tags
FROM generate_series(1, 1000) as t(i);

Verify the Data

Check that your data landed in S3 (substitute the bucket and prefix configured for your storage backend):

bash
aws s3 ls s3://my-data-lake/events/

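Query the data from your data lake. The path and the event_type and value columns below are placeholders; substitute your own topic's location and columns: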

sql
-- Query directly from S3
SELECT
    event_type,
    count(*) as count,
    avg(value) as avg_value
FROM 's3://my-data-lake/events/*.parquet'
GROUP BY event_type;

What Just Happened?

Free version:

text
1. BoilStream received your data via FlightRPC (unencrypted)
2. Validated the schema and data (no authentication required)
3. Optimized the data into Parquet format
4. Uploaded directly to S3 with multipart uploads
5. Acknowledged completion back to DuckDB

Your data is now immediately available for analytics!

Pro version:

text
1. BoilStream received your data via FlightRPC (TLS encrypted)
2. Authenticated your JWT token and verified permissions
3. Validated the schema and data with authorized access
4. Optimized the data into Parquet format
5. Uploaded directly to S3 with multipart uploads
6. Acknowledged completion back to DuckDB

Your data is now securely stored and immediately available for analytics!

Next Steps

For Free Version Users

For Pro Version Users

Upgrade to Pro

Ready to secure your production deployment? Contact us for BoilStream Pro licensing and enterprise support.