# Quick Start
Get up and running with BoilStream in under 5 minutes.
## Prerequisites
- Linux or macOS (x64 or arm64)
- 8GB+ RAM recommended
- Cloud storage backend (AWS S3, Azure Blob, GCS, MinIO, or filesystem)
## Installation
Download the binaries:
```bash
# Pick the build for your platform: darwin-aarch64, darwin-x64, linux-aarch64, linux-x64
# (the URLs below use darwin-aarch64; substitute yours)
curl -L -o boilstream https://www.boilstream.com/binaries/darwin-aarch64/boilstream-0.8.0
curl -L -o boilstream-admin https://www.boilstream.com/binaries/darwin-aarch64/boilstream-admin-0.8.0
chmod +x boilstream boilstream-admin
```

## Zero-Configuration Startup
BoilStream generates a config.yaml on first run:
```bash
./boilstream
```

### Development Mode
The auto-generated config uses the NoOp storage backend (data is not persisted to S3) and paths under /tmp (data is lost on restart). For production, configure a real S3, Azure, or GCS backend in config.yaml.
On first run with the auth server enabled, you'll be prompted to:
- Set encryption key (press Enter to auto-generate)
- Set superadmin password (min 12 characters)
## Connect via PostgreSQL
Get your credentials from the Web Auth GUI at https://localhost/auth, then:
```bash
# Connect with psql (use credentials from Web Auth GUI)
psql -h localhost -p 5432 -U your.email@company.com -d your_ducklake
# Or use DBeaver, Tableau, Power BI, Grafana...
```

## Connect via DuckDB (Airport)
```sql
INSTALL airport FROM community;
LOAD airport;
-- Connect to BoilStream
ATTACH 'boilstream' (TYPE AIRPORT, location 'grpc://localhost:50051/');
-- With TLS
ATTACH 'boilstream' (TYPE AIRPORT, location 'grpc+tls://localhost:50051/');
```
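The same attach also works non-interactively from the DuckDB CLI; here is a quick connectivity check (a sketch, assuming the duckdb CLI is installed and the server runs locally without TLS):

```bash
# One-shot connectivity check from the DuckDB CLI (local, non-TLS server assumed)
duckdb -c "INSTALL airport FROM community; LOAD airport; \
  ATTACH 'boilstream' (TYPE AIRPORT, location 'grpc://localhost:50051/'); \
  SHOW DATABASES;"
```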
## Connect via boilstream Extension

For managed DuckLakes with credential vending:
```sql
INSTALL boilstream FROM community;
LOAD boilstream;
-- Login with email/password/MFA
PRAGMA boilstream_login('https://your-server.com/user@example.com', 'password', '123456');
-- Or use bootstrap token
PRAGMA boilstream_bootstrap_session('https://your-server.com/secrets:<token>');
-- List your DuckLakes
FROM boilstream_ducklakes();
```

## Create a Streaming DuckLake
Streaming DuckLakes use the `__stream` suffix:
```sql
-- Via boilstream extension
PRAGMA boilstream_create_ducklake('events__stream', 'Event streaming catalog');
-- Create a table (becomes an ingestion topic)
USE events__stream;
CREATE TABLE activity (
  user_id VARCHAR,
  event_type VARCHAR,
  timestamp TIMESTAMP,
  payload JSON
);
```

## Ingest Data
### Via DuckDB Airport
```sql
INSTALL airport FROM community;
LOAD airport;
ATTACH 'events__stream' (TYPE AIRPORT, location 'grpc://localhost:50051/');
INSERT INTO events__stream.main.activity
SELECT
  'user_' || i::VARCHAR AS user_id,
  CASE WHEN i % 3 = 0 THEN 'click' ELSE 'view' END AS event_type,
  NOW() AS timestamp,
  '{"page": "home"}' AS payload
FROM generate_series(1, 10000) AS t(i);
```

When the INSERT returns, the data is guaranteed to be on S3.
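As a quick sanity check, you can count the rows back over the PostgreSQL endpoint; a sketch, substituting your own credentials from the Web Auth GUI:

```bash
# Count the ingested rows via psql (replace user with your own credentials)
psql -h localhost -p 5432 -U your.email@company.com -d events__stream \
  -c "SELECT COUNT(*) FROM events__stream.main.activity;"
```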
### Via Kafka
```properties
bootstrap.servers=localhost:9092
schema.registry.url=http://localhost:8081
```
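Topic naming and payload serialization depend on your Kafka integration settings; as a rough sketch, assuming the activity table is exposed as a topic of the same name and accepts JSON records, you could produce with kcat:

```bash
# Hypothetical sketch: produce one JSON record with kcat.
# Topic name and payload format depend on your Kafka integration settings.
echo '{"user_id":"user_1","event_type":"click","timestamp":"2024-01-01T00:00:00Z","payload":"{\"page\":\"home\"}"}' \
  | kcat -b localhost:9092 -t activity -P
```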
### Via HTTP/2 Arrow

```http
POST https://localhost:8443/ingest/{token}
Content-Type: application/vnd.apache.arrow.stream
```
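For example, with curl (a sketch: <token> stands for your ingest token, and events.arrows is an Arrow IPC stream file you have produced separately; -k accepts a self-signed local certificate):

```bash
# Sketch: POST an Arrow IPC stream to the ingest endpoint.
# <token> is your ingest token; events.arrows is an Arrow IPC stream file.
curl -k --http2 -X POST "https://localhost:8443/ingest/<token>" \
  -H "Content-Type: application/vnd.apache.arrow.stream" \
  --data-binary @events.arrows
```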
## Query Your Data

### Real-time (Hot Tier)
Query via PostgreSQL; data becomes visible within about 1 second of ingestion:
```sql
-- psql or any PostgreSQL client
SELECT event_type, COUNT(*)
FROM events__stream.main.activity
WHERE timestamp > NOW() - INTERVAL '5 minutes'
GROUP BY event_type;
```

### Historical (Cold Tier)
Query S3 Parquet directly:
```sql
SELECT * FROM 's3://your-bucket/events/*.parquet' LIMIT 100;
```

## Web Auth GUI
Access the dashboard at https://localhost/:
- Users: Get PostgreSQL credentials, manage MFA, view sessions
- Superadmin (/admin): Manage users, configure SSO, view audit logs
## Enable TLS
```yaml
# config.yaml
tls:
  disabled: false
  cert_path: "/path/to/cert.pem"
  key_path: "/path/to/key.pem"

pgwire:
  tls:
    enabled: true
    cert_path: "/path/to/cert.pem"
    key_path: "/path/to/key.pem"
```
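For local testing, a self-signed pair can be generated with openssl (not suitable for production):

```bash
# Generate a self-signed certificate and key for local testing only
openssl req -x509 -newkey rsa:4096 -nodes -days 365 \
  -subj "/CN=localhost" -keyout key.pem -out cert.pem
```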
## Docker

```bash
docker run -v ./config.yaml:/app/config.yaml \
  -p 443:443 -p 5432:5432 -p 50051:50051 -p 50250:50250 \
  -e SERVER_IP_ADDRESS=1.2.3.4 \
  boilinginsights/boilstream:aarch64-linux-0.8.0
```

## Next Steps
- DuckLake Integration - Hot/cold tier architecture
- Streaming DuckLakes - Materialized views
- Authentication - SSO and MFA setup
- boilstream-admin CLI - Cluster management
- Multi-Tenancy - Tenant isolation