Ingestion Interfaces
BoilStream provides multiple high-performance interfaces for data ingestion and querying, each optimized for specific use cases and client types.
Interface Overview
BoilStream supports five different interfaces for maximum flexibility:
Data Ingestion Interfaces
FlightRPC API
High-throughput data streaming from analytical databases
- Protocol: Apache Arrow Flight (gRPC/HTTP/2)
- Primary clients: DuckDB, PyArrow, custom Flight clients
- Performance: 2.5+ GB/s throughput tested
- Use case: Bulk data ingestion from data processing systems
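As an illustration of the Flight path from Python, a do_put upload with PyArrow looks roughly like the sketch below. This is a minimal sketch under assumptions: the path-style descriptor and the topic name my_topic are placeholders rather than confirmed BoilStream conventions, and authentication is omitted; see the API Reference for the exact descriptor and token handling.

```python
# Minimal PyArrow Flight ingestion sketch (assumes a path descriptor named
# after the topic; confirm the real convention in the API Reference).
import pyarrow as pa
import pyarrow.flight as flight

client = flight.connect("grpc://localhost:50051")

table = pa.table({
    "timestamp": [1700000000000],
    "event": ["click"],
})

# Open a do_put stream for the target topic and upload the table.
descriptor = flight.FlightDescriptor.for_path("my_topic")
writer, _ = client.do_put(descriptor, table.schema)
writer.write_table(table)
writer.close()
```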
HTTP/2 Arrow API
Browser-optimized real-time data collection
- Protocol: HTTP/2 with TLS and Arrow IPC format
- Primary clients: Web browsers, JavaScript/TypeScript applications
- Performance: 40,000+ concurrent connections, 2+ GB/s throughput
- Use case: Real-time event collection from web applications
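The same endpoint also accepts Arrow IPC streams from non-browser HTTP/2 clients. Below is a hedged sketch using Python's httpx and pyarrow; the /ingest/YOUR_TOKEN path and Content-Type follow the Quick Start example later on this page, and verify=False is only for a local self-signed certificate.

```python
# Sketch: POST an Arrow IPC stream to the HTTP/2 ingestion endpoint.
# Requires `pip install httpx[http2] pyarrow`. The URL path and token are
# placeholders taken from the Quick Start example below.
import httpx
import pyarrow as pa

# Build a small Arrow table and serialize it in IPC stream format.
table = pa.table({"timestamp": [1700000000000], "event": ["click"]})
sink = pa.BufferOutputStream()
with pa.ipc.new_stream(sink, table.schema) as writer:
    writer.write_table(table)
payload = sink.getvalue().to_pybytes()

# verify=False is only for a local self-signed certificate.
with httpx.Client(http2=True, verify=False) as client:
    resp = client.post(
        "https://localhost:8443/ingest/YOUR_TOKEN",
        content=payload,
        headers={"Content-Type": "application/vnd.apache.arrow.stream"},
    )
    resp.raise_for_status()
```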
Kafka Protocol Interface
Drop-in replacement for Apache Kafka producers
- Protocol: Kafka wire protocol
- Primary clients: Existing Kafka producer applications
- Format support: Confluent Avro with Schema Registry
- Use case: Seamless migration from Kafka infrastructure
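Because the Kafka wire protocol is implemented server-side, an existing producer usually only needs its bootstrap.servers pointed at BoilStream. The sketch below uses confluent-kafka with Confluent Avro serialization; the Schema Registry URL, schema, and topic name are assumptions for illustration.

```python
# Existing Kafka producers keep their code; only bootstrap.servers changes.
# The registry URL, schema, and topic below are illustrative assumptions.
from confluent_kafka import SerializingProducer
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroSerializer

schema_str = """
{"type": "record", "name": "Event",
 "fields": [{"name": "event", "type": "string"},
            {"name": "ts", "type": "long"}]}
"""
registry = SchemaRegistryClient({"url": "http://localhost:8081"})

producer = SerializingProducer({
    "bootstrap.servers": "localhost:9092",  # BoilStream's Kafka listener
    "value.serializer": AvroSerializer(registry, schema_str),
})
producer.produce(topic="my_topic", value={"event": "click", "ts": 1700000000000})
producer.flush()
```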
Data Query Interfaces
PostgreSQL Interface
Full PostgreSQL protocol for BI tools and SQL clients
- Protocol: PostgreSQL wire protocol (COPY is not supported)
- Primary clients: Power BI, Tableau, DBeaver, psql
- Features: Complete catalog support, prepared statements, web-based authentication
- Use case: Enterprise BI tool integration, SQL-based management
- Authentication: Web UI OAuth login or superadmin access
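For programmatic access alongside BI tools, any standard PostgreSQL driver works. A minimal sketch with psycopg2 is shown below, assuming the temporary credentials issued by the web UI login and the my_topic table from the Quick Start examples.

```python
# Minimal psycopg2 sketch against BoilStream's PostgreSQL interface.
# Credentials are the temporary ones issued via the web UI (assumption).
import psycopg2

conn = psycopg2.connect(
    host="localhost",
    port=5432,
    dbname="boilstream",
    user="your.email@company.com",
    password="<temporary password from the web UI>",
)
with conn.cursor() as cur:
    cur.execute(
        "SELECT count(*) FROM my_topic "
        "WHERE timestamp > NOW() - INTERVAL '1 hour'"
    )
    print(cur.fetchone())
conn.close()
```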
FlightSQL API
SQL queries over Arrow Flight protocol
- Protocol: Arrow FlightSQL
- Primary clients: FlightSQL-compatible BI tools, analytical applications
- Features: Metadata discovery, prepared statements
- Use case: High-performance analytical queries
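From Python, the ADBC FlightSQL driver provides a DB-API interface with Arrow-native result fetching. The sketch below assumes the FlightSQL endpoint is reachable on the base Flight port; check config.yaml and the API Reference for the actual address, and note that authentication is omitted.

```python
# FlightSQL query sketch using the ADBC driver
# (pip install adbc-driver-flightsql). The endpoint port is an assumption;
# a JWT bearer token may also be required (omitted here).
import adbc_driver_flightsql.dbapi as flightsql

conn = flightsql.connect("grpc://localhost:50051")
cur = conn.cursor()
cur.execute("SELECT count(*) FROM my_topic")
print(cur.fetch_arrow_table())  # results come back as an Arrow table
cur.close()
conn.close()
```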
Choosing the Right Interface
For Data Ingestion
Scenario | Recommended Interface | Why |
---|---|---|
Streaming from DuckDB | FlightRPC | Native DuckDB Airport extension support |
Web application events | HTTP/2 Arrow | Browser-optimized with the Flechette library |
Migrating from Kafka | Kafka Protocol | No code changes needed |
Custom applications | FlightRPC or HTTP/2 | Depends on client environment |
For Data Querying
Scenario | Recommended Interface | Why |
---|---|---|
Power BI / Tableau | PostgreSQL | Full catalog and type system support |
SQL management (DDL) | PostgreSQL | DBeaver recommended for topic management |
Analytical queries | FlightSQL | Optimized for Arrow data transfer |
Interactive exploration | PostgreSQL | Familiar SQL client tools |
Performance Comparison
Interface | Concurrent Connections | Throughput | Latency |
---|---|---|---|
FlightRPC | 10,000+ | 2.5+ GB/s | Sub-second |
HTTP/2 Arrow | 40,000+ | 2+ GB/s | Low |
Kafka Protocol | 1,000+ | High | Low |
PostgreSQL | 100+ | Query-dependent | Interactive |
FlightSQL | 1,000+ | High | Low |
Security Features
All interfaces support enterprise security:
- TLS Encryption: Available for all interfaces (Pro version)
- Authentication: JWT tokens, username/password
- Authorization: Role-based access control
- Token Types:
  - JWT Bearer tokens (FlightRPC, FlightSQL), shown in the sketch below
  - BLAKE3 HMAC tokens (HTTP/2 Arrow)
  - Username/password (PostgreSQL, Kafka)
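For the Flight-based interfaces, the JWT is typically attached as a gRPC authorization header on each call. The sketch below shows the conventional pattern with PyArrow; confirm the exact header name and token source against the API Reference.

```python
# Hedged sketch: attach a JWT bearer token to PyArrow Flight calls via the
# conventional "authorization" header; verify the exact header BoilStream
# expects in the API Reference.
import pyarrow.flight as flight

client = flight.connect("grpc+tls://localhost:50051")
options = flight.FlightCallOptions(
    headers=[(b"authorization", b"Bearer YOUR_JWT_TOKEN")]
)

# Pass the options with every call, e.g. when listing available streams.
for info in client.list_flights(options=options):
    print(info.descriptor)
```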
Quick Start Examples
Ingest with DuckDB (FlightRPC)
```sql
ATTACH 'boilstream' (TYPE AIRPORT, location 'grpc://localhost:50051/');
INSERT INTO boilstream.main.my_topic SELECT * FROM local_table;
```
Query with PostgreSQL
```bash
# Connect with credentials from the web UI (https://localhost/auth)
psql -h localhost -p 5432 -U your.email@company.com -d boilstream
```

```sql
SELECT * FROM my_topic WHERE timestamp > NOW() - INTERVAL '1 hour';
```
Getting PostgreSQL Credentials
See PostgreSQL Web Authentication to set up OAuth login and obtain temporary PostgreSQL credentials.
For superadmin access: psql -h localhost -p 5432 -U boilstream
Stream from Browser (HTTP/2 Arrow)
```javascript
import { tableFromArrays, tableToIPC } from '@uwdata/flechette';

// Build an Arrow table from column arrays and encode it as an IPC stream.
const table = tableFromArrays({
  timestamp: [Date.now()],
  event: ['click']
});
const arrowBuffer = tableToIPC(table, { format: 'stream' });

fetch('https://localhost:8443/ingest/YOUR_TOKEN', {
  method: 'POST',
  headers: { 'Content-Type': 'application/vnd.apache.arrow.stream' },
  body: arrowBuffer
});
```
Configuration
All interfaces are configured through the config.yaml file:
```yaml
# FlightRPC
server:
  flight_base_port: 50051
  admin_flight_port: 50160
  consumer_flight_port: 50250

# HTTP/2 Arrow
http_ingestion:
  enabled: true
  port: 8443

# Kafka Protocol
kafka:
  enabled: true
  port: 9092

# PostgreSQL
pgwire:
  enabled: true
  port: 5432
```
Next Steps
- Detailed API Documentation: See the API Reference for complete interface specifications
- Configuration Guide: Learn about all configuration options in the Configuration Guide
- Quick Start: Get started quickly with the Getting Started Guide