
Ingestion Interfaces

BoilStream provides multiple high-performance interfaces for data ingestion and querying, each optimized for specific use cases and client types.

Interface Overview

BoilStream exposes five interfaces, three for data ingestion and two for querying:

Data Ingestion Interfaces

FlightRPC API

High-throughput data streaming from analytical databases

  • Protocol: Apache Arrow Flight (gRPC/HTTP/2)
  • Primary clients: DuckDB, PyArrow, custom Flight clients
  • Performance: 2.5+ GB/s throughput tested
  • Use case: Bulk data ingestion from data processing systems

HTTP/2 Arrow API

Browser-optimized real-time data collection

  • Protocol: HTTP/2 with TLS and Arrow IPC format
  • Primary clients: Web browsers, JavaScript/TypeScript applications
  • Performance: 40,000+ concurrent connections, 2+ GB/s throughput
  • Use case: Real-time event collection from web applications

Kafka Protocol Interface

Drop-in replacement for Apache Kafka producers

  • Protocol: Kafka wire protocol
  • Primary clients: Existing Kafka producer applications
  • Format support: Confluent Avro with Schema Registry
  • Use case: Seamless migration from Kafka infrastructure
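To make the "Confluent Avro" format concrete, the sketch below shows the standard Confluent wire framing that Schema Registry-aware serializers put around each message: one magic byte (0), a 4-byte big-endian schema ID, then the Avro-encoded payload. It assumes BoilStream expects this standard framing. The function names and the schema ID 42 are hypothetical, and real producers normally get this framing from their Kafka serializer rather than writing it by hand.

```javascript
// Confluent wire format: [magic byte 0x00][4-byte big-endian schema ID][Avro payload].
// Function names are illustrative, not part of any BoilStream or Kafka API.
function frameConfluentAvro(schemaId, avroPayload) {
  const framed = new Uint8Array(5 + avroPayload.length);
  const view = new DataView(framed.buffer);
  view.setUint8(0, 0);          // magic byte
  view.setUint32(1, schemaId);  // DataView writes big-endian by default
  framed.set(avroPayload, 5);   // Avro binary body follows the 5-byte header
  return framed;
}

function parseConfluentAvro(framed) {
  if (framed.length < 5 || framed[0] !== 0) {
    throw new Error('not a Confluent-framed Avro message');
  }
  const view = new DataView(framed.buffer, framed.byteOffset, framed.byteLength);
  return { schemaId: view.getUint32(1), payload: framed.subarray(5) };
}

// Hypothetical schema ID 42; the payload is the Avro encoding of the string "hi"
// under a bare string schema (zigzag length 0x04, then the UTF-8 bytes).
const example = frameConfluentAvro(42, Uint8Array.of(0x04, 0x68, 0x69));
```

Inspecting the first five bytes of a produced message this way is a quick check that an existing producer is emitting Schema Registry framing rather than raw Avro.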

Data Query Interfaces

PostgreSQL Interface

PostgreSQL wire-protocol access for BI tools and SQL clients

  • Protocol: PostgreSQL wire protocol (COPY is not supported)
  • Primary clients: Power BI, Tableau, DBeaver, psql
  • Features: Complete catalog support, prepared statements, web-based authentication
  • Use case: Enterprise BI tool integration, SQL-based management
  • Authentication: Web UI OAuth login or superadmin access

FlightSQL API

SQL queries over Arrow Flight protocol

  • Protocol: Arrow FlightSQL
  • Primary clients: FlightSQL-compatible BI tools, analytical applications
  • Features: Metadata discovery, prepared statements
  • Use case: High-performance analytical queries

Choosing the Right Interface

For Data Ingestion

| Scenario | Recommended Interface | Why |
|---|---|---|
| Streaming from DuckDB | FlightRPC | Native DuckDB Airport extension support |
| Web application events | HTTP/2 Arrow | Browser-optimized with Flechette library |
| Migrating from Kafka | Kafka Protocol | No code changes needed |
| Custom applications | FlightRPC or HTTP/2 | Depends on client environment |

For Data Querying

| Scenario | Recommended Interface | Why |
|---|---|---|
| Power BI / Tableau | PostgreSQL | Full catalog and type system support |
| SQL management (DDL) | PostgreSQL | DBeaver recommended for topic management |
| Analytical queries | FlightSQL | Optimized for Arrow data transfer |
| Interactive exploration | PostgreSQL | Familiar SQL client tools |

Performance Comparison

| Interface | Concurrent Connections | Throughput | Latency |
|---|---|---|---|
| FlightRPC | 10,000+ | 2.5+ GB/s | Sub-second |
| HTTP/2 Arrow | 40,000+ | 2+ GB/s | Low |
| Kafka Protocol | 1,000+ | High | Low |
| PostgreSQL | 100+ | Query-dependent | Interactive |
| FlightSQL | 1,000+ | High | Low |

Security Features

All interfaces support enterprise security:

  • TLS Encryption: Available for all interfaces (Pro version)
  • Authentication: JWT tokens, username/password
  • Authorization: Role-based access control
  • Token Types:
    • JWT Bearer tokens (FlightRPC, FlightSQL)
    • BLAKE3 HMAC tokens (HTTP/2 Arrow)
    • Username/password (PostgreSQL, Kafka)
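For the JWT Bearer tokens used by FlightRPC and FlightSQL, a client can inspect a token locally before connecting. The sketch below decodes the payload segment of a standard JWT and checks its `exp` claim. It assumes BoilStream-issued tokens carry `exp`, which the text above does not state, and the function names are hypothetical. It does not verify the signature; that remains the server's job.

```javascript
// A JWT is three base64url segments: header.payload.signature.
// This only reads the payload; it does NOT verify the signature.
function decodeJwtPayload(token) {
  const parts = token.split('.');
  if (parts.length !== 3) throw new Error('expected header.payload.signature');
  // Convert base64url to standard base64 and restore padding for atob.
  const base64 = parts[1].replace(/-/g, '+').replace(/_/g, '/');
  const padded = base64 + '='.repeat((4 - (base64.length % 4)) % 4);
  // atob yields a binary string; this is sufficient for ASCII-only claims.
  return JSON.parse(atob(padded));
}

// True when the token has a numeric exp claim (seconds since epoch) in the past.
function isExpired(token, nowSeconds = Math.floor(Date.now() / 1000)) {
  const { exp } = decodeJwtPayload(token);
  return typeof exp === 'number' && exp <= nowSeconds;
}
```

A client could call `isExpired(token)` before opening a Flight connection and fetch a fresh token first, instead of waiting for the server to reject the call.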

Quick Start Examples

Ingest with DuckDB (FlightRPC)

sql
ATTACH 'boilstream' (TYPE AIRPORT, location 'grpc://localhost:50051/');
INSERT INTO boilstream.main.my_topic SELECT * FROM local_table;

Query with PostgreSQL

bash
# Connect with credentials from the web UI (https://localhost/auth)
psql -h localhost -p 5432 -U your.email@company.com -d boilstream \
  -c "SELECT * FROM my_topic WHERE timestamp > NOW() - INTERVAL '1 hour';"

Getting PostgreSQL Credentials

See PostgreSQL Web Authentication to set up OAuth login and get temporary PostgreSQL credentials.

For superadmin access: psql -h localhost -p 5432 -U boilstream
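Many BI tools and drivers take a connection URI rather than separate flags. Because the username here is an email address, the `@` must be percent-encoded or the URI is parsed incorrectly. The helper below is a minimal sketch with a hypothetical function name; the default host, port, and database simply mirror the psql examples above.

```javascript
// Build a postgresql:// URI, percent-encoding the user, password, and database.
// Defaults mirror the psql examples above; the function name is illustrative.
function buildPostgresUri({ user, password, host = 'localhost', port = 5432, database = 'boilstream' }) {
  const auth = password === undefined
    ? encodeURIComponent(user)
    : `${encodeURIComponent(user)}:${encodeURIComponent(password)}`;
  return `postgresql://${auth}@${host}:${port}/${encodeURIComponent(database)}`;
}

// The '@' in the email becomes %40 so it is not mistaken for the host separator.
const uri = buildPostgresUri({ user: 'your.email@company.com' });
```

Pass the temporary password from the web UI as `password` when the client does not prompt for it separately.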

Stream from Browser (HTTP/2 Arrow)

javascript
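import { tableFromArrays, tableToIPC } from '@uwdata/flechette';

// Build a one-row Arrow table; tableToIPC expects a Table, not an array of objects.
const table = tableFromArrays({ timestamp: [Date.now()], event: ['click'] });
const arrowBuffer = tableToIPC(table, { format: 'stream' });

// Replace YOUR_TOKEN with your ingestion token.
const response = await fetch('https://localhost:8443/ingest/YOUR_TOKEN', {
  method: 'POST',
  headers: { 'Content-Type': 'application/vnd.apache.arrow.stream' },
  body: arrowBuffer
});
if (!response.ok) throw new Error(`Ingest failed: HTTP ${response.status}`);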

Configuration

All interfaces are configured through the config.yaml file:

yaml
# FlightRPC
server:
  flight_base_port: 50051
  admin_flight_port: 50160
  consumer_flight_port: 50250

# HTTP/2 Arrow
http_ingestion:
  enabled: true
  port: 8443

# Kafka Protocol
kafka:
  enabled: true
  port: 9092

# PostgreSQL
pgwire:
  enabled: true
  port: 5432
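Since every interface listens on its own port, a quick sanity check before starting the server is to confirm that no two settings claim the same port. The sketch below hand-copies the fragment above into a plain object rather than parsing config.yaml, and treats any key named `port` or ending in `_port` as a listener; both the helper names and that key convention are assumptions, not BoilStream behavior.

```javascript
// Mirrors the config.yaml fragment above as a plain object (hand-copied, not parsed).
const config = {
  server: { flight_base_port: 50051, admin_flight_port: 50160, consumer_flight_port: 50250 },
  http_ingestion: { enabled: true, port: 8443 },
  kafka: { enabled: true, port: 9092 },
  pgwire: { enabled: true, port: 5432 },
};

// Collect every configured port, skipping sections with enabled: false.
function listPorts(cfg) {
  const ports = [];
  for (const [section, settings] of Object.entries(cfg)) {
    if (settings.enabled === false) continue;
    for (const [key, value] of Object.entries(settings)) {
      if (key === 'port' || key.endsWith('_port')) {
        ports.push({ name: `${section}.${key}`, port: value });
      }
    }
  }
  return ports;
}

// Return [port, [setting names]] pairs for ports claimed by more than one setting.
function findPortConflicts(cfg) {
  const seen = new Map();
  for (const { name, port } of listPorts(cfg)) {
    seen.set(port, [...(seen.get(port) ?? []), name]);
  }
  return [...seen].filter(([, names]) => names.length > 1);
}
```

With the values shown above, `findPortConflicts(config)` returns an empty array; changing `kafka.port` to 5432 would report a clash with `pgwire.port`.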

Next Steps