Skip to content

FlightSQL API

BoilStream implements the FlightSQL protocol for SQL-based data access, enabling BI tools and analytical applications to query data using standard SQL over the Arrow Flight protocol.

Overview

FlightSQL provides:

  • Standard SQL interface over Arrow Flight protocol
  • BI tool compatibility for tools supporting FlightSQL
  • High-performance data transfer using Apache Arrow format
  • Metadata discovery for catalogs, schemas, and tables
  • Prepared statements with parameter binding

Connection Details

Protocol: grpc or grpc+tls
Port: 50250 (consumer flight port)
Authentication: JWT Bearer token (Pro version)

Supported Operations

Query Execution

Statement Query

Execute SQL queries and retrieve results:

sql
-- Simple query
SELECT * FROM topic_12345 WHERE timestamp > NOW() - INTERVAL 1 HOUR;

-- Aggregation
SELECT COUNT(*), AVG(value) FROM topic_events;

-- Joins across topics
SELECT t1.*, t2.status 
FROM topic_orders t1 
JOIN topic_users t2 ON t1.user_id = t2.id;

Prepared Statements

Prepare and execute parameterized queries:

  1. Create prepared statement with placeholders
  2. Bind parameters with actual values
  3. Execute the prepared query
  4. Close the prepared statement when done

Metadata Discovery

Get Catalogs

Retrieve available database catalogs:

  • Returns list of database names
  • Used by BI tools for schema discovery

Get DB Schemas

List schemas within catalogs:

  • Filters by catalog name (optional)
  • Returns schema names with associated catalog

Get Tables

Discover available tables:

  • Filter by catalog, schema, table name pattern
  • Optional table type filtering
  • Returns table metadata including:
    • Catalog name
    • Schema name
    • Table name
    • Table type (TABLE, TEMPORARY TABLE, VIEW)
    • Optional serialized Arrow schema (when include_schema=true)

Get Table Types

List supported table types:

  • TABLE
  • TEMPORARY TABLE
  • VIEW

SQL Info

Query server capabilities and configuration:

  • Supported SQL features
  • Data type mappings
  • Server version information

Authentication

Pro Version

JWT Bearer token authentication:

http
Authorization: Bearer <jwt_token>

Free Version

No authentication required for the free version.

Client Examples

Python with PyArrow

python
import pyarrow.flight as flight
import pyarrow.flight.sql as flightsql

# Connect to FlightSQL server
location = flight.Location.for_grpc_tcp("localhost", 50250)
client = flightsql.FlightSqlClient(location)

# Execute a query
query = "SELECT * FROM topic_events LIMIT 100"
flight_info = client.execute(query)

# Fetch results
reader = client.do_get(flight_info.endpoints[0].ticket)
table = reader.read_all()
print(table.to_pandas())

# Prepared statement example
statement = client.prepare("SELECT * FROM topic_events WHERE value > ?")
statement.set_parameters([100])
flight_info = statement.execute()
reader = client.do_get(flight_info.endpoints[0].ticket)
results = reader.read_all()

JDBC Driver

java
import java.sql.*;
import org.apache.arrow.flight.sql.FlightSqlDriver;

// Register FlightSQL JDBC driver
Class.forName("org.apache.arrow.flight.sql.FlightSqlDriver");

// Connect
String url = "jdbc:arrow-flight-sql://localhost:50250";
Properties props = new Properties();
props.setProperty("user", "username");
props.setProperty("password", "token");

Connection conn = DriverManager.getConnection(url, props);

// Execute query
Statement stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery("SELECT * FROM topic_events");

while (rs.next()) {
    System.out.println(rs.getString("event_type"));
}

Data Type Mapping

FlightSQL uses Arrow data types which map to SQL types:

SQL TypeArrow TypeNotes
VARCHARUtf8Variable-length strings
INTEGERInt3232-bit integers
BIGINTInt6464-bit integers
DOUBLEFloat64Double precision
TIMESTAMPTimestampWith timezone support
BOOLEANBoolTrue/false values
BINARYBinaryRaw bytes

Query Limitations

  • Read-only access: No INSERT/UPDATE/DELETE support via FlightSQL
  • DuckDB SQL dialect: Queries use DuckDB SQL syntax
  • Streaming results: Large result sets are streamed efficiently
  • Timeout handling: Long-running queries have configurable timeouts

Performance Considerations

Efficient Querying

  • Use filters to reduce data transfer
  • Leverage DuckDB's columnar processing
  • Results are streamed as Arrow RecordBatches
  • Zero-copy data transfer where possible

Connection Pooling

  • Reuse FlightSQL clients for multiple queries
  • Prepared statements cache query plans
  • Connection overhead is minimal with gRPC

Error Handling

Common error codes:

StatusDescription
INVALID_ARGUMENTMalformed SQL or invalid parameters
UNAUTHENTICATEDMissing or invalid authentication
PERMISSION_DENIEDInsufficient permissions for query
NOT_FOUNDTable or schema doesn't exist
INTERNALServer error during query execution

Best Practices

  1. Use prepared statements for repeated queries with different parameters
  2. Close resources properly (prepared statements, result readers)
  3. Handle schema evolution - table schemas may change over time
  4. Set appropriate timeouts for long-running analytical queries
  5. Use connection pooling in production applications

Compatibility

FlightSQL in BoilStream is compatible with:

  • Apache Arrow Flight SQL specification
  • ADBC (Arrow Database Connectivity) drivers
  • BI tools with FlightSQL connectors
  • Custom applications using Arrow Flight SQL libraries

The implementation focuses on read-only analytical queries, making it ideal for:

  • Business intelligence tools
  • Data exploration
  • Reporting applications
  • Real-time dashboards