Skip to content

AWS SSO (SAML) Authentication for PostgreSQL Interface

AWS Single Sign-On (SSO) provides enterprise SAML 2.0 authentication specifically for BoilStream's PostgreSQL wire protocol interface. This guide covers configuring SAML Service Provider (SP) endpoints to enable AWS SSO integration for PostgreSQL clients.

Overview

AWS SSO SAML integration provides PostgreSQL authentication with:

  • PostgreSQL Wire Protocol Authentication: SAML-based login for psql, DBeaver, and other PostgreSQL clients
  • Automatic Credential Generation: SAML authentication generates PostgreSQL-compatible SCRAM-SHA-256 credentials
  • Group-based Database Roles: Map AWS SSO groups to PostgreSQL database permissions
  • Session Management: Secure database session handling with configurable timeouts

Important: This SAML integration is specifically for PostgreSQL interface authentication. For other interfaces (Flight, HTTP), use JWT-based providers like Azure Entra ID, AWS Cognito, or Auth0.

Authentication Methods by Interface

BoilStream supports different authentication methods for different interfaces:

InterfacePortAuthentication MethodProvidersUse Case
PostgreSQL Wire5432SAML → SCRAM-SHA-256AWS SSO (this guide)psql, DBeaver, pgAdmin, JDBC/ODBC
Flight RPC50051JWT Bearer TokenAzure Entra ID, AWS Cognito, Auth0, OktaDuckDB Airport, Arrow Flight clients
HTTP API443/8443JWT Bearer TokenAzure Entra ID, AWS Cognito, Auth0, OktaREST API, Web applications
Kafka Protocol9092SASL/PLAIN or JWTVariousKafka producers/consumers

Why SAML for PostgreSQL?

PostgreSQL clients expect username/password authentication (SCRAM-SHA-256), not JWT tokens. SAML provides:

  1. Enterprise SSO: Users authenticate with AWS SSO
  2. Credential Generation: SAML assertions generate PostgreSQL-compatible credentials
  3. Compatibility: Works with all PostgreSQL clients without modifications
  4. Security: Temporary credentials with automatic expiration

Prerequisites

  • AWS Account with AWS SSO configured
  • Administrative access to AWS SSO
  • BoilStream instance with TLS enabled
  • SP certificates for SAML signing

Authentication Flow Architecture

PostgreSQL Client Authentication via SAML

Key Differences from JWT Authentication

AspectSAML (AWS SSO)JWT (Azure Entra ID, Cognito)
ProtocolPostgreSQL Wire ProtocolFlight RPC, HTTP API
Client TypePostgreSQL clients (psql, DBeaver)DuckDB, Arrow Flight clients
AuthenticationSCRAM-SHA-256 with generated credentialsBearer token in headers
Credential FlowBrowser-based SAML → Database credentialsDirect token validation
Use CaseInteractive database accessProgrammatic API access

BoilStream Configuration

1. Generate SP Certificates

First, generate certificates for your Service Provider:

bash
# Run the certificate generation script
./scripts/generate_saml_certificates.sh

# This creates in test-saml-certs/:
# - sp-cert.pem (SP certificate)
# - sp-key.pem (SP private key)
# - sp-cert.der (DER format)
# - sp-key.der (DER format)

2. Configure SAML in BoilStream

Update your configuration file (local-dev.yaml or default-config.yaml):

yaml
# Authentication configuration
auth:
  providers: ["saml"] # Add SAML to providers list
  authorization_enabled: true
  admin_groups: ["AWS-SSO-Admins"]
  read_only_groups: ["AWS-SSO-ReadOnly"]

# OAuth providers configuration
oauth_providers:
  # SAML providers (multi-provider array format)
  saml:
    - name: "aws-sso"
      enabled: true
      sp_entity_id: "https://your-app.example.com"
      sp_acs_url: "https://your-app.example.com/auth/saml/acs"
      sp_slo_url: "https://your-app.example.com/auth/saml/logout"

      idp_entity_id: "arn:aws:siam::123456789012:saml-provider/YourProvider"
      idp_sso_url: "https://portal.sso.us-east-1.amazonaws.com/saml/assertion/MDAwMjk2ODAtZDA2NS00NGRj"
      idp_certificate: |
        -----BEGIN CERTIFICATE-----
        MIIDpDCCAoygAwIBAgIJAKnL4UEDMN...
        -----END CERTIFICATE-----

      # SP Certificate and Key - can use file paths or inline content
      sp_certificate: "./test-saml-certs/sp-cert.pem"  # File path
      sp_private_key: "./test-saml-certs/sp-key.pem"   # File path

      # Or use inline content:
      # sp_certificate: |
      #   -----BEGIN CERTIFICATE-----
      #   Your SP certificate content here
      #   -----END CERTIFICATE-----
      # sp_private_key: |
      #   -----BEGIN PRIVATE KEY-----
      #   Your SP private key content here
      #   -----END PRIVATE KEY-----

      attribute_mappings:
        email: "https://aws.amazon.com/SAML/Attributes/RoleSessionName"
        username: "https://aws.amazon.com/SAML/Attributes/PrincipalTag:Email"
        groups: "https://aws.amazon.com/SAML/Attributes/Role"

3. Environment Variables

Alternatively, configure via environment variables:

bash
# Enable SAML authentication
export AUTH_PROVIDERS="saml"
export SAML_ENABLED="true"

# Service Provider configuration
export SAML_SP_ENTITY_ID="https://your-app.example.com"
export SAML_SP_ACS_URL="https://your-app.example.com/auth/saml/acs"
export SAML_SP_SLO_URL="https://your-app.example.com/auth/saml/logout"
export SAML_SP_METADATA_URL="https://your-app.example.com/auth/saml/metadata"

# Identity Provider configuration
export SAML_IDP_ENTITY_ID="arn:aws:siam::123456789012:saml-provider/YourProvider"
export SAML_IDP_SSO_URL="https://portal.sso.us-east-1.amazonaws.com/saml/assertion/..."
export SAML_IDP_CERTIFICATE="-----BEGIN CERTIFICATE-----..."

# Certificate paths
export SAML_SP_CERTIFICATE_PATH="./test-saml-certs/sp-cert.pem"
export SAML_SP_PRIVATE_KEY_PATH="./test-saml-certs/sp-key.pem"

# Authorization groups
export ADMIN_GROUPS="AWS-SSO-Admins"
export WRITE_GROUPS="AWS-SSO-DataProducers"
export READ_ONLY_GROUPS="AWS-SSO-DataAnalysts"

AWS SSO Configuration

1. Create SAML Application

  1. Navigate to AWS SSO Console
  2. Go to ApplicationsAdd a new application
  3. Select Add a custom SAML 2.0 application
  4. Configure application:
    • Display name: BoilStream
    • Description: BoilStream Data Ingestion Platform

2. Configure SAML Settings

In the Application metadata section:

xml
<!-- Upload or paste BoilStream SP metadata from -->
<!-- https://your-app.example.com/auth/saml/metadata -->

<EntityDescriptor entityID="https://your-app.example.com">
  <SPSSODescriptor protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol">
    <KeyDescriptor use="signing">
      <KeyInfo>
        <X509Data>
          <X509Certificate>Your SP Certificate</X509Certificate>
        </X509Data>
      </KeyInfo>
    </KeyDescriptor>
    <SingleLogoutService
      Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST"
      Location="https://your-app.example.com/auth/saml/logout"/>
    <AssertionConsumerService
      Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST"
      Location="https://your-app.example.com/auth/saml/acs"
      index="0"
      isDefault="true"/>
  </SPSSODescriptor>
</EntityDescriptor>

3. Configure Attribute Mappings

Map AWS SSO attributes to SAML assertions:

Application AttributeAWS SSO AttributeFormat
Subject$emailAddress
email$unspecified
username$unspecified
firstName$unspecified
lastName$unspecified
memberOf$unspecified

4. Download IdP Metadata

  1. In AWS SSO application settings, click View instructions
  2. Download the IdP metadata XML file
  3. Extract the IdP certificate and SSO URL from metadata
  4. Update BoilStream configuration with these values

5. Assign Users and Groups

  1. Go to Assigned users tab
  2. Click Assign users
  3. Select users or groups to grant access
  4. Configure permission sets as needed

SAML Endpoints

BoilStream provides the following SAML endpoints:

Service Provider Metadata

GET /auth/saml/metadata

Returns SP metadata XML for IdP configuration.

SSO Initiation

GET /auth/saml/login?RelayState=/dashboard

Initiates SAML authentication flow with optional RelayState.

Assertion Consumer Service (ACS)

POST /auth/saml/acs

Receives and processes SAML responses from IdP.

Single Logout

POST /auth/saml/logout

Handles SAML logout requests and responses.

Client Integration

PostgreSQL Client Connection Flow

  1. Initial Connection Attempt:
bash
# Attempt to connect to BoilStream PostgreSQL interface
psql -h localhost -p 5432 -U your_email@example.com -d boilstream

# Server responds with SAML login URL
# ERROR: Please authenticate at: https://your-app.example.com/auth/saml/login?session=abc123
  1. Browser-Based SAML Authentication:
javascript
// User opens the provided URL in browser
// After successful AWS SSO login, credentials are displayed:
// Username: saml_user_abc123
// Password: generated_password_xyz789
// Valid until: 2024-01-20 15:30:00 UTC
  1. Connect with Generated Credentials:
bash
# Use the SAML-generated credentials
psql -h localhost -p 5432 -U saml_user_abc123 -d boilstream
# Enter password: generated_password_xyz789

# Now connected to BoilStream via PostgreSQL protocol
boilstream=> SELECT * FROM s3.events LIMIT 10;

DBeaver Configuration

  1. Create new PostgreSQL connection
  2. Set connection parameters:
    • Host: localhost
    • Port: 5432
    • Database: boilstream
  3. On first connection, follow SAML login URL
  4. Enter generated credentials in DBeaver

Programmatic Access (Python)

python
import psycopg2
import webbrowser

def connect_with_saml():
    try:
        # Initial connection attempt
        conn = psycopg2.connect(
            host="localhost",
            port=5432,
            database="boilstream",
            user="your_email@example.com"
        )
    except psycopg2.OperationalError as e:
        # Extract SAML login URL from error message
        if "authenticate at:" in str(e):
            login_url = str(e).split("authenticate at: ")[1]
            print(f"Please login at: {login_url}")
            webbrowser.open(login_url)

            # Wait for user to complete SAML login
            username = input("Enter generated username: ")
            password = input("Enter generated password: ")

            # Connect with SAML-generated credentials
            conn = psycopg2.connect(
                host="localhost",
                port=5432,
                database="boilstream",
                user=username,
                password=password
            )
            return conn

Important: DuckDB Uses JWT Authentication

For DuckDB Airport extension connections, use JWT-based authentication (not SAML):

sql
-- DuckDB uses JWT tokens from Azure Entra ID, Cognito, etc.
CREATE SECRET boilstream_jwt (
    type airport,
    auth_token 'eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9...',  -- JWT token
    scope 'grpc+tls://localhost:50051/'
);

-- This is NOT using SAML authentication
ATTACH 'boilstream' (TYPE AIRPORT, location 'grpc+tls://localhost:50051/');

Authorization Mapping

PostgreSQL Database Roles

AWS SSO groups are mapped to PostgreSQL database roles in BoilStream:

yaml
# AWS SSO Group → PostgreSQL Role → Permissions
AWS-SSO-Admins → pg_admin → CREATE, DROP, ALTER, SELECT, INSERT, UPDATE, DELETE
AWS-SSO-DataEngineers → pg_write → SELECT, INSERT, UPDATE, DELETE
AWS-SSO-DataAnalysts → pg_read → SELECT only

Database User Creation

When a user authenticates via SAML:

  1. User Creation: A PostgreSQL user is created (e.g., saml_user_abc123)
  2. Role Assignment: User is granted roles based on AWS SSO group membership
  3. Schema Access: User gets access to relevant schemas (s3, kafka, http)
  4. Credential Storage: SCRAM-SHA-256 hash stored in OAuth store

Custom Attributes

Add custom attributes in AWS SSO for fine-grained control:

yaml
# Custom attribute mappings
department: ${user:department}
team: ${path:enterprise.division}
role: ${path:enterprise.employeeType}

Security Best Practices

✅ Required Security Measures

  1. Use TLS/HTTPS: Always use encrypted connections
  2. Certificate Validation: Verify IdP certificates
  3. Signature Verification: Validate all SAML assertions
  4. Replay Protection: Track assertion IDs to prevent replay attacks
  5. Session Security: Use secure session cookies with HttpOnly and Secure flags

⚠️ Security Warnings

  • Private Key Protection: Keep SP private keys secure and never commit to version control
  • Certificate Rotation: Regularly rotate SP certificates (recommended: annually)
  • Assertion Validation: Always validate assertion timestamps and conditions
  • Logout Implementation: Implement proper Single Logout (SLO) handling

Network Security

bash
# Verify AWS SSO endpoint accessibility
curl -v "https://portal.sso.us-east-1.amazonaws.com/saml/metadata/..."

# Test SP metadata endpoint
curl "https://your-app.example.com/auth/saml/metadata"

Troubleshooting

Common Issues

"Invalid SAML Response" errors:

bash
# Check SAML response structure
export RUST_LOG="boilstream::auth::saml=debug"

# Verify certificate matching
openssl x509 -in sp-cert.pem -noout -fingerprint

"Signature verification failed" errors:

bash
# Ensure xmlsec1 library is installed
brew install libxmlsec1  # macOS
apt-get install libxmlsec1  # Ubuntu/Debian

# Verify IdP certificate is correctly configured
openssl x509 -in idp-cert.pem -noout -text

"User not authorized" errors:

bash
# Check group mappings in logs
export RUST_LOG="boilstream::auth::saml=debug,boilstream::auth::manager=debug"

# Verify SAML attributes are being received
# Check assertion for group membership claims

Debug Logging

Enable detailed SAML logging:

bash
export RUST_LOG="boilstream::auth::saml=trace,samael=debug"

Testing SAML Flow

Test the SAML authentication flow:

bash
# 1. Get SP metadata
curl https://your-app.example.com/auth/saml/metadata > sp-metadata.xml

# 2. Initiate login (captures redirect URL)
curl -v https://your-app.example.com/auth/saml/login

# 3. Test with SAML response (requires valid response)
curl -X POST https://your-app.example.com/auth/saml/acs \
  -d "SAMLResponse=base64_encoded_response" \
  -d "RelayState=/dashboard"

Advanced Configuration

Multiple SAML Providers

Configure multiple SAML identity providers:

yaml
oauth_providers:
  saml:
    - name: "aws-sso-prod"
      enabled: true
      sp_entity_id: "https://prod.example.com"
      # ... production configuration

    - name: "aws-sso-dev"
      enabled: true
      sp_entity_id: "https://dev.example.com"
      # ... development configuration

Custom Session Management

Configure session timeouts and refresh:

yaml
saml:
  # Session configuration
  session_lifetime_seconds: 28800 # 8 hours
  session_idle_timeout_seconds: 1800 # 30 minutes
  allow_session_refresh: true

  # Token generation for database access
  generate_db_credentials: true
  db_credential_lifetime_seconds: 3600 # 1 hour

Attribute Transformation

Transform SAML attributes before mapping:

yaml
saml:
  attribute_transformations:
    # Extract email from Role ARN
    email:
      source: "https://aws.amazon.com/SAML/Attributes/Role"
      regex: "arn:aws:iam::\\d+:role/([^,]+)"
      template: "${1}@example.com"

    # Map AWS roles to groups
    groups:
      source: "https://aws.amazon.com/SAML/Attributes/Role"
      mappings:
        "*/AdminRole": "admins"
        "*/DeveloperRole": "developers"
        "*/AnalystRole": "analysts"

Migration from Other Auth Providers

From AWS Cognito

  1. Export user groups from Cognito
  2. Create corresponding groups in AWS SSO
  3. Update configuration to use SAML instead of JWT
  4. Migrate user sessions gradually

From Direct OAuth

  1. Map OAuth scopes to SAML attributes
  2. Update client applications to use SAML flow
  3. Provide transition period with both auth methods

Performance Considerations

  • Metadata Caching: SP metadata is cached for 24 hours
  • Assertion Validation: Signatures are validated using native xmlsec1
  • Session Storage: Sessions stored in memory with Redis backend option
  • Connection Pooling: Reuse HTTPS connections to IdP endpoints

Monitoring and Metrics

Monitor SAML authentication metrics:

bash
# Prometheus metrics endpoint
curl http://localhost:8081/metrics | grep saml

# Key metrics:
# - saml_login_attempts_total
# - saml_login_success_total
# - saml_login_failures_total
# - saml_assertion_validation_duration_seconds
# - saml_session_duration_seconds

Summary

SAML vs JWT Authentication in BoilStream

FeatureSAML (AWS SSO)JWT (Azure Entra ID, etc.)
Target InterfacePostgreSQL Wire Protocol (port 5432)Flight RPC (50051), HTTP API (443)
Client Examplespsql, DBeaver, pgAdmin, TablePlusDuckDB, Apache Arrow, REST clients
Auth FlowBrowser → SAML → CredentialsDirect token in Authorization header
Credential TypeUsername/Password (SCRAM-SHA-256)Bearer JWT token
Credential LifetimeConfigurable (default 1 hour)Token expiration (typically 1 hour)
StorageOAuth store with hashed passwordsToken validation only
Best ForInteractive database accessProgrammatic API access

When to Use Each

Use SAML (this guide) when:

  • Connecting with PostgreSQL clients (psql, DBeaver, etc.)
  • Users need interactive database access
  • You want AWS SSO integration for database users
  • PostgreSQL compatibility is required

Use JWT authentication when:

  • Using DuckDB Airport extension
  • Building applications with Flight RPC
  • Calling HTTP REST APIs
  • Using Azure Entra ID, AWS Cognito, or Auth0

Next Steps