Okta Authentication

BoilStream can authenticate clients with JWTs issued by Okta. This guide covers integration with both Okta's custom authorization servers and its org authorization server.

Overview

Okta integration provides:

  • Flexible Authorization Servers: Support for both custom and org authorization servers
  • Enterprise SSO: Integration with Active Directory, LDAP, and social providers
  • Advanced Group Management: Role-based access control with custom groups
  • API Access Management: OAuth 2.0 scopes for fine-grained permissions
  • High Availability: Okta-managed service with enterprise SLA

Prerequisites

  • Okta account (developer accounts available free)
  • Okta organization configured
  • API Access Management license (for custom authorization servers)
  • Users and groups configured in Okta

Okta Setup

1. Create Authorization Server

Create a custom authorization server for BoilStream:

bash
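# Create custom authorization server via the Okta Management API
curl -X POST https://company.okta.com/api/v1/authorizationServers \
  -H "Authorization: SSWS YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "BoilStream API",
    "description": "Authorization server for BoilStream data platform",
    "audiences": ["api://boilstream"]
  }'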
json
{
  "name": "BoilStream API",
  "description": "Authorization server for BoilStream data platform", 
  "audiences": ["api://boilstream"],
  "issuerMode": "CUSTOM_URL"
}

2. Configure API Scopes

Add scopes to your authorization server:

bash
# Add scopes to authorization server via the Okta Management API
AUTH_SERVER_ID="your-auth-server-id"
SCOPES_URL="https://company.okta.com/api/v1/authorizationServers/$AUTH_SERVER_ID/scopes"

curl -X POST "$SCOPES_URL" \
  -H "Authorization: SSWS YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "read:data",
    "description": "Read access to data streams"
  }'

curl -X POST "$SCOPES_URL" \
  -H "Authorization: SSWS YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "write:data",
    "description": "Write access to data streams"
  }'

curl -X POST "$SCOPES_URL" \
  -H "Authorization: SSWS YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "admin:system",
    "description": "Administrative access to system"
  }'
json
[
  {
    "name": "read:data",
    "description": "Read access to data streams",
    "consent": "IMPLICIT"
  },
  {
    "name": "write:data",
    "description": "Write access to data streams", 
    "consent": "IMPLICIT"
  },
  {
    "name": "admin:system",
    "description": "Administrative access to system",
    "consent": "IMPLICIT"
  }
]
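
The scopes defined above reach a service in the access token's scp claim. As a minimal sketch of how a scope check could look (the helper names `granted_scopes` and `require_scope` are hypothetical, and this is not BoilStream's own implementation):

```python
from typing import Any, Mapping


def granted_scopes(claims: Mapping[str, Any]) -> set:
    """Return the scopes carried by a decoded access-token payload.

    Okta custom authorization servers emit `scp` as a JSON array; a
    space-separated string is accepted here as a defensive fallback.
    """
    raw = claims.get("scp", [])
    if isinstance(raw, str):
        raw = raw.split()
    return set(raw)


def require_scope(claims: Mapping[str, Any], scope: str) -> None:
    """Raise PermissionError unless `scope` was granted (hypothetical helper)."""
    if scope not in granted_scopes(claims):
        raise PermissionError(f"missing required scope: {scope}")


claims = {"scp": ["read:data", "write:data"]}
require_scope(claims, "write:data")  # passes silently
```

The check assumes the payload has already been signature-verified; scopes read from an unverified token must not be trusted.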

3. Create Application

Create an application for client authentication:

bash
# Create OIDC application
okta apps create \
  --app-name "BoilStream Client" \
  --app-type "web" \
  --redirect-uris "https://boilstream.company.com/callback" \
  --grant-types "authorization_code,client_credentials"
json
{
  "name": "BoilStream Client",
  "label": "BoilStream Data Platform",
  "signOnMode": "OPENID_CONNECT",
  "credentials": {
    "oauthClient": {
      "token_endpoint_auth_method": "client_secret_basic"
    }
  },
  "settings": {
    "oauthClient": {
      "redirect_uris": [
        "https://boilstream.company.com/callback"
      ],
      "grant_types": [
        "authorization_code",
        "client_credentials"
      ],
      "response_types": [
        "code"
      ]
    }
  }
}

4. Configure Groups

Set up groups in Okta for authorization:

bash
# Create groups via Okta API
curl -X POST https://company.okta.com/api/v1/groups \
  -H "Authorization: SSWS YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "profile": {
      "name": "DataEngineers",
      "description": "Data engineering team with full access"
    }
  }'

curl -X POST https://company.okta.com/api/v1/groups \
  -H "Authorization: SSWS YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "profile": {
      "name": "DataAnalysts", 
      "description": "Data analysts with read-only access"
    }
  }'

5. Setup Claims (Optional)

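Configure custom claims to include groups in tokens. The first definition below (claimType RESOURCE) adds groups to access tokens; the second (claimType IDENTITY) adds them to ID tokens: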

json
{
  "name": "groups",
  "claimType": "RESOURCE",
  "valueType": "GROUPS",
  "value": ".*",
  "conditions": {
    "scopes": ["read:data", "write:data", "admin:system"]
  }
}
json
{
  "name": "groups", 
  "claimType": "IDENTITY",
  "valueType": "GROUPS",
  "value": ".*",
  "conditions": {
    "scopes": ["openid", "groups"]
  }
}

BoilStream Configuration

Environment Variables

Configure BoilStream to use Okta:

bash
# Required Okta settings
export AUTH_PROVIDERS="okta"
export OKTA_ORG_DOMAIN="company.okta.com"
export OKTA_AUDIENCE="api://boilstream"

# Optional: Authorization server configuration
export OKTA_AUTH_SERVER_ID="default"  # or custom server ID
export OKTA_CLIENT_ID="0oa123456789abcdef"  # For ID token validation
export OKTA_ALLOW_ORG_SERVER="false"  # Allow org server tokens

# Authorization rules (optional)
export ADMIN_GROUPS="DataEngineers,SystemAdmins"
export READ_ONLY_GROUPS="DataAnalysts,Viewers"
export WRITE_GROUPS="DataEngineers,ETLDevelopers"

Docker Compose Example

yaml
version: '3.8'
services:
  boilstream:
    image: boilstream/ingestion-agent
    environment:
      # Okta Configuration
      AUTH_PROVIDERS: "okta"
      OKTA_ORG_DOMAIN: "company.okta.com"
      OKTA_AUDIENCE: "api://boilstream"
      OKTA_AUTH_SERVER_ID: "default"
      OKTA_CLIENT_ID: "0oa123456789abcdef"
      OKTA_ALLOW_ORG_SERVER: "false"
      
      # Authorization
      ADMIN_GROUPS: "DataEngineers,SystemAdmins"
      READ_ONLY_GROUPS: "DataAnalysts"
      WRITE_GROUPS: "DataEngineers,ETLDevelopers"
      
      # Other configuration...
      S3_BUCKET: "my-data-bucket"
    ports:
      - "50051:50051"

Client Integration

JavaScript/Node.js

javascript
import { OktaAuth } from '@okta/okta-auth-js';

// Initialize Okta client
const oktaAuth = new OktaAuth({
  issuer: 'https://company.okta.com/oauth2/default',
  clientId: '0oa123456789abcdef',
  redirectUri: window.location.origin + '/callback',
  scopes: ['openid', 'read:data', 'write:data'],
  pkce: true
});

// Get access token
// Get access token from the token manager (a property of the client, not a method)
const accessToken = await oktaAuth.tokenManager.get('accessToken');

// Use with BoilStream
const headers = {
  'Authorization': `Bearer ${accessToken.accessToken}`
};

Python

python
import requests

# Okta configuration
OKTA_ORG_URL = 'https://company.okta.com'
CLIENT_ID = '0oa123456789abcdef'
CLIENT_SECRET = 'your-client-secret'
AUDIENCE = 'api://boilstream'

# Get access token using client credentials
token_url = f'{OKTA_ORG_URL}/oauth2/default/v1/token'
token_data = {
    'grant_type': 'client_credentials',
    'scope': 'read:data write:data',
    'client_id': CLIENT_ID,
    'client_secret': CLIENT_SECRET
}

response = requests.post(token_url, data=token_data, timeout=10)
response.raise_for_status()  # fail loudly on 4xx/5xx instead of a KeyError below
token = response.json()['access_token']

# Use with BoilStream
headers = {
    'Authorization': f'Bearer {token}'
}

curl Examples

bash
# Get access token
TOKEN=$(curl -s -X POST https://company.okta.com/oauth2/default/v1/token \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "grant_type=client_credentials" \
  -d "scope=read:data write:data" \
  -d "client_id=0oa123456789abcdef" \
  -d "client_secret=your-client-secret" | jq -r '.access_token')

# Use with BoilStream
curl -H "Authorization: Bearer $TOKEN" \
     https://localhost:50051/flight/health

DuckDB with Authentication

sql
-- Set up authentication headers
SET VARIABLE auth_header = 'Bearer YOUR_JWT_TOKEN_HERE';

-- Connect with authentication
ATTACH 'boilstream' (
    TYPE AIRPORT, 
    location 'grpc+tls://localhost:50051/',
    headers (Authorization = getvariable('auth_header'))
);

-- Use authenticated connection
CREATE TABLE boilstream.s3.user_events (
    timestamp TIMESTAMPTZ,
    user_id VARCHAR,
    event_type VARCHAR,
    properties JSON
);

Claims Mapping

Standard Claims

Okta tokens include standard JWT claims:

  • sub: User ID (e.g., 00u123456789abcdef)
  • iss: Issuer (e.g., https://company.okta.com/oauth2/default)
  • aud: Audience (API identifier or client ID)
  • exp: Expiration timestamp
  • iat: Issued at timestamp
  • scp: Scopes array (the claim name Okta uses for granted scopes)

Groups Claims

Configure groups claims for authorization:

json
{
  "sub": "00u123456789abcdef",
  "iss": "https://company.okta.com/oauth2/default",
  "aud": "api://boilstream",
  "scp": ["read:data", "write:data"],
  "groups": [
    "DataEngineers",
    "PlatformUsers"
  ]
}
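
To illustrate how a groups claim like the one above could drive authorization, here is a minimal sketch that maps group membership to an access level. The group sets copy the ADMIN_GROUPS, WRITE_GROUPS, and READ_ONLY_GROUPS examples from the configuration section; the function name `access_level` and the precedence admin > write > read are assumptions of this sketch, not documented BoilStream behaviour.

```python
from typing import Iterable

# Group lists mirror the ADMIN_GROUPS / WRITE_GROUPS / READ_ONLY_GROUPS
# example values from the configuration section.
ADMIN_GROUPS = {"DataEngineers", "SystemAdmins"}
WRITE_GROUPS = {"DataEngineers", "ETLDevelopers"}
READ_ONLY_GROUPS = {"DataAnalysts", "Viewers"}


def access_level(groups: Iterable[str]) -> str:
    """Map a token's `groups` claim to the highest matching access level.

    The precedence admin > write > read is an assumption of this sketch.
    """
    member_of = set(groups)
    if member_of & ADMIN_GROUPS:
        return "admin"
    if member_of & WRITE_GROUPS:
        return "write"
    if member_of & READ_ONLY_GROUPS:
        return "read"
    return "none"


print(access_level(["DataEngineers", "PlatformUsers"]))  # prints "admin"
```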

Authorization Server Types

Custom Authorization Server

  • Issuer: https://company.okta.com/oauth2/{authServerId}
  • JWKS: https://company.okta.com/oauth2/{authServerId}/v1/keys
  • Audience: Custom API identifier
  • Scopes: Custom defined scopes

Org Authorization Server

  • Issuer: https://company.okta.com
  • JWKS: https://company.okta.com/oauth2/v1/keys
  • Audience: Client ID (for ID tokens)
  • Scopes: Standard OIDC scopes
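
The URL patterns for the two server types can be derived from the org domain and an optional authorization server ID. A small sketch, where `OktaEndpoints` and `okta_endpoints` are hypothetical names introduced for illustration:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OktaEndpoints:
    issuer: str
    jwks_uri: str


def okta_endpoints(org_domain: str, auth_server_id: Optional[str] = None) -> OktaEndpoints:
    """Build issuer and JWKS URLs; auth_server_id=None selects the org server."""
    base = f"https://{org_domain}"
    if auth_server_id is None:
        return OktaEndpoints(issuer=base, jwks_uri=f"{base}/oauth2/v1/keys")
    issuer = f"{base}/oauth2/{auth_server_id}"
    return OktaEndpoints(issuer=issuer, jwks_uri=f"{issuer}/v1/keys")


custom = okta_endpoints("company.okta.com", "default")
org = okta_endpoints("company.okta.com")
print(custom.jwks_uri)  # https://company.okta.com/oauth2/default/v1/keys
print(org.jwks_uri)     # https://company.okta.com/oauth2/v1/keys
```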

Security Considerations

Token Validation

BoilStream validates Okta tokens by:

  1. Signature Verification: RSA signature validation using Okta JWKS
  2. Issuer Validation: Ensures token comes from configured Okta org
  3. Audience Validation: Verifies token intended for BoilStream API
  4. Expiration Check: Ensures token hasn't expired
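
Steps 2 to 4 are plain comparisons on the decoded payload and can be sketched with the standard library alone. The sketch below assumes step 1 (signature verification against the JWKS) has already been done by a JWT library; `TokenRejected`, `check_claims`, and the `leeway` parameter are hypothetical names, and the code illustrates the checks rather than BoilStream's actual implementation.

```python
import time
from typing import Any, Mapping, Optional


class TokenRejected(Exception):
    """Raised when a decoded payload fails a claim check (hypothetical type)."""


def check_claims(
    claims: Mapping[str, Any],
    expected_issuer: str,
    expected_audience: str,
    now: Optional[float] = None,
    leeway: int = 0,
) -> None:
    """Apply steps 2-4 to an already signature-verified payload."""
    if claims.get("iss") != expected_issuer:
        raise TokenRejected("Invalid issuer")

    # `aud` may be a single string or a list of strings (RFC 7519).
    aud = claims.get("aud")
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if expected_audience not in audiences:
        raise TokenRejected("Invalid audience")

    current = time.time() if now is None else now
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or current >= exp + leeway:
        raise TokenRejected("Token expired")


check_claims(
    {"iss": "https://company.okta.com/oauth2/default",
     "aud": "api://boilstream",
     "exp": time.time() + 300},
    expected_issuer="https://company.okta.com/oauth2/default",
    expected_audience="api://boilstream",
)  # passes silently
```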

Best Practices

  • Use HTTPS: Always use TLS for token transmission
  • Short-lived Tokens: Configure reasonable token expiration times
  • Principle of Least Privilege: Grant minimal required scopes and groups
  • Regular Rotation: Rotate client secrets regularly
  • Monitor Access: Use Okta system logs to monitor authentication events

JWKS Caching

BoilStream caches Okta JWKS keys for performance:

  • Cache Duration: 1 hour (3600 seconds)
  • Max Keys: 100 keys cached
  • Automatic Refresh: Keys refreshed on cache miss
  • Multi-Server Support: Supports both custom and org server keys
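
The caching behaviour described above can be pictured as a small time-bounded cache keyed by key ID. This is a sketch under stated assumptions: `JwksCache` is a hypothetical class, the `fetch_jwks` callable stands in for an HTTP GET of the JWKS endpoint, and the defaults simply reuse the 1 hour and 100 key figures from the list.

```python
import time
from typing import Any, Callable, Dict, Optional

Jwk = Dict[str, Any]


class JwksCache:
    """Cache signing keys by `kid`, refetching on expiry or unknown `kid`."""

    def __init__(
        self,
        fetch_jwks: Callable[[], Dict[str, Any]],
        ttl_seconds: float = 3600.0,
        max_keys: int = 100,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_jwks
        self._ttl = ttl_seconds
        self._max_keys = max_keys
        self._clock = clock
        self._keys: Dict[str, Jwk] = {}
        self._expires_at = 0.0

    def _refresh(self) -> None:
        keys = self._fetch().get("keys", [])[: self._max_keys]
        self._keys = {k["kid"]: k for k in keys if "kid" in k}
        self._expires_at = self._clock() + self._ttl

    def get(self, kid: str) -> Optional[Jwk]:
        expired = self._clock() >= self._expires_at
        if expired or kid not in self._keys:
            self._refresh()  # stale cache or cache miss: fetch once
        return self._keys.get(kid)
```

Refreshing on an unknown `kid` lets a newly rotated key be picked up without waiting for the TTL. A production cache would usually also rate-limit those refreshes so that tokens with made-up key IDs cannot trigger a fetch per request.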

Testing

Test JWT Generation

For testing, you can generate test tokens:

bash
# Get a test token with the client credentials flow
curl -X POST https://company.okta.com/oauth2/default/v1/token \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "grant_type=client_credentials" \
  -d "scope=read:data write:data" \
  -d "client_id=0oa123456789abcdef" \
  -d "client_secret=your-client-secret"

Integration Testing

bash
# Test authentication endpoint
curl -H "Authorization: Bearer YOUR_JWT_TOKEN" \
     https://localhost:50051/flight/health

# Test with invalid token (should return 401)
curl -H "Authorization: Bearer invalid_token" \
     https://localhost:50051/flight/health

Troubleshooting

Common Issues

  1. "Invalid audience"

    • Verify OKTA_AUDIENCE matches your API identifier
    • Check token aud claim contains correct audience
  2. "Invalid issuer"

    • Verify OKTA_ORG_DOMAIN matches your Okta domain
    • Check OKTA_AUTH_SERVER_ID is correct
    • Ensure OKTA_ALLOW_ORG_SERVER setting matches token type
  3. "Key not found"

    • Confirm the Okta JWKS endpoint is reachable from BoilStream
    • If Okta rotated its signing keys, wait for the JWKS cache to refresh
  4. "Groups not extracted"

    • Verify groups claim is configured in authorization server
    • Check that groups scope is included in token request

Debug Commands

bash
# Verify Okta configuration
echo "Org Domain: $OKTA_ORG_DOMAIN"
echo "Auth Server: $OKTA_AUTH_SERVER_ID" 
echo "Audience: $OKTA_AUDIENCE"
echo "Allow Org Server: $OKTA_ALLOW_ORG_SERVER"

# Test JWKS endpoints
curl https://$OKTA_ORG_DOMAIN/oauth2/$OKTA_AUTH_SERVER_ID/v1/keys
curl https://$OKTA_ORG_DOMAIN/oauth2/v1/keys

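# Decode JWT payload (base64url: translate the alphabet and restore padding first)
PAYLOAD=$(echo "YOUR_JWT_TOKEN" | cut -d'.' -f2 | tr '_-' '/+')
while [ $(( ${#PAYLOAD} % 4 )) -ne 0 ]; do PAYLOAD="$PAYLOAD="; done
echo "$PAYLOAD" | base64 -d | jq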
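
For inspecting tokens from a script rather than the shell, the same payload decoding can be sketched in Python. JWT segments are base64url-encoded with the padding stripped, so the padding has to be restored before decoding. `decode_jwt_payload` is a hypothetical helper for debugging only: it does not verify the signature.

```python
import base64
import json
from typing import Any, Dict


def decode_jwt_payload(token: str) -> Dict[str, Any]:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))
```

Comparing the decoded iss and aud values with OKTA_ORG_DOMAIN, OKTA_AUTH_SERVER_ID, and OKTA_AUDIENCE is usually the quickest way to explain an "Invalid issuer" or "Invalid audience" error.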

Log Analysis

Look for these log messages:

INFO  boilstream::auth::manager: Added Okta authentication provider
DEBUG boilstream::auth::okta: Fetching JWKS from Okta: https://company.okta.com/oauth2/default/v1/keys
DEBUG boilstream::auth::okta: Successfully validated Okta token for user: 00u123456789abcdef

Advanced Configuration

Multiple Authorization Servers

Accept tokens from a custom authorization server and, for backward compatibility, from the org authorization server:

bash
# Primary custom authorization server
export OKTA_AUTH_SERVER_ID="boilstream-api"

# Allow org server for backward compatibility
export OKTA_ALLOW_ORG_SERVER="true"
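
One way to picture the effect of these two settings is as the list of issuers a validator would accept. The sketch below is an illustration only: `accepted_issuers` is a hypothetical function, and the "default" fallback and the parsing of "true"/"false" are assumptions rather than documented BoilStream behaviour.

```python
from typing import List, Mapping


def accepted_issuers(env: Mapping[str, str]) -> List[str]:
    """List the issuers a validator would accept for the given settings.

    The "default" fallback and the "true"/"false" parsing are assumptions
    of this sketch.
    """
    base = f"https://{env['OKTA_ORG_DOMAIN']}"
    server_id = env.get("OKTA_AUTH_SERVER_ID", "default")
    issuers = [f"{base}/oauth2/{server_id}"]
    if env.get("OKTA_ALLOW_ORG_SERVER", "false").strip().lower() == "true":
        issuers.append(base)  # org authorization server issuer
    return issuers


print(accepted_issuers({
    "OKTA_ORG_DOMAIN": "company.okta.com",
    "OKTA_AUTH_SERVER_ID": "boilstream-api",
    "OKTA_ALLOW_ORG_SERVER": "true",
}))
# prints ['https://company.okta.com/oauth2/boilstream-api', 'https://company.okta.com']
```

In a real process the mapping would typically be `os.environ`.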

Custom Claims

Configure custom claims in authorization server:

json
{
  "name": "department",
  "claimType": "RESOURCE",
  "valueType": "EXPRESSION", 
  "value": "user.department",
  "conditions": {
    "scopes": ["read:data"]
  }
}

Group Filtering

Filter specific groups for BoilStream:

json
{
  "name": "groups",
  "claimType": "RESOURCE",
  "valueType": "GROUPS",
  "value": "DataEngineers|DataAnalysts|SystemAdmins",
  "conditions": {
    "scopes": ["read:data", "write:data"]
  }
}
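
The effect of a filter expression like the one above can be previewed locally before changing the claim. This sketch assumes the expression is matched against the whole group name; `filter_groups` is a hypothetical helper, and Okta's own matching rules should be confirmed with a token preview.

```python
import re
from typing import Iterable, List

GROUP_FILTER = re.compile(r"DataEngineers|DataAnalysts|SystemAdmins")


def filter_groups(groups: Iterable[str]) -> List[str]:
    """Keep only group names that match the filter in full."""
    return [g for g in groups if GROUP_FILTER.fullmatch(g)]


print(filter_groups(["DataEngineers", "PlatformUsers", "DataAnalystsEU"]))
# prints ['DataEngineers']
```

Filtering at the authorization server keeps tokens small and avoids exposing unrelated group names to BoilStream.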

Migration from Other Providers

From Auth0

Okta can complement or replace Auth0:

bash
# Multi-provider setup
export AUTH_PROVIDERS="auth0,okta"

# Okta-only setup
export AUTH_PROVIDERS="okta"
export OKTA_ORG_DOMAIN="company.okta.com"
export OKTA_AUDIENCE="api://boilstream"

From Azure AD

Migrating from Azure AD follows the same pattern: switch AUTH_PROVIDERS to okta and set the Okta variables:

bash
# Migrate from Azure AD
export AUTH_PROVIDERS="okta"  # Replace "azure-ad"
export OKTA_ORG_DOMAIN="company.okta.com"
export OKTA_AUDIENCE="api://boilstream"

Performance Optimization

Token Caching

Configure token caching for better performance:

javascript
// Client-side token caching
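// Client-side token caching: choose the token storage when constructing the client
const oktaAuth = new OktaAuth({
  issuer: 'https://company.okta.com/oauth2/default',
  clientId: '0oa123456789abcdef',
  tokenManager: { storage: 'sessionStorage' }
});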
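
Service clients that use the client credentials flow can apply the same idea by reusing a token until shortly before it expires. A minimal sketch: `CachedToken` is a hypothetical class, `fetch_token` stands in for the POST to the token endpoint shown in the Python example, and the 60 second refresh margin is an arbitrary choice.

```python
import time
from typing import Any, Callable, Dict, Optional


class CachedToken:
    """Reuse an access token until shortly before it expires."""

    def __init__(
        self,
        fetch_token: Callable[[], Dict[str, Any]],
        refresh_margin: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_token
        self._margin = refresh_margin
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        if self._token is None or self._clock() >= self._expires_at - self._margin:
            body = self._fetch()  # parsed JSON body of the token response
            self._token = str(body["access_token"])
            self._expires_at = self._clock() + float(body["expires_in"])
        return self._token
```

Each request then calls `get()` and builds its Authorization header from the result, so the token endpoint is only contacted when the cached token is missing or close to expiry.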

Connection Pooling

For high-throughput applications:

python
# Use connection pooling
import requests

session = requests.Session()
adapter = requests.adapters.HTTPAdapter(pool_connections=10, pool_maxsize=20)
session.mount('https://', adapter)

With these pieces in place, BoilStream accepts only tokens issued by your Okta org for the configured audience, and group membership determines each user's level of access.