Google Cloud Platform Authentication
Google Cloud Platform provides enterprise identity and access management for BoilStream through Google Cloud Identity, Workspace, and Identity and Access Management (IAM).
Overview
GCP integration provides:
- Google Workspace SSO: Integration with Google Workspace for enterprise users
- Cloud Identity: Centralized identity management for GCP resources
- Service Token Support: Both user identity tokens and STS (Security Token Service) tokens
- Custom Claims: Flexible group and role mapping through custom JWT claims
- Workspace Domain Validation: Restrict access to specific Google Workspace domains
Prerequisites
- Google Cloud Project with appropriate permissions
- Google Workspace domain (for enterprise users) or Cloud Identity setup
- OAuth 2.0 credentials configured in Google Cloud Console
- Users and groups configured in Google Workspace or Cloud Identity
Google Cloud Setup
1. Create OAuth 2.0 Credentials
Configure OAuth credentials in Google Cloud Console:
# Enable required APIs
gcloud services enable iamcredentials.googleapis.com
gcloud services enable cloudidentity.googleapis.com
# Create OAuth 2.0 client (requires manual setup in console)
echo "Visit Google Cloud Console to create OAuth 2.0 credentials:"
echo "https://console.cloud.google.com/apis/credentials"
{
"type": "web_application",
"name": "BoilStream Data Platform",
"authorized_redirect_uris": [
"https://boilstream.company.com/auth/callback"
],
"authorized_javascript_origins": [
"https://boilstream.company.com"
]
}
Manual Steps in Google Cloud Console:
- Navigate to APIs & Services → Credentials
- Click "Create Credentials" → "OAuth client ID"
- Select "Web application"
- Add authorized redirect URIs for your BoilStream deployment
- Note the Client ID for configuration
2. Configure Workspace Groups
Create groups in Google Workspace Admin Console:
# Groups to create in Google Workspace:
# - boilstream-admins@company.com
# - data-producers@company.com
# - data-analysts@company.com
Google Workspace Admin Console Steps:
- Go to admin.google.com
- Navigate to Directory → Groups
- Create groups for BoilStream authorization
- Add users to appropriate groups
3. Service Account (Optional)
For service-to-service authentication, create a service account:
# Create service account
gcloud iam service-accounts create boilstream-service \
--description="BoilStream service account" \
--display-name="BoilStream Service"
# Create and download key
gcloud iam service-accounts keys create boilstream-key.json \
--iam-account=boilstream-service@PROJECT_ID.iam.gserviceaccount.com
BoilStream Configuration
Environment Variables
Configure BoilStream for GCP authentication:
# Enable GCP authentication
export AUTH_PROVIDERS="gcp"
# GCP configuration (at least one required)
export GCP_CLIENT_ID="123456789-abc.apps.googleusercontent.com"
export GCP_PROJECT_ID="my-project-123"
# Optional: Workspace domain restriction
export GCP_REQUIRE_WORKSPACE_DOMAIN="company.com"
# Optional: Allow STS tokens (default: true)
export GCP_ALLOW_STS_TOKENS="true"
# Optional: Custom claim mapping
export GCP_GROUPS_CLAIM="groups"
export GCP_ROLES_CLAIM="roles"
# Authorization groups (use email addresses or custom claims)
export ADMIN_GROUPS="boilstream-admins@company.com"
export WRITE_GROUPS="data-producers@company.com,etl-services"
export READ_ONLY_GROUPS="data-analysts@company.com,business-users"
Minimal Configuration
BoilStream requires either GCP_CLIENT_ID or GCP_PROJECT_ID:
# Option 1: Google identity tokens only
export AUTH_PROVIDERS="gcp"
export GCP_CLIENT_ID="123456789-abc.apps.googleusercontent.com"
export GCP_REQUIRE_WORKSPACE_DOMAIN="company.com"
# Option 2: STS tokens only
export AUTH_PROVIDERS="gcp"
export GCP_PROJECT_ID="my-project-123"
export GCP_ALLOW_STS_TOKENS="true"
# Option 3: Both (recommended)
export AUTH_PROVIDERS="gcp"
export GCP_CLIENT_ID="123456789-abc.apps.googleusercontent.com"
export GCP_PROJECT_ID="my-project-123"
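The "at least one required" rule can be checked before starting the server. The following is a minimal sketch, not BoilStream's own logic: the function name `check_gcp_auth_config` is hypothetical, and it assumes that GCP_CLIENT_ID enables identity tokens while GCP_PROJECT_ID enables STS tokens unless GCP_ALLOW_STS_TOKENS is set to "false".

```python
import os


def check_gcp_auth_config(env=None):
    """Return the token types a configuration would enable.

    Raises ValueError when neither GCP_CLIENT_ID nor GCP_PROJECT_ID is set.
    The mapping of variables to token types is an assumption based on the
    three options listed above, not a documented BoilStream API.
    """
    env = os.environ if env is None else env
    client_id = env.get("GCP_CLIENT_ID", "").strip()
    project_id = env.get("GCP_PROJECT_ID", "").strip()
    if not client_id and not project_id:
        raise ValueError("Set GCP_CLIENT_ID, GCP_PROJECT_ID, or both")

    enabled = []
    if client_id:
        enabled.append("identity")
    # Assumption: STS tokens are allowed by default (documented default: true).
    allow_sts = env.get("GCP_ALLOW_STS_TOKENS", "true").strip().lower() != "false"
    if project_id and allow_sts:
        enabled.append("sts")
    return enabled
```

Calling `check_gcp_auth_config()` with no argument reads the current process environment, so it can run as a pre-flight check in a deployment script.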
Docker Compose Example
version: '3.8'
services:
  boilstream:
    image: boilstream:latest
    environment:
      # GCP Authentication
      AUTH_PROVIDERS: "gcp"
      GCP_CLIENT_ID: "123456789-abc.apps.googleusercontent.com"
      GCP_PROJECT_ID: "my-project-123"
      GCP_REQUIRE_WORKSPACE_DOMAIN: "company.com"
      GCP_ALLOW_STS_TOKENS: "true"
      # Custom claim mapping
      GCP_GROUPS_CLAIM: "groups"
      GCP_ROLES_CLAIM: "roles"
      # Authorization
      ADMIN_GROUPS: "boilstream-admins@company.com"
      WRITE_GROUPS: "data-producers@company.com"
      READ_ONLY_GROUPS: "data-analysts@company.com"
      # Other BoilStream config...
      S3_BUCKET: "my-data-lake"
      AWS_REGION: "us-east-1"
JWT Token Claims
BoilStream supports two types of GCP tokens:
Google Identity Tokens (accounts.google.com)
Standard Google identity tokens from OAuth flows:
{
"sub": "123456789012345678901",
"iss": "https://accounts.google.com",
"aud": "123456789-abc.apps.googleusercontent.com",
"exp": 1735689600,
"iat": 1735686000,
"email": "admin@company.com",
"email_verified": true,
"name": "John Admin",
"hd": "company.com",
"groups": [
"boilstream-admins@company.com",
"data-engineers@company.com"
]
}
Google STS Tokens (sts.googleapis.com)
Security Token Service (STS) tokens for service accounts:
{
"sub": "123456789012345678901",
"iss": "https://sts.googleapis.com",
"aud": "//iam.googleapis.com/projects/123456789/locations/global/workloadIdentityPools/my-pool/providers/my-provider",
"exp": 1735689600,
"iat": 1735686000,
"google": {
"compute_engine": {
"project_id": "my-project-123",
"zone": "us-central1-a",
"instance_id": "1234567890123456789"
}
}
}
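The two token types above can be told apart by their `iss` claim. The sketch below shows one way to do that on already-decoded claims; the function name `token_kind` is hypothetical, and how BoilStream itself dispatches on the issuer is not specified here. Google identity tokens may carry the issuer with or without the `https://` prefix, so both forms are accepted.

```python
# Issuer values for Google identity tokens (both forms occur in practice).
IDENTITY_ISSUERS = {"https://accounts.google.com", "accounts.google.com"}
# Issuer value shown in the STS example above.
STS_ISSUER = "https://sts.googleapis.com"


def token_kind(claims):
    """Classify decoded JWT claims as 'identity', 'sts', or 'unknown'."""
    issuer = claims.get("iss")
    if issuer in IDENTITY_ISSUERS:
        return "identity"
    if issuer == STS_ISSUER:
        return "sts"
    return "unknown"
```

A token classified as "unknown" should be rejected rather than passed to either validation path.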
Custom Claims
BoilStream supports configurable custom claims:
# Configure custom claim names
export GCP_GROUPS_CLAIM="company_groups"
export GCP_ROLES_CLAIM="company_roles"
{
"sub": "123456789012345678901",
"iss": "https://accounts.google.com",
"aud": "123456789-abc.apps.googleusercontent.com",
"email": "admin@company.com",
"company_groups": [
"platform-admins",
"data-engineers"
],
"company_roles": [
"BoilStreamAdmin",
"DataPlatformUser"
]
}
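To illustrate what configurable claim names mean in practice, here is a small sketch that reads groups and roles from decoded claims using whatever names GCP_GROUPS_CLAIM and GCP_ROLES_CLAIM hold. The helper names are hypothetical, and treating a single string value as a one-element list is an assumption, not documented BoilStream behavior.

```python
def extract_claim_list(claims, claim_name):
    """Return the claim as a list of strings; a missing claim yields []."""
    value = claims.get(claim_name)
    if value is None:
        return []
    if isinstance(value, str):
        # Assumption: a bare string is treated as a single entry.
        return [value]
    return [str(item) for item in value]


def groups_and_roles(claims, groups_claim="groups", roles_claim="roles"):
    """Read groups and roles using the configured claim names."""
    return (
        extract_claim_list(claims, groups_claim),
        extract_claim_list(claims, roles_claim),
    )
```

With the example token above, `groups_and_roles(claims, "company_groups", "company_roles")` returns the two lists from `company_groups` and `company_roles`.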
Authorization Mapping
BoilStream maps GCP claims to authorization context:
Google Workspace Integration
# Workspace groups -> BoilStream authorization
groups: ["boilstream-admins@company.com"] -> Admin privileges
groups: ["data-producers@company.com"] -> Write access
groups: ["data-analysts@company.com"] -> Read access
Custom Claims Mapping
# Custom claims -> BoilStream authorization
company_groups: ["platform-admins"] -> Admin privileges
company_roles: ["DataProducer"] -> Write access
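The mappings above amount to intersecting the token's groups with the comma-separated ADMIN_GROUPS, WRITE_GROUPS, and READ_ONLY_GROUPS lists. The following sketch makes that explicit. The function names are hypothetical, and the precedence (admin over write over read) is an assumption; the source does not state how overlapping memberships are resolved.

```python
def parse_group_list(value):
    """Split a comma-separated setting such as WRITE_GROUPS into a set."""
    return {item.strip() for item in value.split(",") if item.strip()}


def access_level(token_groups, admin_groups="", write_groups="", read_only_groups=""):
    """Return 'admin', 'write', 'read', or 'none' for a set of token groups.

    Assumption: the highest matching privilege wins.
    """
    groups = set(token_groups)
    if groups & parse_group_list(admin_groups):
        return "admin"
    if groups & parse_group_list(write_groups):
        return "write"
    if groups & parse_group_list(read_only_groups):
        return "read"
    return "none"
```

A user whose token lists none of the configured groups ends up with "none", which matches the "Authorization denied" case described under Troubleshooting.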
Domain Validation
When GCP_REQUIRE_WORKSPACE_DOMAIN is set, BoilStream validates the hd (hosted domain) claim:
{
"hd": "company.com", // Must match GCP_REQUIRE_WORKSPACE_DOMAIN
"email": "user@company.com"
}
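The check itself is a comparison between the token's `hd` claim and the configured domain. A minimal sketch follows; the function name `domain_allowed` is hypothetical and the case-insensitive comparison is an assumption. Consumer Google accounts carry no `hd` claim, so they fail the check whenever a domain is required.

```python
def domain_allowed(claims, required_domain):
    """Return True when the token's hosted domain satisfies the restriction.

    An empty or unset required_domain means no restriction is applied.
    """
    if not required_domain:
        return True
    hosted_domain = claims.get("hd")
    if not isinstance(hosted_domain, str):
        # No hd claim: not a Workspace or Cloud Identity account.
        return False
    # Assumption: domains are compared case-insensitively.
    return hosted_domain.lower() == required_domain.lower()
```

Relying on `hd` rather than on the suffix of the `email` claim avoids accepting addresses that merely look like they belong to the domain.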
Client Integration
Getting JWT Tokens
Use Google Auth libraries to obtain JWT tokens:
from google.auth.transport.requests import Request
from google.oauth2 import id_token
import google.auth
# Option 1: Service account
credentials, project = google.auth.default()
credentials.refresh(Request())
# Get ID token for service account
target_audience = "123456789-abc.apps.googleusercontent.com"
token = id_token.fetch_id_token(Request(), target_audience)
print(f"Bearer {token}")
# Option 2: User OAuth flow (requires web setup)
from google_auth_oauthlib.flow import Flow
flow = Flow.from_client_config(
    {
        "web": {
            "client_id": "123456789-abc.apps.googleusercontent.com",
            "client_secret": "your-secret",
            "auth_uri": "https://accounts.google.com/o/oauth2/auth",
            "token_uri": "https://oauth2.googleapis.com/token"
        }
    },
    scopes=["openid", "email", "profile"]
)
# Complete OAuth flow to get ID token
// Node.js equivalent using google-auth-library
const { GoogleAuth } = require('google-auth-library');
async function getToken() {
  const auth = new GoogleAuth({
    scopes: ['https://www.googleapis.com/auth/cloud-platform']
  });
  // Get ID token
  const client = await auth.getIdTokenClient('123456789-abc.apps.googleusercontent.com');
  const token = await client.idTokenProvider.fetchIdToken('123456789-abc.apps.googleusercontent.com');
  console.log(`Bearer ${token}`);
}
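# Get identity token for the current account
# (--audiences is accepted only for service account credentials; with a user
# account, add --impersonate-service-account=SA_EMAIL or omit --audiences)
gcloud auth print-identity-token \
--audiences="123456789-abc.apps.googleusercontent.com"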
# Activate service account and get token
gcloud auth activate-service-account --key-file=boilstream-key.json
gcloud auth print-identity-token \
--audiences="123456789-abc.apps.googleusercontent.com"
Using Tokens with BoilStream
# Set the token
export TOKEN="eyJhbGciOiJSUzI1NiIs..."
# Use with DuckDB Airport extension
duckdb -s "
INSTALL airport FROM community;
LOAD airport;
SET custom_user_agent = 'Bearer ${TOKEN}';
ATTACH 'boilstream' (TYPE AIRPORT, location 'grpc://localhost:50051/');
"
Security Considerations
✅ Best Practices
- Workspace Domain: Always set GCP_REQUIRE_WORKSPACE_DOMAIN for corporate environments
- Service Account Keys: Protect service account keys, use Workload Identity when possible
- Token Scope: Use minimal scopes required for functionality
- Custom Claims: Use custom claims for fine-grained authorization control
⚠️ Security Warnings
- Public Google Accounts: Without domain restrictions, any Google account can authenticate
- Token Validation: Always validate audience and issuer claims
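- STS Token Risks: STS tokens identify workloads rather than users and, as in the example above, carry no email or hd claim to check; consider setting GCP_ALLOW_STS_TOKENS="false" if you do not use them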
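The issuer and audience checks named above can be sketched as a function over already-decoded claims. This is an illustration only: the name `check_identity_claims` is hypothetical, it covers identity tokens but not STS tokens, and it does not verify the signature, which must be checked against Google's public keys before any claim is trusted.

```python
import time

GOOGLE_IDENTITY_ISSUERS = ("https://accounts.google.com", "accounts.google.com")


def check_identity_claims(claims, expected_audience, now=None):
    """Return a list of problems found in decoded identity-token claims.

    An empty list means issuer, audience, and expiry all look valid.
    Signature verification is out of scope for this sketch.
    """
    now = time.time() if now is None else now
    problems = []
    if claims.get("iss") not in GOOGLE_IDENTITY_ISSUERS:
        problems.append("unexpected issuer")
    if claims.get("aud") != expected_audience:
        problems.append("audience mismatch")
    expires_at = claims.get("exp")
    if not isinstance(expires_at, (int, float)) or expires_at <= now:
        problems.append("token expired or exp missing")
    return problems
```

In client code that already depends on the google-auth package, `google.oauth2.id_token.verify_oauth2_token(token, Request(), audience)` performs the signature, expiry, and audience checks in one call.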
Network Security
# Ensure BoilStream can reach Google JWKS endpoint
curl -v "https://www.googleapis.com/oauth2/v3/certs"
# Expected response: JSON with RSA public keys
{
"keys": [
{
"kty": "RSA",
"alg": "RS256",
"use": "sig",
"kid": "abc123...",
"n": "xyz789...",
"e": "AQAB"
}
]
}
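The same connectivity check can be run from Python with only the standard library, which is handy inside minimal containers that lack curl. This is a sketch; the function names are hypothetical, and it only lists the key IDs in the response rather than verifying any token.

```python
import json
import urllib.request

GOOGLE_JWKS_URL = "https://www.googleapis.com/oauth2/v3/certs"


def fetch_jwks(url=GOOGLE_JWKS_URL, timeout=10):
    """Download and parse the JWKS document (requires outbound HTTPS)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def signing_key_ids(jwks):
    """Return the key IDs of RSA signing keys in a parsed JWKS document."""
    return [
        key["kid"]
        for key in jwks.get("keys", [])
        if key.get("kty") == "RSA" and key.get("use", "sig") == "sig" and "kid" in key
    ]
```

Running `signing_key_ids(fetch_jwks())` should return a non-empty list; the `kid` in a token's header should match one of the returned IDs.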
Troubleshooting
Common Issues
"Authentication failed" errors:
# Check client ID and project configuration
gcloud projects describe "my-project-123"
# Verify JWKS endpoint accessibility
curl "https://www.googleapis.com/oauth2/v3/certs"
"Authorization denied" errors:
# Check Google Workspace group membership
# (requires admin access to Workspace)
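# Verify custom claims in token (the payload is base64url, so map its
# alphabet first; base64 may still warn when padding is missing)
echo "$TOKEN" | cut -d'.' -f2 | tr '_-' '/+' | base64 -d 2>/dev/null | jq .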
Domain validation failures:
# Check hosted domain claim in token (payload is base64url-encoded)
echo "$TOKEN" | cut -d'.' -f2 | tr '_-' '/+' | base64 -d 2>/dev/null | jq .hd
# Verify domain configuration
echo "$GCP_REQUIRE_WORKSPACE_DOMAIN"
Debug Logging
Enable detailed authentication logging:
export RUST_LOG="boilstream::auth::gcp=debug,boilstream::auth::manager=debug"
Token Inspection
# Decode Google ID token payload (base64url-encoded, unpadded)
echo "$TOKEN" | cut -d'.' -f2 | tr '_-' '/+' | base64 -d 2>/dev/null | jq .
# Check token with Google's tokeninfo endpoint
curl "https://oauth2.googleapis.com/tokeninfo?id_token=$TOKEN"
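JWT segments use the URL-safe base64 alphabet with padding stripped, which shell `base64` tools do not always accept. Where that is a problem, a few lines of Python decode the payload reliably. The function name `decode_jwt_payload` is hypothetical, and the sketch only decodes the claims for inspection without verifying the signature.

```python
import base64
import json


def decode_jwt_payload(token):
    """Decode the claims of a JWT for inspection (no signature check)."""
    segments = token.split(".")
    if len(segments) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = segments[1]
    # Restore the padding that the JWT encoding strips.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))
```

For example, `decode_jwt_payload(token).get("hd")` shows the hosted domain claim, or `None` when the token has none.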
Advanced Configuration
Workload Identity
Use Workload Identity for GKE deployments:
# Kubernetes service account annotation
apiVersion: v1
kind: ServiceAccount
metadata:
  name: boilstream
  annotations:
    iam.gke.io/gcp-service-account: boilstream-service@PROJECT_ID.iam.gserviceaccount.com
# Bind Kubernetes and Google service accounts
gcloud iam service-accounts add-iam-policy-binding \
--role roles/iam.workloadIdentityUser \
--member "serviceAccount:PROJECT_ID.svc.id.goog[NAMESPACE/boilstream]" \
boilstream-service@PROJECT_ID.iam.gserviceaccount.com
Custom Token Validation
For advanced use cases, implement custom token validation:
# Configure custom issuer validation
export GCP_CUSTOM_ISSUER="https://your-custom-issuer.com"
# Use custom JWKS endpoint
export GCP_CUSTOM_JWKS_URL="https://your-domain.com/.well-known/jwks.json"
Directory API Integration
Integrate with Google Workspace Directory API for dynamic group lookup:
- Enable Directory API in Google Cloud Console
- Grant service account domain-wide delegation
- Configure custom claims to fetch groups dynamically
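# Example: Fetch the groups a user belongs to from the Directory API
# (`credentials` must be delegated credentials with a Directory read scope)
from googleapiclient.discovery import build
service = build('admin', 'directory_v1', credentials=credentials)
# userKey restricts the listing to groups that contain this user
groups = service.groups().list(userKey='user@company.com').execute()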
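Directory API group listings are paginated: each response holds a `groups` array and, when more results exist, a `nextPageToken`. The sketch below collects group email addresses across pages. The names `collect_group_emails` and `fetch_page` are hypothetical; `fetch_page` stands in for whatever call returns one response page as a dict, which keeps the helper independent of the API client.

```python
def collect_group_emails(fetch_page):
    """Gather group email addresses from every page of a groups listing.

    fetch_page(page_token) must return one response dict; page_token is
    None for the first page.
    """
    emails = []
    page_token = None
    while True:
        response = fetch_page(page_token)
        emails.extend(
            group["email"] for group in response.get("groups", []) if "email" in group
        )
        page_token = response.get("nextPageToken")
        if not page_token:
            return emails
```

With the google-api-python-client service object, `fetch_page` could be `lambda token: service.groups().list(userKey="user@company.com", pageToken=token).execute()`. The resulting addresses are in the same form as the entries in ADMIN_GROUPS, WRITE_GROUPS, and READ_ONLY_GROUPS.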
Google Workspace Integration
Gmail/Calendar Integration
Combine BoilStream with Google Workspace data:
-- Example: Stream calendar events
COPY (
SELECT * FROM read_json('calendar_export.json')
) TO 'boilstream.s3.calendar_events';
Google Sheets Integration
Stream Google Sheets data through BoilStream:
-- Stream Google Sheets data
COPY (
SELECT * FROM read_csv('sheets_export.csv')
) TO 'boilstream.s3.sheets_data';
Next Steps
- AWS Cognito Integration - Add AWS identity support
- Azure AD Integration - Add Microsoft identity support
- Troubleshooting Guide - Debug authentication issues