Deployment guide
Production deployment patterns and recommendations.
SDK in production
- Set `timeout` to 30s or higher for LLM calls
- Enable retry logic (SDK v2 does this by default with exponential backoff)
- Use connection pooling — create one Client instance per process
- Cache prompt definitions (they rarely change) — SDK does this automatically for 5 minutes
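The retry behaviour described above can be sketched in plain Python. This is a minimal illustration of exponential backoff, not the SDK's actual implementation; `with_retries` and its parameters are hypothetical names.

```python
import random
import time

def with_retries(fn, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Retry fn with exponential backoff plus jitter — a sketch of the
    behaviour this guide attributes to SDK v2, not the real code."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_retries:
                raise  # budget exhausted, surface the last error
            # delay doubles each attempt (1s, 2s, 4s, ...) with small jitter
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

Any flaky network call can be wrapped the same way, e.g. `with_retries(lambda: client.run(prompt))`.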
Environment variables
```
PH_API_KEY=...
PH_BASE_URL=https://api.mlpipeline-cloud.com/v1
PH_TIMEOUT_SECONDS=30
PH_MAX_RETRIES=3
```
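One way to consume these variables is a small settings object that mirrors the defaults above. The `PH_*` names come from this guide; the loader itself is only a sketch, not part of the SDK.

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class Settings:
    api_key: str
    base_url: str = "https://api.mlpipeline-cloud.com/v1"
    timeout_seconds: int = 30
    max_retries: int = 3

def load_settings(env=os.environ):
    # PH_API_KEY is required; the rest fall back to the defaults above
    return Settings(
        api_key=env["PH_API_KEY"],
        base_url=env.get("PH_BASE_URL", Settings.base_url),
        timeout_seconds=int(env.get("PH_TIMEOUT_SECONDS", Settings.timeout_seconds)),
        max_retries=int(env.get("PH_MAX_RETRIES", Settings.max_retries)),
    )
```

Reading the environment once at startup (rather than per request) also pairs naturally with the one-client-per-process recommendation above.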
Observability
Enable request logging:
```python
client = Client(api_key="...", debug=True)
# Logs to stderr with trace IDs
```
Forward metrics to Datadog:
```python
from promptlayer_hub.integrations import DatadogMiddleware

client.add_middleware(DatadogMiddleware(statsd_host="localhost"))
```
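The middleware hook can be illustrated with a plain-Python sketch. `add_middleware` follows the snippet above, but everything else here (the `on_request` callback, the stand-in `Client`) is an assumption, not the real SDK internals.

```python
import time

class TimingMiddleware:
    """Hypothetical middleware that records request latency in-process;
    a real integration would forward these samples to statsd/Datadog."""
    def __init__(self):
        self.samples = []

    def on_request(self, name, elapsed_s):
        self.samples.append((name, elapsed_s))

class Client:
    # minimal stand-in exposing add_middleware like the guide's SDK
    def __init__(self):
        self._middleware = []

    def add_middleware(self, mw):
        self._middleware.append(mw)

    def run(self, name, fn):
        start = time.perf_counter()
        try:
            return fn()
        finally:
            # notify every registered middleware, even on failure
            elapsed = time.perf_counter() - start
            for mw in self._middleware:
                mw.on_request(name, elapsed)
```

Running the timing in a `finally` block means failed requests are measured too, which is usually what you want from latency metrics.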
Cost controls
- Set per-user and per-workspace budget limits in dashboard
- Enable cost alerts (email + webhook) at 80% / 100% of budget
- Use prompt caching where possible — reduces token usage by 30-70%
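The 80% / 100% alert rule might be evaluated like this. This is a sketch of the dashboard logic described above; actual delivery (email + webhook) happens server-side and is omitted.

```python
def budget_alerts(spend, budget, thresholds=(0.8, 1.0)):
    """Return which alert thresholds the current spend has crossed.
    Sketch of the 80%/100% cost-alert rule; not the dashboard's code."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    return [t for t in thresholds if spend >= t * budget]
```

For example, `budget_alerts(85, 100)` reports that the 80% threshold has been crossed, while a spend at or above the full budget trips both alerts.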
High availability
Enterprise plan: multi-region deployment with automatic failover, recovery point objective (RPO) under 1 minute and recovery time objective (RTO) under 5 minutes. Contact sales for details.