Why you need LLM observability
When you integrate LLMs into your application, you face challenges that traditional monitoring tools don’t cover:
- Cost tracking - LLM API calls are billed per token. Without visibility, costs can spiral quickly and unexpectedly.
- Debugging - When an LLM returns unexpected output, you need to see the exact prompt, parameters, and response to understand what went wrong.
- Performance - Response times vary significantly by model, prompt length, and provider. You need data to optimize.
- Reliability - API errors, rate limits, and content filter blocks happen. You need to know when and why.
What AmberTrace provides
- Zero-code integration - Add two lines of code (`import` + `init()`) and all your LLM calls are traced automatically. No decorators, no wrappers, no middleware. See the sketch after this list.
- Multi-provider support - Works with OpenAI, Anthropic, and Google Gemini simultaneously. One SDK covers all your providers.
- Unified trace format - All providers are normalized to a consistent format, so you can compare across models and providers.
- Web portal - View traces, filter by provider/model/status, track token usage, and monitor success rates from a web dashboard.
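
A minimal sketch of the two-line setup described above. `ambertrace.init()` is taken from this document; any configuration arguments it accepts are not shown here.

```python
# The two lines: import the SDK and initialize it once at application startup.
import ambertrace

ambertrace.init()

# From this point on, calls made through the OpenAI, Anthropic, and Google Gemini
# SDKs are traced automatically; no decorators, wrappers, or middleware are needed.
```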
Supported providers
| Provider | Models |
|---|---|
| OpenAI | GPT-4, GPT-4o, GPT-3.5-turbo, etc. |
| Anthropic | Claude Opus 4.5, Sonnet 4.5, Haiku, etc. |
| Google Gemini | Gemini Pro, Gemini Flash, Gemini 2.0, etc. |
How it works
- Initialize - Call `ambertrace.init()` at application startup. The SDK detects your installed LLM libraries and wraps their API methods transparently.
- Use your LLM SDKs normally - Make API calls as you always do. The wrapper captures request parameters, calls the original method unchanged, then captures the response or error. A sketch of this flow follows the list.
- View traces in the portal - Trace data is sent to the AmberTrace backend asynchronously. Open the web portal to see every call, its parameters, response, token usage, and timing.
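
A sketch of the end-to-end flow, assuming the OpenAI Python SDK is installed. The model name and prompt are illustrative only; `ambertrace.init()` is the only AmberTrace call documented here.

```python
import ambertrace
from openai import OpenAI

# 1. Initialize once at startup; installed LLM libraries are detected and wrapped.
ambertrace.init()

# 2. Use your LLM SDK exactly as you normally would.
client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize LLM observability in one sentence."}],
)
print(response.choices[0].message.content)

# 3. The request parameters, response, token usage, and timing for this call are
#    sent to the AmberTrace backend in the background and appear in the web portal.
```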
Design guarantees
AmberTrace is designed to be invisible to your application:
- Never breaks your code - All tracing errors are caught internally and logged. Exceptions from your LLM provider are re-raised unchanged (see the sketch after this list).
- Non-blocking - Traces are sent in background threads/tasks. Your LLM calls never wait for trace delivery.
- Provider isolation - If tracing fails for one provider, other providers continue working normally.
- Minimal overhead - Trace collection adds ~1-2ms per call (UUID generation + timestamp). Network I/O happens in the background.
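
To make these guarantees concrete, here is a sketch of the wrapping pattern they describe. This is not AmberTrace's actual implementation; the helper names (`traced`, `_deliver`) and trace fields are hypothetical.

```python
import functools
import logging
import threading
import time
import uuid

logger = logging.getLogger("ambertrace")


def _deliver(trace: dict) -> None:
    # Hypothetical: POST the trace to the AmberTrace backend. Runs in a daemon
    # thread, so the wrapped LLM call never waits on trace delivery.
    pass


def traced(provider: str, original):
    """Wrap a provider method so tracing can never change its behavior."""

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        trace = {"id": str(uuid.uuid4()), "provider": provider, "start": time.time()}
        try:
            result = original(*args, **kwargs)  # call the real SDK method unchanged
            trace["status"] = "success"
            return result
        except Exception as exc:
            trace["status"] = "error"
            trace["error"] = repr(exc)
            raise  # provider exceptions are re-raised unchanged
        finally:
            try:
                trace["duration_ms"] = (time.time() - trace["start"]) * 1000
                threading.Thread(target=_deliver, args=(trace,), daemon=True).start()
            except Exception:
                # Tracing failures are logged internally and never reach your code.
                logger.debug("ambertrace: failed to record trace", exc_info=True)

    return wrapper
```

The background thread keeps trace delivery off the request path, which is how the non-blocking and minimal-overhead guarantees above are typically achieved.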