Python SDK
opensearch-genai-observability-sdk-py instruments Python AI agent applications using standard OpenTelemetry. It configures the OTEL pipeline in one call, provides a unified observe() primitive for tracing agents and tools, and enriches spans with GenAI semantic convention attributes automatically.
- PyPI: opensearch-genai-observability-sdk-py
- Python: 3.10+
- Source: github.com/opensearch-project/genai-observability-sdk-py
Installation
```shell
pip install opensearch-genai-observability-sdk-py
```

Auto-instrumentation for LLM providers is opt-in:

```shell
pip install "opensearch-genai-observability-sdk-py[openai]"
pip install "opensearch-genai-observability-sdk-py[anthropic]"
pip install "opensearch-genai-observability-sdk-py[bedrock]"
pip install "opensearch-genai-observability-sdk-py[langchain]"
pip install "opensearch-genai-observability-sdk-py[llamaindex]"
pip install "opensearch-genai-observability-sdk-py[otel-instrumentors]"  # all providers
pip install "opensearch-genai-observability-sdk-py[opensearch]"          # trace retrieval
pip install "opensearch-genai-observability-sdk-py[all]"                 # everything
```

API overview
The SDK exports these functions and classes. This page covers the instrumentation APIs. Evaluation APIs are documented in Evaluation & Scoring.
| Export | Purpose | Docs |
|---|---|---|
| register() | Configure OTEL pipeline | This page |
| observe() | Trace agents, tools, LLM calls | This page |
| Op | Operation name constants | This page |
| enrich() | Set GenAI attributes on active span | This page |
| score() | Attach evaluation scores to traces | This page |
| AWSSigV4OTLPExporter | SigV4-signed OTLP exporter | This page |
| evaluate() | Run agent against dataset with scorers | Evaluation & Scoring |
| Experiment | Upload pre-computed eval results | Evaluation & Scoring |
| EvalScore | Scorer return type | Evaluation & Scoring |
| OpenSearchTraceRetriever | Query stored traces from OpenSearch | Evaluation & Scoring |
Quick start
```python
from opensearch_genai_observability_sdk_py import register, observe, Op, enrich

register(endpoint="http://localhost:4318/v1/traces", service_name="my-agent")

@observe(op=Op.EXECUTE_TOOL)
def get_weather(city: str) -> dict:
    return {"city": city, "temp": 22, "condition": "sunny"}

@observe(op=Op.INVOKE_AGENT)
def assistant(query: str) -> str:
    enrich(model="gpt-4o", provider="openai")
    data = get_weather("Paris")
    return f"{data['condition']}, {data['temp']}C"

result = assistant("What's the weather?")
```

register()
Configures the OTEL tracing pipeline. Call once at startup, before any tracing occurs.
```python
from opensearch_genai_observability_sdk_py import register

register(
    endpoint="http://localhost:4318/v1/traces",
    service_name="my-app",
)
```

| Parameter | Type | Default | Description |
|---|---|---|---|
| endpoint | str | Data Prepper default | OTLP endpoint URL. Reads OTEL_EXPORTER_OTLP_TRACES_ENDPOINT or OTEL_EXPORTER_OTLP_ENDPOINT if not set. |
| protocol | "http" or "grpc" | inferred from URL | Force transport. grpc:// -> gRPC, grpcs:// -> gRPC+TLS, else HTTP. |
| service_name | str | "default" | Attached as service.name. Reads OTEL_SERVICE_NAME. |
| project_name | str | | Alias for service_name. Reads OPENSEARCH_PROJECT. |
| service_version | str | | Sets service.version. Reads OTEL_SERVICE_VERSION. |
| batch | bool | True | True = BatchSpanProcessor, False = SimpleSpanProcessor. |
| auto_instrument | bool | True | Discover and activate installed OTel instrumentor packages. |
| exporter | SpanExporter | | Custom exporter. Overrides endpoint, protocol, and headers. |
| set_global | bool | True | Register as the global TracerProvider. |
| headers | dict | | Additional HTTP headers for the OTLP exporter. |
Returns the configured TracerProvider.
Endpoint schemes
| URL scheme | Transport |
|---|---|
| http:// or https:// | OTLP HTTP (default) |
| grpc:// | OTLP gRPC, insecure |
| grpcs:// | OTLP gRPC with TLS |
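The scheme-to-transport rule above is easy to mirror in plain Python. This is an illustrative sketch only - infer_protocol is not an SDK function - returning the transport and whether TLS is used:

```python
from urllib.parse import urlparse

def infer_protocol(endpoint: str) -> tuple[str, bool]:
    """Map a URL scheme to (transport, tls), mirroring the table above."""
    scheme = urlparse(endpoint).scheme
    if scheme == "grpc":
        return ("grpc", False)   # OTLP gRPC, insecure
    if scheme == "grpcs":
        return ("grpc", True)    # OTLP gRPC with TLS
    return ("http", scheme == "https")  # default: OTLP HTTP

print(infer_protocol("grpc://localhost:4317"))          # ('grpc', False)
print(infer_protocol("https://example.com/v1/traces"))  # ('http', True)
```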
Examples
```python
# Self-hosted with Data Prepper
register(service_name="my-agent")

# OTel Collector on localhost
register(endpoint="http://localhost:4318/v1/traces", service_name="my-agent")

# gRPC
register(endpoint="grpc://localhost:4317", service_name="my-agent")

# AWS OpenSearch Ingestion with SigV4
from opensearch_genai_observability_sdk_py import AWSSigV4OTLPExporter

exporter = AWSSigV4OTLPExporter(
    endpoint="https://pipeline.us-east-1.osis.amazonaws.com/v1/traces",
    service="osis",
    region="us-east-1",
)
register(service_name="my-agent", exporter=exporter)
```

observe()
The unified tracing primitive. Works as a decorator (sync, async, generator, async generator) and as a context manager.
Usage forms
```python
# Bare decorator - span name = function qualname
@observe
def my_function(): ...

# Parameterized - set operation type, name, span kind
@observe(name="weather_agent", op=Op.INVOKE_AGENT)
def run_agent(query: str) -> str: ...

# Context manager - for inline tracing blocks
with observe("llm_call", op=Op.CHAT) as span:
    response = llm.chat(messages)
```

Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| name | str | function __qualname__ | Span entity name. |
| op | str | | Sets gen_ai.operation.name. Span name becomes "{op} {name}". |
| kind | SpanKind | INTERNAL | OTel SpanKind. |
| name_from | str | | Function parameter whose runtime value becomes the span name. |
Op constants
| Constant | Value | Use for |
|---|---|---|
| Op.INVOKE_AGENT | "invoke_agent" | Agent invocations and orchestration |
| Op.EXECUTE_TOOL | "execute_tool" | Tool/function calls |
| Op.CHAT | "chat" | LLM chat completions |
| Op.CREATE_AGENT | "create_agent" | Agent initialization |
| Op.RETRIEVAL | "retrieval" | RAG retrieval |
| Op.EMBEDDINGS | "embeddings" | Embedding generation |
| Op.GENERATE_CONTENT | "generate_content" | Content generation |
| Op.TEXT_COMPLETION | "text_completion" | Text completions |
Any custom string also works for op.
Automatic behavior
In decorator mode, observe() automatically:
- Captures input as gen_ai.input.messages (or gen_ai.tool.call.arguments for tools). Skips self/cls.
- Captures output as gen_ai.output.messages (or gen_ai.tool.call.result for tools). Won't overwrite if already set.
- Records errors as span status ERROR with an exception event.
- Sets entity attributes: gen_ai.agent.name for agents, gen_ai.tool.name + gen_ai.tool.type="function" for tools.
All captured values are truncated at 10,000 characters.
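To make the capture rules concrete, here is a stdlib-only sketch of decorator-mode behavior: argument capture that skips self/cls, result capture, and the 10,000-character truncation. capture_sketch and its last_attrs attribute are illustrative only; the real @observe writes these values onto an OTel span instead:

```python
import functools
import inspect

MAX_LEN = 10_000  # documented limit: captured values truncated at 10,000 characters

def capture_sketch(fn):
    """Illustrative stand-in for @observe's decorator-mode capture logic."""
    sig = inspect.signature(fn)

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        # Capture inputs, skipping self/cls, truncating each value
        inputs = {k: repr(v)[:MAX_LEN] for k, v in bound.arguments.items()
                  if k not in ("self", "cls")}
        attrs = {"gen_ai.input.messages": inputs}
        result = fn(*args, **kwargs)
        attrs["gen_ai.output.messages"] = repr(result)[:MAX_LEN]
        wrapper.last_attrs = attrs  # a real span would receive these attributes
        return result
    return wrapper

@capture_sketch
def get_weather(city: str) -> dict:
    return {"city": city, "temp": 22}

get_weather("Paris")
print(get_weather.last_attrs["gen_ai.input.messages"])  # {'city': "'Paris'"}
```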
Span hierarchy
```mermaid
flowchart TD
    A["@observe op=INVOKE_AGENT"] --> B["@observe op=CHAT - LLM call"]
    A --> C["@observe op=EXECUTE_TOOL - tool call"]
    C --> D["LLM call (auto-instrumented)"]
```
Dispatcher pattern
When the tool name is only known at call time:
```python
@observe(op=Op.EXECUTE_TOOL, name_from="tool_name")
def execute_tool(self, tool_name: str, arguments: dict) -> dict:
    return self._tools[tool_name](**arguments)

# Produces: "execute_tool web_search", "execute_tool calculator", etc.
```

Async support
```python
@observe(op=Op.EXECUTE_TOOL)
async def async_search(query: str) -> list:
    return await search_api.query(query)
```

enrich()
Adds GenAI semantic convention attributes to the currently active span. Call it inside @observe-decorated functions or with observe(...) blocks.
```python
@observe(op=Op.CHAT, name="llm_call")
def call_llm(messages: list) -> str:
    response = openai.chat.completions.create(model="gpt-4o", messages=messages)
    enrich(
        model="gpt-4o",
        provider="openai",
        input_tokens=response.usage.prompt_tokens,
        output_tokens=response.usage.completion_tokens,
        finish_reason=response.choices[0].finish_reason,
    )
    return response.choices[0].message.content
```

Parameter-to-attribute mapping
| Parameter | OTel Attribute |
|---|---|
| model | gen_ai.request.model |
| provider | gen_ai.provider.name |
| input_tokens | gen_ai.usage.input_tokens |
| output_tokens | gen_ai.usage.output_tokens |
| total_tokens | gen_ai.usage.total_tokens |
| response_id | gen_ai.response.id |
| finish_reason | gen_ai.response.finish_reasons |
| temperature | gen_ai.request.temperature |
| max_tokens | gen_ai.request.max_tokens |
| session_id | gen_ai.conversation.id |
| agent_id | gen_ai.agent.id |
| agent_description | gen_ai.agent.description |
| tool_definitions | gen_ai.tool.definitions |
| system_instructions | gen_ai.system_instructions |
| input_messages | gen_ai.input.messages |
| output_messages | gen_ai.output.messages |
| **extra | key used as-is |
All parameters are optional. Only provided values are set.
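The mapping can be pictured as a simple dictionary lookup. ENRICH_MAP and enrich_sketch below are illustrative (abridged to a few rows), not the SDK's internals:

```python
# Hypothetical mirror of a few rows from the parameter-to-attribute table above.
ENRICH_MAP = {
    "model": "gen_ai.request.model",
    "provider": "gen_ai.provider.name",
    "input_tokens": "gen_ai.usage.input_tokens",
    "output_tokens": "gen_ai.usage.output_tokens",
    "session_id": "gen_ai.conversation.id",
}

def enrich_sketch(**kwargs):
    """Translate keyword arguments to OTel attribute names; unknown keys pass through
    as-is (the documented **extra behavior), and None values are skipped."""
    return {ENRICH_MAP.get(k, k): v for k, v in kwargs.items() if v is not None}

attrs = enrich_sketch(model="gpt-4o", input_tokens=12, custom_tag="demo")
print(attrs)
# {'gen_ai.request.model': 'gpt-4o', 'gen_ai.usage.input_tokens': 12, 'custom_tag': 'demo'}
```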
Auto-instrumentation
register() discovers and activates installed instrumentor packages. Install the extra for your provider - no code changes needed.
| Provider / framework | Extra |
|---|---|
| OpenAI, OpenAI Agents | [openai] |
| Anthropic | [anthropic] |
| Amazon Bedrock | [bedrock] |
| LangChain | [langchain] |
| LlamaIndex | [llamaindex] |
| Cohere | [cohere] |
| Mistral | [mistral] |
| Groq | [groq] |
| Ollama | [ollama] |
| Google Generative AI + Vertex AI | [google] |
| All of the above + 20 more | [otel-instrumentors] |
```python
register(auto_instrument=False)  # to disable
```

score()
Attaches an evaluation score to a trace or span. Scores are emitted as OTEL spans through the same OTLP pipeline - no separate client or index needed.
```python
from opensearch_genai_observability_sdk_py import score

# Score an entire trace
score(
    name="relevance",
    value=0.92,
    trace_id="abc123...",
    explanation="Response addresses the user's query",
)

# Score a specific span
score(
    name="accuracy",
    value=0.95,
    trace_id="abc123...",
    span_id="def456...",
    label="pass",
)
```

| Parameter | Type | Description |
|---|---|---|
| name | str | Metric name, e.g. "relevance", "factuality". |
| value | float | Numeric score. |
| trace_id | str | Hex trace ID to score. Omit for standalone scores. |
| span_id | str | Hex span ID for span-level scoring. |
| label | str | Human-readable label, e.g. "pass". |
| explanation | str | Evaluator rationale (truncated to 500 chars). |
| response_id | str | LLM completion ID for correlation. |
| attributes | dict | Additional span attributes. |
For running evaluations at scale (evaluate(), Experiment, OpenSearchTraceRetriever), see Evaluation & Scoring.
AWS authentication
For AWS-hosted endpoints, use AWSSigV4OTLPExporter to sign requests with SigV4:
```python
from opensearch_genai_observability_sdk_py import AWSSigV4OTLPExporter, register

exporter = AWSSigV4OTLPExporter(
    endpoint="https://pipeline.us-east-1.osis.amazonaws.com/v1/traces",
    service="osis",      # "osis" for OSIS, "es" for OpenSearch Service
    region="us-east-1",  # auto-detected if omitted
)
register(service_name="my-agent", exporter=exporter)
```

Credentials are resolved in order: AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY, then ~/.aws/credentials, then IAM role/IMDS.
Environment variables
| Variable | Description | Default |
|---|---|---|
| OTEL_EXPORTER_OTLP_TRACES_ENDPOINT | OTLP traces endpoint | |
| OTEL_EXPORTER_OTLP_ENDPOINT | OTLP endpoint (appends /v1/traces) | Data Prepper default |
| OTEL_SERVICE_NAME | Service name | "default" |
| OPENSEARCH_PROJECT | Project name (fallback) | "default" |
| OTEL_SERVICE_VERSION | Service version | |
Related links
- AI Observability - Getting Started - end-to-end walkthrough
- Evaluation & Scoring - score traces, run experiments
- Trace Retrieval - query stored traces from OpenSearch
- Agent Traces - viewing traces in OpenSearch Dashboards
- GenAI semantic conventions - OTel spec reference