# Python SDK
opensearch-genai-observability-sdk-py instruments Python AI agent applications using standard OpenTelemetry. It configures the OTEL pipeline in one call, provides a unified observe() primitive for tracing agents and tools, and enriches spans with GenAI semantic convention attributes automatically.
- PyPI: `opensearch-genai-observability-sdk-py`
- Python: 3.10+
- Source: github.com/opensearch-project/genai-observability-sdk-py
## Installation

```shell
pip install opensearch-genai-observability-sdk-py
```

Auto-instrumentation for LLM providers is opt-in:

```shell
pip install "opensearch-genai-observability-sdk-py[openai]"
pip install "opensearch-genai-observability-sdk-py[anthropic]"
pip install "opensearch-genai-observability-sdk-py[bedrock]"
pip install "opensearch-genai-observability-sdk-py[google]"
pip install "opensearch-genai-observability-sdk-py[langchain]"
pip install "opensearch-genai-observability-sdk-py[llamaindex]"
pip install "opensearch-genai-observability-sdk-py[otel-instrumentors]"  # all providers
pip install "opensearch-genai-observability-sdk-py[opensearch]"          # trace retrieval
pip install "opensearch-genai-observability-sdk-py[all]"                 # everything (instrumentors + dev)
```

## API overview

The SDK exports these functions and classes. This page covers the instrumentation APIs. Evaluation APIs are documented in Evaluation & Scoring.
| Export | Purpose | Docs |
|---|---|---|
| `register()` | Configure OTEL pipeline | This page |
| `observe()` | Trace agents, tools, LLM calls | This page |
| `Op` | Operation name constants | This page |
| `enrich()` | Set GenAI attributes on active span | This page |
| `score()` | Attach evaluation scores to traces | This page |
| `AWSSigV4OTLPExporter` | SigV4-signed OTLP exporter | This page |
| `evaluate()` | Run agent against dataset with scorers | Evaluation & Scoring |
| `Benchmark` | Upload pre-computed eval results | Evaluation & Scoring |
| `EvalScore` | Scorer return type | Evaluation & Scoring |
| `BenchmarkResult` | Result from `evaluate()` | Evaluation & Scoring |
| `BenchmarkSummary` | Aggregate score statistics | Evaluation & Scoring |
| `TestCaseResult` | Per-case result | Evaluation & Scoring |
| `ScoreSummary` | Per-metric statistics | Evaluation & Scoring |
| `OpenSearchTraceRetriever` | Query stored traces from OpenSearch | Evaluation & Scoring |
## Quick start

```python
from opensearch_genai_observability_sdk_py import register, observe, Op, enrich

register(endpoint="http://localhost:4318/v1/traces", service_name="my-agent")

@observe(op=Op.EXECUTE_TOOL)
def get_weather(city: str) -> dict:
    return {"city": city, "temp": 22, "condition": "sunny"}

@observe(op=Op.INVOKE_AGENT)
def assistant(query: str) -> str:
    enrich(model="gpt-4o", provider="openai")
    data = get_weather("Paris")
    return f"{data['condition']}, {data['temp']}C"

result = assistant("What's the weather?")
```

## register()

Configures the OTEL tracing pipeline. Call it once at startup, before any tracing occurs.
```python
from opensearch_genai_observability_sdk_py import register

register(
    endpoint="http://localhost:4318/v1/traces",
    service_name="my-app",
)
```

| Parameter | Type | Default | Description |
|---|---|---|---|
| `endpoint` | `str` | `http://localhost:21890/opentelemetry/v1/traces` | OTLP endpoint URL. Reads `OTEL_EXPORTER_OTLP_TRACES_ENDPOINT` or `OTEL_EXPORTER_OTLP_ENDPOINT` if not set. |
| `protocol` | `"http"` \| `"grpc"` | inferred from URL | Force transport. `grpc://` -> gRPC, `grpcs://` -> gRPC+TLS, else HTTP. |
| `service_name` | `str` | `"default"` | Attached as `service.name`. Reads `OTEL_SERVICE_NAME`. |
| `project_name` | `str` | | Alias for `service_name`. Reads `OPENSEARCH_PROJECT`. |
| `service_version` | `str` | | Sets `service.version`. Reads `OTEL_SERVICE_VERSION`. |
| `batch` | `bool` | `True` | `True` = `BatchSpanProcessor`, `False` = `SimpleSpanProcessor`. |
| `auto_instrument` | `bool` | `True` | Discover and activate installed OTel instrumentor packages. |
| `exporter` | `SpanExporter` | | Custom exporter. Overrides `endpoint`, `protocol`, and `headers`. |
| `set_global` | `bool` | `True` | Register as the global `TracerProvider`. |
| `headers` | `dict` | | Additional HTTP headers for the OTLP exporter. |
Returns the configured `TracerProvider`.
### Endpoint schemes

| URL scheme | Transport |
|---|---|
| `http://` or `https://` | OTLP HTTP (default) |
| `grpc://` | OTLP gRPC, insecure |
| `grpcs://` | OTLP gRPC with TLS |
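The scheme-to-transport mapping above can be illustrated with a small pure-Python sketch. `infer_transport` is a hypothetical helper that mirrors the table, not the SDK's actual implementation:

```python
from urllib.parse import urlparse

def infer_transport(endpoint: str) -> str:
    """Map an endpoint URL scheme to a transport, per the table above."""
    scheme = urlparse(endpoint).scheme
    if scheme == "grpc":
        return "grpc (insecure)"
    if scheme == "grpcs":
        return "grpc (TLS)"
    # http:// and https:// both use OTLP HTTP
    return "http"

infer_transport("grpc://localhost:4317")        # grpc (insecure)
infer_transport("https://collector/v1/traces")  # http
```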
### Examples

```python
# Self-hosted with Data Prepper
register(service_name="my-agent")

# OTel Collector on localhost
register(endpoint="http://localhost:4318/v1/traces", service_name="my-agent")

# gRPC
register(endpoint="grpc://localhost:4317", service_name="my-agent")

# AWS OpenSearch Ingestion with SigV4
from opensearch_genai_observability_sdk_py import AWSSigV4OTLPExporter

exporter = AWSSigV4OTLPExporter(
    endpoint="https://pipeline.us-east-1.osis.amazonaws.com/v1/traces",
    service="osis",
    region="us-east-1",
)
register(service_name="my-agent", exporter=exporter)
```

## observe()

The unified tracing primitive. Works as a decorator (sync, async, generator, async generator) and as a context manager.
### Usage forms

```python
# Bare decorator - span name = function qualname
@observe
def my_function(): ...

# Parameterized - set operation type, name, span kind
@observe(name="weather_agent", op=Op.INVOKE_AGENT)
def run_agent(query: str) -> str: ...

# Context manager - for inline tracing blocks
with observe("llm_call", op=Op.CHAT) as span:
    response = llm.chat(messages)
```

### Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `name` | `str` | function `__qualname__` | Span entity name. |
| `op` | `str` | | Sets `gen_ai.operation.name`. Span name becomes `"{op} {name}"`. |
| `kind` | `SpanKind` | `INTERNAL` | OTel `SpanKind`. |
| `name_from` | `str` | | Function parameter whose runtime value becomes the span name. |
### Op constants

| Constant | Value | Use for |
|---|---|---|
| `Op.INVOKE_AGENT` | `"invoke_agent"` | Agent invocations and orchestration |
| `Op.EXECUTE_TOOL` | `"execute_tool"` | Tool/function calls |
| `Op.CHAT` | `"chat"` | LLM chat completions |
| `Op.CREATE_AGENT` | `"create_agent"` | Agent initialization |
| `Op.RETRIEVAL` | `"retrieval"` | RAG retrieval |
| `Op.EMBEDDINGS` | `"embeddings"` | Embedding generation |
| `Op.GENERATE_CONTENT` | `"generate_content"` | Content generation |
| `Op.TEXT_COMPLETION` | `"text_completion"` | Text completions |

Any custom string also works for `op`.
### Automatic behavior

In decorator mode, observe() automatically:

- Captures input as `gen_ai.input.messages` (or `gen_ai.tool.call.arguments` for tools). Skips `self`/`cls`.
- Captures output as `gen_ai.output.messages` (or `gen_ai.tool.call.result` for tools). Won't overwrite the attribute if it is already set.
- Records errors as span status `ERROR` with an exception event.
- Sets entity attributes: `gen_ai.agent.name` for agents, `gen_ai.tool.name` + `gen_ai.tool.type="function"` for tools.

All captured values are truncated at 10,000 characters.
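The truncation rule can be sketched as a pure-Python helper. This is an illustrative sketch of the stated behavior (non-strings stringified, then cut at 10,000 characters), not the SDK's actual code:

```python
MAX_ATTR_LEN = 10_000  # the documented truncation limit

def truncate_attr(value: object, limit: int = MAX_ATTR_LEN) -> str:
    """Stringify a captured value and cap it at the attribute limit."""
    text = value if isinstance(value, str) else repr(value)
    return text[:limit]

len(truncate_attr("x" * 20_000))  # 10000
```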
### Span hierarchy

```mermaid
flowchart TD
    A["@observe op=INVOKE_AGENT"] --> B["@observe op=CHAT - LLM call"]
    A --> C["@observe op=EXECUTE_TOOL - tool call"]
    C --> D["LLM call (auto-instrumented)"]
```
### Dispatcher pattern

When the tool name is only known at call time:

```python
@observe(op=Op.EXECUTE_TOOL, name_from="tool_name")
def execute_tool(self, tool_name: str, arguments: dict) -> dict:
    return self._tools[tool_name](**arguments)

# Produces: "execute_tool web_search", "execute_tool calculator", etc.
```

### Async support

```python
@observe(op=Op.EXECUTE_TOOL)
async def async_search(query: str) -> list:
    return await search_api.query(query)
```

## enrich()

Adds GenAI semantic convention attributes to the currently active span. Call it inside @observe-decorated functions or `with observe(...)` blocks.
```python
import openai

@observe(op=Op.CHAT, name="llm_call")
def call_llm(messages: list) -> str:
    response = openai.chat.completions.create(model="gpt-4o", messages=messages)
    enrich(
        model="gpt-4o",
        provider="openai",
        input_tokens=response.usage.prompt_tokens,
        output_tokens=response.usage.completion_tokens,
        finish_reason=response.choices[0].finish_reason,
    )
    return response.choices[0].message.content
```

### Parameter-to-attribute mapping

| Parameter | OTel Attribute |
|---|---|
| `model` | `gen_ai.request.model` |
| `provider` | `gen_ai.provider.name` |
| `input_tokens` | `gen_ai.usage.input_tokens` |
| `output_tokens` | `gen_ai.usage.output_tokens` |
| `total_tokens` | `gen_ai.usage.total_tokens` |
| `response_id` | `gen_ai.response.id` |
| `finish_reason` | `gen_ai.response.finish_reasons` |
| `temperature` | `gen_ai.request.temperature` |
| `max_tokens` | `gen_ai.request.max_tokens` |
| `session_id` | `gen_ai.conversation.id` |
| `agent_id` | `gen_ai.agent.id` |
| `agent_description` | `gen_ai.agent.description` |
| `tool_definitions` | `gen_ai.tool.definitions` |
| `system_instructions` | `gen_ai.system_instructions` |
| `input_messages` | `gen_ai.input.messages` |
| `output_messages` | `gen_ai.output.messages` |
| `**extra` | key used as-is |
All parameters are optional. Only provided values are set.
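The mapping above amounts to a keyword-to-attribute dictionary. Here is an illustrative sketch (a subset of the table; `PARAM_TO_ATTR` and `to_attributes` are hypothetical names, not SDK internals):

```python
# Subset of the parameter-to-attribute table above
PARAM_TO_ATTR = {
    "model": "gen_ai.request.model",
    "provider": "gen_ai.provider.name",
    "input_tokens": "gen_ai.usage.input_tokens",
    "output_tokens": "gen_ai.usage.output_tokens",
    "session_id": "gen_ai.conversation.id",
}

def to_attributes(**params) -> dict:
    """Translate enrich()-style kwargs into span attributes.
    Only provided (non-None) values are set; unknown keys pass through as-is."""
    return {PARAM_TO_ATTR.get(k, k): v for k, v in params.items() if v is not None}

to_attributes(model="gpt-4o", input_tokens=42)
# {"gen_ai.request.model": "gpt-4o", "gen_ai.usage.input_tokens": 42}
```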
## Auto-instrumentation

register() discovers and activates installed instrumentor packages. Install the extra for your provider - no code changes needed.
| Provider / framework | Extra |
|---|---|
| OpenAI, OpenAI Agents | [openai] |
| Anthropic | [anthropic] |
| Amazon Bedrock | [bedrock] |
| Google Generative AI + Vertex AI | [google] |
| LangChain | [langchain] |
| LlamaIndex | [llamaindex] |
| All providers + frameworks | [otel-instrumentors] |
[otel-instrumentors] includes all of the above plus Cohere, Mistral, Groq, Ollama, Together, Replicate, Writer, Voyage AI, Aleph Alpha, SageMaker, watsonx, Haystack, CrewAI, Agno, MCP, Transformers, ChromaDB, Pinecone, Qdrant, Weaviate, Milvus, LanceDB, and Marqo.
```python
register(auto_instrument=False)  # disable auto-instrumentation
```

## score()

Attaches an evaluation score to a trace or span. Scores are emitted as OTEL spans through the same OTLP pipeline - no separate client or index needed.
```python
from opensearch_genai_observability_sdk_py import score

# Score an entire trace
score(
    name="relevance",
    value=0.92,
    trace_id="abc123...",
    explanation="Response addresses the user's query",
)

# Score a specific span
score(
    name="accuracy",
    value=0.95,
    trace_id="abc123...",
    span_id="def456...",
    label="pass",
)
```

| Parameter | Type | Description |
|---|---|---|
| `name` | `str` | Metric name, e.g. `"relevance"`, `"factuality"`. |
| `value` | `float` | Numeric score. |
| `trace_id` | `str` | Hex trace ID to score. Omit for standalone scores. |
| `span_id` | `str` | Hex span ID for span-level scoring. |
| `label` | `str` | Human-readable label, e.g. `"pass"`. |
| `explanation` | `str` | Evaluator rationale (truncated to 500 chars). |
| `response_id` | `str` | LLM completion ID for correlation. |
| `attributes` | `dict` | Additional span attributes. |
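score() takes hex IDs, while OTel span contexts expose trace and span IDs as integers. The zero-padded hex formatting (the same convention used by OpenTelemetry's own `format_trace_id`/`format_span_id` helpers) can be sketched as:

```python
def format_trace_id(trace_id: int) -> str:
    """Format an integer trace ID as the 32-char lowercase hex string."""
    return format(trace_id, "032x")

def format_span_id(span_id: int) -> str:
    """Format an integer span ID as the 16-char lowercase hex string."""
    return format(span_id, "016x")
```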
For running evaluations at scale (`evaluate()`, `Benchmark`, `OpenSearchTraceRetriever`), see Evaluation & Scoring.
## AWS authentication

For AWS-hosted endpoints, use `AWSSigV4OTLPExporter` to sign requests with SigV4:

```python
from opensearch_genai_observability_sdk_py import AWSSigV4OTLPExporter, register

exporter = AWSSigV4OTLPExporter(
    endpoint="https://pipeline.us-east-1.osis.amazonaws.com/v1/traces",
    service="osis",      # "osis" for OSIS, "es" for OpenSearch Service
    region="us-east-1",  # auto-detected if omitted
)
register(service_name="my-agent", exporter=exporter)
```

Credential resolution order: `AWS_ACCESS_KEY_ID`/`AWS_SECRET_ACCESS_KEY` -> `~/.aws/credentials` -> IAM role/IMDS.
## Environment variables

| Variable | Description | Default |
|---|---|---|
| `OTEL_EXPORTER_OTLP_TRACES_ENDPOINT` | Full OTLP traces endpoint URL (used as-is) | |
| `OTEL_EXPORTER_OTLP_ENDPOINT` | Base OTLP endpoint URL (`/v1/traces` appended) | Data Prepper default |
| `OTEL_EXPORTER_OTLP_TRACES_PROTOCOL` | Protocol for traces (`http/protobuf`, `grpc`) | |
| `OTEL_EXPORTER_OTLP_PROTOCOL` | Protocol for all signals (`http/protobuf`, `grpc`) | |
| `OTEL_SERVICE_NAME` | Service name | `"default"` |
| `OTEL_SERVICE_VERSION` | Service version | |
| `OPENSEARCH_PROJECT` | Project name (fallback for `service_name`) | `"default"` |
## Related links

- AI Observability - Getting Started - end-to-end walkthrough
- Evaluation & Scoring - score traces, run experiments
- Trace Retrieval - query stored traces from OpenSearch
- Agent Traces - viewing traces in OpenSearch Dashboards
- GenAI semantic conventions - OTel spec reference