
Python SDK

opensearch-genai-observability-sdk-py instruments Python AI agent applications using standard OpenTelemetry. It configures the OTEL pipeline in one call, provides a unified observe() primitive for tracing agents and tools, and enriches spans with GenAI semantic convention attributes automatically.

```shell
pip install opensearch-genai-observability-sdk-py
```

Auto-instrumentation for LLM providers is opt-in:

```shell
pip install "opensearch-genai-observability-sdk-py[openai]"
pip install "opensearch-genai-observability-sdk-py[anthropic]"
pip install "opensearch-genai-observability-sdk-py[bedrock]"
pip install "opensearch-genai-observability-sdk-py[langchain]"
pip install "opensearch-genai-observability-sdk-py[llamaindex]"
pip install "opensearch-genai-observability-sdk-py[otel-instrumentors]"  # all providers
pip install "opensearch-genai-observability-sdk-py[opensearch]"          # trace retrieval
pip install "opensearch-genai-observability-sdk-py[all]"                 # everything
```

The SDK exports these functions and classes. This page covers the instrumentation APIs. Evaluation APIs are documented in Evaluation & Scoring.

| Export | Purpose | Docs |
| --- | --- | --- |
| register() | Configure OTEL pipeline | This page |
| observe() | Trace agents, tools, LLM calls | This page |
| Op | Operation name constants | This page |
| enrich() | Set GenAI attributes on active span | This page |
| score() | Attach evaluation scores to traces | This page |
| AWSSigV4OTLPExporter | SigV4-signed OTLP exporter | This page |
| evaluate() | Run agent against dataset with scorers | Evaluation & Scoring |
| Experiment | Upload pre-computed eval results | Evaluation & Scoring |
| EvalScore | Scorer return type | Evaluation & Scoring |
| OpenSearchTraceRetriever | Query stored traces from OpenSearch | Evaluation & Scoring |
A minimal end-to-end example:

```python
from opensearch_genai_observability_sdk_py import register, observe, Op, enrich

register(endpoint="http://localhost:4318/v1/traces", service_name="my-agent")

@observe(op=Op.EXECUTE_TOOL)
def get_weather(city: str) -> dict:
    return {"city": city, "temp": 22, "condition": "sunny"}

@observe(op=Op.INVOKE_AGENT)
def assistant(query: str) -> str:
    enrich(model="gpt-4o", provider="openai")
    data = get_weather("Paris")
    return f"{data['condition']}, {data['temp']}C"

result = assistant("What's the weather?")
```

register()

Configures the OTEL tracing pipeline. Call it once at startup, before any tracing occurs.

```python
from opensearch_genai_observability_sdk_py import register

register(
    endpoint="http://localhost:4318/v1/traces",
    service_name="my-app",
)
```
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| endpoint | str | Data Prepper default | OTLP endpoint URL. Reads OTEL_EXPORTER_OTLP_TRACES_ENDPOINT or OTEL_EXPORTER_OTLP_ENDPOINT if not set. |
| protocol | "http" \| "grpc" | inferred from URL | Force transport. grpc:// -> gRPC, grpcs:// -> gRPC+TLS, else HTTP. |
| service_name | str | "default" | Attached as service.name. Reads OTEL_SERVICE_NAME. |
| project_name | str | | Alias for service_name. Reads OPENSEARCH_PROJECT. |
| service_version | str | | Sets service.version. Reads OTEL_SERVICE_VERSION. |
| batch | bool | True | True = BatchSpanProcessor, False = SimpleSpanProcessor. |
| auto_instrument | bool | True | Discover and activate installed OTel instrumentor packages. |
| exporter | SpanExporter | | Custom exporter. Overrides endpoint, protocol, and headers. |
| set_global | bool | True | Register as the global TracerProvider. |
| headers | dict | | Additional HTTP headers for the OTLP exporter. |

Returns the configured TracerProvider.

| URL scheme | Transport |
| --- | --- |
| http:// or https:// | OTLP HTTP (default) |
| grpc:// | OTLP gRPC, insecure |
| grpcs:// | OTLP gRPC with TLS |
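The scheme-to-transport mapping above can be sketched in a few lines; `infer_transport` is a hypothetical helper illustrating the documented rule, not the SDK's internals:

```python
from urllib.parse import urlparse

def infer_transport(endpoint: str) -> tuple[str, bool]:
    """Map a URL scheme to (transport, tls) per the table above."""
    scheme = urlparse(endpoint).scheme
    if scheme == "grpc":
        return ("grpc", False)   # OTLP gRPC, insecure
    if scheme == "grpcs":
        return ("grpc", True)    # OTLP gRPC with TLS
    # http:// and https:// (and anything else) fall back to OTLP HTTP
    return ("http", scheme == "https")

print(infer_transport("grpc://localhost:4317"))  # ('grpc', False)
```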
```python
# Self-hosted with Data Prepper
register(service_name="my-agent")

# OTel Collector on localhost
register(endpoint="http://localhost:4318/v1/traces", service_name="my-agent")

# gRPC
register(endpoint="grpc://localhost:4317", service_name="my-agent")

# AWS OpenSearch Ingestion with SigV4
from opensearch_genai_observability_sdk_py import AWSSigV4OTLPExporter

exporter = AWSSigV4OTLPExporter(
    endpoint="https://pipeline.us-east-1.osis.amazonaws.com/v1/traces",
    service="osis", region="us-east-1",
)
register(service_name="my-agent", exporter=exporter)
```

observe()

The unified tracing primitive. It works as a decorator (on sync, async, generator, and async generator functions) and as a context manager.

```python
# Bare decorator - span name = function qualname
@observe
def my_function():
    ...

# Parameterized - set operation type, name, span kind
@observe(name="weather_agent", op=Op.INVOKE_AGENT)
def run_agent(query: str) -> str:
    ...

# Context manager - for inline tracing blocks
with observe("llm_call", op=Op.CHAT) as span:
    response = llm.chat(messages)
```
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| name | str | function `__qualname__` | Span entity name. |
| op | str | | Sets gen_ai.operation.name. The span name becomes "{op} {name}". |
| kind | SpanKind | INTERNAL | OTel SpanKind. |
| name_from | str | | Function parameter whose runtime value becomes the span name. |
Op constants

| Constant | Value | Use for |
| --- | --- | --- |
| Op.INVOKE_AGENT | "invoke_agent" | Agent invocations and orchestration |
| Op.EXECUTE_TOOL | "execute_tool" | Tool/function calls |
| Op.CHAT | "chat" | LLM chat completions |
| Op.CREATE_AGENT | "create_agent" | Agent initialization |
| Op.RETRIEVAL | "retrieval" | RAG retrieval |
| Op.EMBEDDINGS | "embeddings" | Embedding generation |
| Op.GENERATE_CONTENT | "generate_content" | Content generation |
| Op.TEXT_COMPLETION | "text_completion" | Text completions |

Any custom string also works for op.

In decorator mode, observe() automatically:

  • Captures input as gen_ai.input.messages (or gen_ai.tool.call.arguments for tools). Skips self/cls.
  • Captures output as gen_ai.output.messages (or gen_ai.tool.call.result for tools). Won’t overwrite if already set.
  • Records errors as span status ERROR with an exception event.
  • Sets entity attributes - gen_ai.agent.name for agents, gen_ai.tool.name + gen_ai.tool.type="function" for tools.

All values truncated at 10,000 characters.
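The capture-and-truncate behavior can be sketched as follows; `capture_input` is a hypothetical helper illustrating the documented rules (skip self/cls, serialize, cap at 10,000 characters), not the SDK's actual implementation:

```python
import inspect
import json

MAX_ATTR_LEN = 10_000  # truncation limit stated above

def capture_input(fn, args, kwargs) -> str:
    """Bind the call arguments against the signature, drop self/cls,
    serialize to JSON, and truncate to the attribute limit."""
    bound = inspect.signature(fn).bind(*args, **kwargs)
    payload = {k: v for k, v in bound.arguments.items() if k not in ("self", "cls")}
    return json.dumps(payload, default=repr)[:MAX_ATTR_LEN]

def get_weather(self, city: str, units: str = "C") -> dict:
    ...

print(capture_input(get_weather, (object(), "Paris"), {}))  # {"city": "Paris"}
```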

```mermaid
flowchart TD
    A["@observe op=INVOKE_AGENT"] --> B["@observe op=CHAT - LLM call"]
    A --> C["@observe op=EXECUTE_TOOL - tool call"]
    C --> D["LLM call (auto-instrumented)"]
```

When the tool name is only known at call time:

```python
@observe(op=Op.EXECUTE_TOOL, name_from="tool_name")
def execute_tool(self, tool_name: str, arguments: dict) -> dict:
    return self._tools[tool_name](**arguments)

# Produces spans named "execute_tool web_search", "execute_tool calculator", etc.
```

Async functions work with the same decorator:

```python
@observe(op=Op.EXECUTE_TOOL)
async def async_search(query: str) -> list:
    return await search_api.query(query)
```
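The name_from mechanic can be reproduced by binding the call arguments against the function signature. The sketch below is illustrative only; `observe_sketch` and `SPAN_NAMES` are hypothetical stand-ins for the SDK's decorator and span creation:

```python
import functools
import inspect

SPAN_NAMES = []  # stand-in for real span creation

def observe_sketch(op=None, name=None, name_from=None):
    """Resolve the span name from a runtime argument when name_from is set."""
    def wrap(fn):
        sig = inspect.signature(fn)
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            entity = name or fn.__qualname__
            if name_from is not None:
                bound = sig.bind(*args, **kwargs)
                entity = str(bound.arguments[name_from])
            SPAN_NAMES.append(f"{op} {entity}" if op else entity)
            return fn(*args, **kwargs)
        return inner
    return wrap

@observe_sketch(op="execute_tool", name_from="tool_name")
def run_tool(tool_name: str, arguments: dict) -> dict:
    return arguments

run_tool("web_search", {"q": "Paris weather"})
print(SPAN_NAMES)  # ['execute_tool web_search']
```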

enrich()

Adds GenAI semantic convention attributes to the currently active span. Call it inside @observe-decorated functions or inside with observe(...) blocks.

```python
@observe(op=Op.CHAT, name="llm_call")
def call_llm(messages: list) -> str:
    response = openai.chat.completions.create(model="gpt-4o", messages=messages)
    enrich(
        model="gpt-4o",
        provider="openai",
        input_tokens=response.usage.prompt_tokens,
        output_tokens=response.usage.completion_tokens,
        finish_reason=response.choices[0].finish_reason,
    )
    return response.choices[0].message.content
```
| Parameter | OTel Attribute |
| --- | --- |
| model | gen_ai.request.model |
| provider | gen_ai.provider.name |
| input_tokens | gen_ai.usage.input_tokens |
| output_tokens | gen_ai.usage.output_tokens |
| total_tokens | gen_ai.usage.total_tokens |
| response_id | gen_ai.response.id |
| finish_reason | gen_ai.response.finish_reasons |
| temperature | gen_ai.request.temperature |
| max_tokens | gen_ai.request.max_tokens |
| session_id | gen_ai.conversation.id |
| agent_id | gen_ai.agent.id |
| agent_description | gen_ai.agent.description |
| tool_definitions | gen_ai.tool.definitions |
| system_instructions | gen_ai.system_instructions |
| input_messages | gen_ai.input.messages |
| output_messages | gen_ai.output.messages |
| `**extra` | key used as-is |

All parameters are optional. Only provided values are set.
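The kwarg-to-attribute translation can be illustrated with a small sketch; `ATTR_MAP` and `map_enrich_kwargs` are hypothetical names showing the behavior described above (known kwargs map to GenAI attributes, **extra keys pass through unchanged, unset values are skipped):

```python
# Abbreviated version of the mapping table above
ATTR_MAP = {
    "model": "gen_ai.request.model",
    "provider": "gen_ai.provider.name",
    "input_tokens": "gen_ai.usage.input_tokens",
    "output_tokens": "gen_ai.usage.output_tokens",
}

def map_enrich_kwargs(**kwargs) -> dict:
    """Translate known kwargs to attribute names; pass extras through;
    drop values that were not provided."""
    return {ATTR_MAP.get(k, k): v for k, v in kwargs.items() if v is not None}

print(map_enrich_kwargs(model="gpt-4o", input_tokens=42, custom_tag="beta"))
```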


Auto-instrumentation

register() discovers and activates installed instrumentor packages. Install the extra for your provider; no code changes are needed.

| Provider / framework | Extra |
| --- | --- |
| OpenAI, OpenAI Agents | [openai] |
| Anthropic | [anthropic] |
| Amazon Bedrock | [bedrock] |
| LangChain | [langchain] |
| LlamaIndex | [llamaindex] |
| Cohere | [cohere] |
| Mistral | [mistral] |
| Groq | [groq] |
| Ollama | [ollama] |
| Google Generative AI + Vertex AI | [google] |
| All of the above + 20 more | [otel-instrumentors] |

```python
register(auto_instrument=False)  # to disable
```

score()

Attaches an evaluation score to a trace or span. Scores are emitted as OTEL spans through the same OTLP pipeline, so no separate client or index is needed.

```python
from opensearch_genai_observability_sdk_py import score

# Score an entire trace
score(
    name="relevance", value=0.92, trace_id="abc123...",
    explanation="Response addresses the user's query",
)

# Score a specific span
score(
    name="accuracy", value=0.95, trace_id="abc123...", span_id="def456...",
    label="pass",
)
```
| Parameter | Type | Description |
| --- | --- | --- |
| name | str | Metric name, e.g. "relevance", "factuality". |
| value | float | Numeric score. |
| trace_id | str | Hex trace ID to score. Omit for standalone scores. |
| span_id | str | Hex span ID for span-level scoring. |
| label | str | Human-readable label, e.g. "pass". |
| explanation | str | Evaluator rationale (truncated to 500 chars). |
| response_id | str | LLM completion ID for correlation. |
| attributes | dict | Additional span attributes. |

For running evaluations at scale (evaluate(), Experiment, OpenSearchTraceRetriever), see Evaluation & Scoring.


AWSSigV4OTLPExporter

For AWS-hosted endpoints, use AWSSigV4OTLPExporter to sign requests with SigV4:

```python
from opensearch_genai_observability_sdk_py import AWSSigV4OTLPExporter, register

exporter = AWSSigV4OTLPExporter(
    endpoint="https://pipeline.us-east-1.osis.amazonaws.com/v1/traces",
    service="osis",      # "osis" for OSIS, "es" for OpenSearch Service
    region="us-east-1",  # auto-detected if omitted
)
register(service_name="my-agent", exporter=exporter)
```

Credentials are resolved in the standard AWS order: the AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY environment variables, then ~/.aws/credentials, then an IAM role via IMDS.


Environment Variables

| Variable | Description | Default |
| --- | --- | --- |
| OTEL_EXPORTER_OTLP_TRACES_ENDPOINT | OTLP traces endpoint | |
| OTEL_EXPORTER_OTLP_ENDPOINT | OTLP endpoint (appends /v1/traces) | Data Prepper default |
| OTEL_SERVICE_NAME | Service name | "default" |
| OPENSEARCH_PROJECT | Project name (fallback) | "default" |
| OTEL_SERVICE_VERSION | Service version | |
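The endpoint lookup order documented for register() can be sketched as follows; `resolve_endpoint` is a hypothetical helper, not SDK code:

```python
import os

def resolve_endpoint(explicit=None):
    """Explicit argument first, then OTEL_EXPORTER_OTLP_TRACES_ENDPOINT,
    then OTEL_EXPORTER_OTLP_ENDPOINT with /v1/traces appended."""
    if explicit:
        return explicit
    traces = os.environ.get("OTEL_EXPORTER_OTLP_TRACES_ENDPOINT")
    if traces:
        return traces
    base = os.environ.get("OTEL_EXPORTER_OTLP_ENDPOINT")
    if base:
        return base.rstrip("/") + "/v1/traces"
    return None  # caller falls back to the Data Prepper default

os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = "http://localhost:4318"
print(resolve_endpoint())  # http://localhost:4318/v1/traces
```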