Python SDK

opensearch-genai-sdk-py instruments Python LLM applications using standard OpenTelemetry. It configures the OTEL pipeline in one call, provides decorators for tracing your application logic, and emits evaluation scores through the same OTLP exporter.

pip install opensearch-genai-sdk-py

The core package includes the OTEL SDK and OTLP exporters. Auto-instrumentation for LLM providers is opt-in:

pip install "opensearch-genai-sdk-py[openai]"
pip install "opensearch-genai-sdk-py[anthropic]"
pip install "opensearch-genai-sdk-py[bedrock]"
pip install "opensearch-genai-sdk-py[langchain]"
pip install "opensearch-genai-sdk-py[instrumentors]" # all providers at once
pip install "opensearch-genai-sdk-py[aws]" # SigV4 signing for AWS endpoints
pip install "opensearch-genai-sdk-py[all]" # everything

Available provider extras: openai, anthropic, cohere, mistral, groq, ollama, google, bedrock, langchain, llamaindex.

from opensearch_genai_sdk_py import register, workflow, agent, tool, score

register(endpoint="http://localhost:4318/v1/traces", service_name="my-app")

@tool(name="get_weather")
def get_weather(city: str) -> dict:
    """Fetch current weather for a city."""
    return {"city": city, "temp": 22, "condition": "sunny"}

@agent(name="weather_assistant")
def assistant(query: str) -> str:
    data = get_weather("Paris")
    return f"{data['condition']}, {data['temp']}C"

@workflow(name="weather_pipeline")
def run(query: str) -> str:
    return assistant(query)

result = run("What's the weather?")
score(name="relevance", value=0.95, trace_id="...", source="llm-judge")

register() configures the OTEL tracing pipeline. Call it once at startup, before any tracing occurs.

from opensearch_genai_sdk_py import register
register(
    endpoint="http://localhost:4318/v1/traces",
    service_name="my-app",
)

| Parameter | Type | Default | Description |
|---|---|---|---|
| endpoint | str | http://localhost:21890/opentelemetry/v1/traces | OTLP endpoint URL. Reads OPENSEARCH_OTEL_ENDPOINT if not set. |
| protocol | "http" \| "grpc" | inferred from URL | Force transport. Inferred from scheme if omitted: grpc:// → gRPC, grpcs:// → gRPC+TLS, else HTTP. |
| service_name | str | "default" | Attached to all spans as service.name. Reads OTEL_SERVICE_NAME. |
| project_name | str | — | Alias for service_name. |
| auth | str | "auto" | "auto" detects AWS endpoints and enables SigV4. "sigv4" always signs. "none" never signs. |
| region | str | auto | AWS region for SigV4. Auto-detected from botocore if not provided. |
| service | str | "osis" | AWS service name for signing. "osis" for OpenSearch Ingestion, "es" for OpenSearch Service. |
| batch | bool | True | True uses BatchSpanProcessor (production). False uses SimpleSpanProcessor (debugging). |
| auto_instrument | bool | True | Discovers and activates installed OTel instrumentor packages. |
| exporter | SpanExporter | — | Custom exporter. Overrides endpoint, auth, and protocol. |
| set_global | bool | True | Register as the global TracerProvider. |
| headers | dict | — | Additional HTTP headers for the exporter. |

register() returns the configured TracerProvider.

| URL scheme | Transport |
|---|---|
| http:// or https:// | OTLP HTTP (default) |
| grpc:// | OTLP gRPC, insecure |
| grpcs:// | OTLP gRPC with TLS |
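The scheme-to-transport mapping above can be sketched in plain Python. `infer_protocol` is a hypothetical helper for illustration, not part of the SDK:

```python
from urllib.parse import urlparse

def infer_protocol(endpoint: str) -> tuple[str, bool]:
    """Return (transport, use_tls) inferred from the endpoint scheme.

    Mirrors the table above: grpc:// -> insecure gRPC, grpcs:// -> gRPC+TLS,
    anything else (http/https) -> OTLP HTTP.
    """
    scheme = urlparse(endpoint).scheme
    if scheme == "grpc":
        return ("grpc", False)
    if scheme == "grpcs":
        return ("grpc", True)
    return ("http", scheme == "https")

print(infer_protocol("grpc://localhost:4317"))   # ('grpc', False)
print(infer_protocol("https://collector:4318"))  # ('http', True)
```

Passing protocol explicitly to register() overrides this inference.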

Self-hosted OpenSearch with a local collector:

register(service_name="my-app")
# uses http://localhost:21890/opentelemetry/v1/traces by default

AWS OpenSearch Ingestion with SigV4:

register(
    endpoint="https://pipeline.us-east-1.osis.amazonaws.com/v1/traces",
    service_name="my-app",
    auth="sigv4",
    region="us-east-1",
)

gRPC:

register(endpoint="grpc://localhost:4317", service_name="my-app")

Custom exporter:

from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

register(
    service_name="my-app",
    exporter=OTLPSpanExporter(endpoint="http://localhost:4318/v1/traces"),
)

Four decorators trace application logic as OTEL spans with GenAI semantic convention attributes. All four support sync functions, async functions, generators, and async generators. Errors are recorded as span status ERROR with an exception event.

A typical trace looks like this:

flowchart TD
    A["@workflow  — invoke_agent  — SpanKind.INTERNAL"] --> B["@agent  — invoke_agent  — SpanKind.CLIENT"]
    B --> C["@tool  — execute_tool  — SpanKind.INTERNAL"]
    B --> D["LLM call  (auto-instrumented)"]

@agent defaults to SpanKind.CLIENT because it typically represents a call out to an external LLM or service.

All four decorators accept the same parameters:

| Parameter | Type | Description |
|---|---|---|
| name | str | Span name and entity name. Defaults to the function's __qualname__. |
| version | int | Stored as gen_ai.agent.version. |
| kind | SpanKind | Override the OTel SpanKind. |
| name_from | str | Name of a function parameter whose runtime value becomes the entity name. Useful for dispatcher patterns. |

@workflow marks top-level orchestration. It creates a span with gen_ai.operation.name = "invoke_agent" and SpanKind.INTERNAL.

from opensearch_genai_sdk_py import workflow

@workflow(name="qa_pipeline")
def run_pipeline(query: str) -> str:
    plan = plan_steps(query)
    return execute(plan)

Span attributes set automatically: gen_ai.operation.name, gen_ai.agent.name, gen_ai.input.messages, gen_ai.output.messages.

@task marks a discrete unit of work within a workflow. Same attributes and defaults as @workflow.

from opensearch_genai_sdk_py import task

@task(name="summarize")
def summarize_text(text: str) -> str:
    return llm.generate(f"Summarize: {text}")

@agent marks autonomous decision-making logic. Defaults to SpanKind.CLIENT. Span name is prefixed: invoke_agent <name>.

from opensearch_genai_sdk_py import agent

@agent(name="research_agent", version=2)
async def research(query: str) -> str:
    while not done:
        action = decide_action(query)
        result = await execute_action(action)
    return result

@tool marks a function invoked by an agent. Span name is prefixed: execute_tool <name>.

from opensearch_genai_sdk_py import tool

@tool(name="web_search")
def search(query: str) -> list[dict]:
    """Search the web for documents."""
    return search_api.query(query)

Additional attributes set on tool spans: gen_ai.tool.name, gen_ai.tool.type ("function"), gen_ai.tool.description (first line of docstring), gen_ai.tool.call.arguments, gen_ai.tool.call.result.

Dispatcher pattern — when the tool name is only known at call time, use name_from to resolve it from a runtime argument:

@tool(name_from="tool_name")
def execute_tool(self, tool_name: str, arguments: dict) -> dict:
    """Routes calls to the appropriate tool implementation."""
    return self._tools[tool_name](**arguments)

Each call produces a span named execute_tool <actual_tool_name>.
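How name_from could resolve a parameter value at call time can be sketched with inspect.signature. This is an illustrative mimic (the helper name and the self-free dispatcher are assumptions for the sketch), not the decorator's actual code:

```python
import inspect

def resolve_entity_name(fn, name_from: str, *args, **kwargs) -> str:
    """Bind the call's arguments to fn's signature and return the runtime
    value of the parameter named by name_from -- works for positional and
    keyword arguments alike."""
    bound = inspect.signature(fn).bind(*args, **kwargs)
    bound.apply_defaults()
    return str(bound.arguments[name_from])

def execute_tool(tool_name: str, arguments: dict) -> dict:
    return {}

print(resolve_entity_name(execute_tool, "tool_name", "web_search", {"q": "otel"}))
# web_search
```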

If you set gen_ai.output.messages (or gen_ai.tool.call.result for tools) inside the function body, the decorator will not overwrite it:

from opentelemetry import trace
import json

@agent(name="my_agent")
def my_agent(query: str) -> str:
    result = do_work(query)
    span = trace.get_current_span()
    span.set_attribute(
        "gen_ai.output.messages",
        json.dumps([{"role": "assistant", "content": result}]),
    )
    return result

score() submits an evaluation score as an OTEL span. Scores flow through the same OTLP pipeline as traces and land in the same OpenSearch index.

from opensearch_genai_sdk_py import score

Score a specific span — a single LLM call or tool execution:

score(
    name="accuracy",
    value=0.95,
    trace_id="abc123",
    span_id="def456",
    explanation="Answer matches ground truth",
    source="heuristic",
)

Score an entire workflow run:

score(
    name="relevance",
    value=0.92,
    trace_id="abc123",
    explanation="Response addresses the user's query",
    source="llm-judge",
)

Score across multiple traces in a conversation:

score(
    name="user_satisfaction",
    value=0.88,
    conversation_id="session-123",
    label="satisfied",
    source="human",
)

| Parameter | Type | Description |
|---|---|---|
| name | str | Metric name, e.g. "relevance", "factuality". |
| value | float | Numeric score. |
| trace_id | str | Trace being scored. Stored as gen_ai.evaluation.trace_id, not the span's own trace ID. |
| span_id | str | Span being scored (span-level). |
| conversation_id | str | Session ID (session-level). |
| label | str | Human-readable label, e.g. "pass", "relevant". |
| explanation | str | Evaluator rationale. Truncated to 500 characters. |
| response_id | str | LLM completion ID for correlation. |
| source | str | Who created the score: "sdk", "human", "llm-judge", "heuristic". |
| metadata | dict | Arbitrary key-value metadata, stored as gen_ai.evaluation.metadata.<key>. |

Scores are emitted as gen_ai.evaluation.result spans with gen_ai.evaluation.* attributes.
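The attribute mapping can be sketched as a pure function from score() parameters to span attributes. The attribute names follow the table above, but `evaluation_attributes` is an illustrative helper and the SDK's exact serialization may differ:

```python
def evaluation_attributes(name, value, *, trace_id=None, span_id=None,
                          conversation_id=None, label=None, explanation=None,
                          source=None, metadata=None) -> dict:
    """Build the gen_ai.evaluation.* attribute dict a score span might carry.

    Optional fields are omitted when unset; explanations are truncated to
    500 characters; metadata keys are flattened under
    gen_ai.evaluation.metadata.<key>.
    """
    attrs = {"gen_ai.evaluation.name": name, "gen_ai.evaluation.value": value}
    optional = {
        "gen_ai.evaluation.trace_id": trace_id,
        "gen_ai.evaluation.span_id": span_id,
        "gen_ai.evaluation.conversation_id": conversation_id,
        "gen_ai.evaluation.label": label,
        "gen_ai.evaluation.source": source,
    }
    attrs.update({k: v for k, v in optional.items() if v is not None})
    if explanation is not None:
        attrs["gen_ai.evaluation.explanation"] = explanation[:500]
    for key, val in (metadata or {}).items():
        attrs[f"gen_ai.evaluation.metadata.{key}"] = val
    return attrs
```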

Read the trace ID from the active span context:

from opentelemetry import trace

@workflow(name="my_pipeline")
def run(query: str) -> str:
    ctx = trace.get_current_span().get_span_context()
    trace_id = format(ctx.trace_id, "032x")
    result = do_work(query)
    return result
# After run() returns, score using the captured trace_id

For AWS-hosted endpoints (OpenSearch Ingestion or OpenSearch Service), requests must be signed with AWS SigV4.

register() handles this automatically when auth="sigv4" or when auth="auto" detects an *.amazonaws.com endpoint. Requires the [aws] extra:

pip install "opensearch-genai-sdk-py[aws]"

register(
    endpoint="https://pipeline.us-east-1.osis.amazonaws.com/v1/traces",
    service_name="my-app",
    auth="sigv4",
    region="us-east-1",  # auto-detected from botocore if not set
    service="osis",  # "osis" for OSIS pipelines, "es" for OpenSearch Service
)

Credentials are resolved via the standard botocore chain: AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY env vars → ~/.aws/credentials → IAM role / IMDS.

AWSSigV4OTLPExporter is the exporter used internally by register() when SigV4 is enabled. Use it directly when you need more control:

from opensearch_genai_sdk_py import AWSSigV4OTLPExporter, register

exporter = AWSSigV4OTLPExporter(
    endpoint="https://pipeline.us-east-1.osis.amazonaws.com/v1/traces",
    service="osis",
    region="us-east-1",
)
register(service_name="my-app", exporter=exporter)

register() discovers and activates installed instrumentor packages via OTEL entry points. Install the extra for your LLM provider and its calls are traced automatically — no code changes needed.

| Provider / framework | Extra |
|---|---|
| OpenAI, OpenAI Agents | [openai] |
| Anthropic | [anthropic] |
| Amazon Bedrock | [bedrock] |
| LangChain | [langchain] |
| LlamaIndex | [llamaindex] |
| Cohere | [cohere] |
| Mistral | [mistral] |
| Groq | [groq] |
| Ollama | [ollama] |
| Google Generative AI + Vertex AI | [google] |
| All of the above + more | [instrumentors] |

The [instrumentors] bundle also includes Together, Replicate, Writer, Voyage AI, SageMaker, watsonx, Haystack, CrewAI, Agno, MCP, Transformers, ChromaDB, Pinecone, Qdrant, Weaviate, Milvus, LanceDB, Marqo.

To disable auto-instrumentation:

register(auto_instrument=False)

| Variable | Description | Default |
|---|---|---|
| OPENSEARCH_OTEL_ENDPOINT | OTLP endpoint URL | http://localhost:21890/opentelemetry/v1/traces |
| OTEL_SERVICE_NAME | Service name for all spans | "default" |
| OPENSEARCH_PROJECT | Project name (falls back to OTEL_SERVICE_NAME) | "default" |
| AWS_DEFAULT_REGION | AWS region for SigV4 | auto-detected by botocore |
| AWS_ACCESS_KEY_ID | AWS access key | botocore credential chain |
| AWS_SECRET_ACCESS_KEY | AWS secret key | botocore credential chain |