OpenTelemetry

OpenTelemetry (OTel) is a CNCF project that provides a vendor-neutral standard for generating, collecting, and exporting telemetry data. The OpenSearch Observability Stack uses OpenTelemetry as its primary ingestion interface: all telemetry flows through OTel SDKs and the OTel Collector before reaching the storage backends.

This section covers OTel Collector configuration, auto- and manual instrumentation, and sampling strategies.

Telemetry Signals

OpenTelemetry defines three core signal types:

Traces

A trace represents a single request as it propagates through your system. Each trace is composed of spans — units of work with a start time, duration, status, and parent-child relationships.

Traces answer questions like:

Which services handled this request?
Where did latency occur?
What caused the error?
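
As a rough illustration of how span data answers these questions, the sketch below models a trace with plain Python dataclasses. The `Span` class, its field names, and the sample values are assumptions made for this example; they are not the OTel SDK types, and the self-time calculation assumes child spans run one after another.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """Simplified span; field names are illustrative, not the OTel SDK types."""
    span_id: str
    parent_id: Optional[str]  # None marks the root span
    service: str
    name: str
    start_ms: int
    duration_ms: int
    status: str = "OK"  # "OK" or "ERROR"


# Hypothetical trace: one checkout request crossing three services.
trace = [
    Span("a1", None, "frontend", "GET /checkout", 0, 120),
    Span("b2", "a1", "cart", "load_cart", 5, 30),
    Span("c3", "a1", "payment", "charge", 40, 75, status="ERROR"),
]


def services(spans):
    """Which services handled this request?"""
    return sorted({s.service for s in spans})


def self_time_ms(span, spans):
    """Time spent in a span excluding its direct children (assumes sequential children)."""
    children = sum(s.duration_ms for s in spans if s.parent_id == span.span_id)
    return span.duration_ms - children


def slowest(spans):
    """Where did latency occur? The span with the largest self time."""
    return max(spans, key=lambda s: self_time_ms(s, spans))


def errors(spans):
    """What caused the error? Spans that ended with an ERROR status."""
    return [s for s in spans if s.status == "ERROR"]
```

In this made-up trace, most of the time and the only error both sit in the `payment` service's `charge` span, which is the kind of conclusion a trace view is meant to make obvious.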

The stack writes traces to OpenSearch through Data Prepper, which stores them in the `otel-v1-apm-span-*` indices.

Metrics

A metric is a numerical measurement captured over time. OTel supports counters, histograms, gauges, and exponential histograms.

Metrics answer questions like:

What is the request rate for this service?
What is the p99 latency?
How much memory is the process using?
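
The first two questions can be sketched with a little arithmetic. The example below derives a request rate from two readings of a monotonic counter and estimates a quantile from an explicit-bucket histogram by linear interpolation inside the matching bucket. The function names, bucket bounds, and counts are assumptions for illustration, and the first bucket is assumed to start at 0; a metrics backend may compute quantiles differently.

```python
def rate_per_second(count_start, count_end, seconds):
    """Request rate from two readings of a monotonic counter."""
    return (count_end - count_start) / seconds


def quantile_from_buckets(q, bounds, counts):
    """Estimate a quantile from an explicit-bucket histogram.

    bounds: ascending upper bounds of the finite buckets.
    counts: per-bucket counts, with one extra overflow bucket at the end.
    Interpolates linearly inside the bucket that contains the target rank.
    """
    total = sum(counts)
    rank = q * total
    seen = 0
    for i, count in enumerate(counts):
        if count > 0 and seen + count >= rank:
            if i >= len(bounds):  # overflow bucket has no upper bound
                return float(bounds[-1])
            lower = bounds[i - 1] if i > 0 else 0.0
            upper = bounds[i]
            return lower + (upper - lower) * (rank - seen) / count
        seen += count
    return float(bounds[-1])


# Hypothetical latency histogram in milliseconds.
bounds = [50, 100, 250, 500]
counts = [900, 80, 15, 5, 0]  # last entry is the overflow bucket

requests_per_second = rate_per_second(1200, 1500, 60)
p99_ms = quantile_from_buckets(0.99, bounds, counts)
```

With these sample numbers, the 990th of 1000 observations falls in the 100 to 250 ms bucket, so the estimate lands inside that range rather than at an exact observed value.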

The stack routes metrics to Prometheus over OTLP/HTTP for efficient time-series storage and querying.

Logs

A log is a timestamped text or structured record emitted by an application. OTel logs support trace correlation, meaning each log record can carry a traceId and spanId to link it to the distributed trace that was active when the log was emitted.

The stack stores logs in OpenSearch via Data Prepper, enabling full-text search and trace-correlated log exploration.
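
To make trace correlation concrete, the sketch below uses only Python's standard `logging` module to attach a `traceId` and `spanId` to each record and render it as JSON. The filter and formatter classes are hypothetical helpers written for this example, and the IDs are random stand-ins; in a real application the OTel SDK would supply them from the active span context.

```python
import io
import json
import logging
import secrets


class TraceContextFilter(logging.Filter):
    """Attach the active trace and span IDs to every log record."""

    def __init__(self, trace_id, span_id):
        super().__init__()
        self.trace_id = trace_id
        self.span_id = span_id

    def filter(self, record):
        record.traceId = self.trace_id
        record.spanId = self.span_id
        return True  # never drop the record, only enrich it


class JsonFormatter(logging.Formatter):
    """Render records as structured JSON with correlation fields."""

    def format(self, record):
        return json.dumps({
            "severity": record.levelname,
            "body": record.getMessage(),
            "traceId": record.traceId,
            "spanId": record.spanId,
        })


# Stand-ins for the IDs of the currently active span (random, for illustration).
trace_id = secrets.token_hex(16)  # 16 bytes -> 32 hex characters
span_id = secrets.token_hex(8)    # 8 bytes -> 16 hex characters

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("checkout")
logger.addFilter(TraceContextFilter(trace_id, span_id))
logger.addHandler(handler)
logger.propagate = False

logger.error("payment declined")
entry = json.loads(stream.getvalue())
```

Because every record carries the same `traceId` as the spans of the request, a log search can pivot directly to the matching trace.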

OTLP Protocol

All three signals are transmitted using the OpenTelemetry Protocol (OTLP), which supports two transports:

Transport	Endpoint	Encoding	Best For
gRPC	`http://localhost:4317`	Protobuf	Backend services, high throughput
HTTP	`http://localhost:4318`	Protobuf or JSON	Browsers, serverless, restricted networks

OTLP is the only protocol you need. Unlike signal-specific formats such as Zipkin and Jaeger (traces) or StatsD (metrics), OTLP carries all three signals over one protocol and one endpoint, which reduces operational complexity.
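
For a feel of what travels over the HTTP transport, the sketch below builds a one-span OTLP/JSON trace request using only the Python standard library. It assumes a Collector listening on the HTTP endpoint from the table and the default `/v1/traces` path; the service name, span name, and scope name are made up. In practice an OTel SDK exporter produces and sends this payload for you.

```python
import json
import secrets
import time
import urllib.request

# HTTP endpoint from the table above plus the default OTLP traces path.
OTLP_TRACES_URL = "http://localhost:4318/v1/traces"


def build_trace_payload(service_name, span_name, duration_ms):
    """Build a one-span OTLP/JSON export request."""
    end_ns = time.time_ns()
    start_ns = end_ns - duration_ms * 1_000_000
    return {
        "resourceSpans": [{
            "resource": {
                "attributes": [
                    {"key": "service.name", "value": {"stringValue": service_name}}
                ]
            },
            "scopeSpans": [{
                "scope": {"name": "manual-example"},  # hypothetical scope name
                "spans": [{
                    "traceId": secrets.token_hex(16),  # 32 hex characters
                    "spanId": secrets.token_hex(8),    # 16 hex characters
                    "name": span_name,
                    "kind": 2,  # SPAN_KIND_SERVER
                    # 64-bit integers are encoded as strings in OTLP/JSON.
                    "startTimeUnixNano": str(start_ns),
                    "endTimeUnixNano": str(end_ns),
                }],
            }],
        }]
    }


def send(payload, url=OTLP_TRACES_URL):
    """POST the payload as JSON; needs a Collector listening on the endpoint."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status


payload = build_trace_payload("checkout", "GET /checkout", duration_ms=120)
# send(payload)  # uncomment when a Collector is running locally
```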

Architecture

The following diagram shows how OpenTelemetry components fit into the stack:

flowchart TD
    subgraph Application
        SDK["OTel SDK<br/>(TracerProvider, MeterProvider, LoggerProvider)"]
        Auto["Auto-Instrumentation<br/>(framework hooks)"]
        Manual["Manual Instrumentation<br/>(custom spans, metrics)"]
        Auto --> SDK
        Manual --> SDK
    end

    SDK -->|"OTLP"| Collector["OTel Collector"]

    subgraph Collector Pipeline
        R["Receivers<br/>otlp (4317, 4318)"]
        P["Processors<br/>batch, memory_limiter,<br/>resourcedetection, transform"]
        E["Exporters<br/>otlp/opensearch, otlphttp/prometheus"]
        R --> P --> E
    end

    Collector --> DP["Data Prepper :21890"]
    Collector --> Prom["Prometheus :9090"]
    DP --> OS["OpenSearch"]
    Prom --> OSD["OpenSearch Dashboards"]
    OS --> OSD

Key points:

SDKs run in your application process. They create spans, record metrics, and bridge log frameworks.
Auto-instrumentation hooks into frameworks and libraries to generate telemetry without code changes.
Manual instrumentation lets you add custom spans, attributes, and metrics for business-specific observability.
The OTel Collector receives, processes, and exports telemetry. It runs as a standalone service (not embedded in your app).
Data Prepper receives traces and logs from the Collector and writes them to OpenSearch indices.
Prometheus receives metrics from the Collector over OTLP/HTTP.
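
The routing described in these points can be summarized as a small lookup table. This is a toy model for orientation only, not Collector configuration: the `Route` type and `route` function are hypothetical, and pairing the `otlp/opensearch` exporter with Data Prepper is an assumption based on the exporter names in the diagram.

```python
from typing import NamedTuple


class Route(NamedTuple):
    exporter: str  # exporter name as shown in the diagram
    backend: str   # where the signal ends up


# Assumption: the otlp/opensearch exporter is the one that feeds Data Prepper.
ROUTES = {
    "traces": Route("otlp/opensearch", "Data Prepper -> OpenSearch"),
    "logs": Route("otlp/opensearch", "Data Prepper -> OpenSearch"),
    "metrics": Route("otlphttp/prometheus", "Prometheus"),
}


def route(signal):
    """Return the export path for a signal type, or raise for unknown signals."""
    try:
        return ROUTES[signal]
    except KeyError:
        raise ValueError(f"unknown signal type: {signal!r}") from None
```

For example, `route("metrics").backend` evaluates to `"Prometheus"`, while traces and logs share the same path through Data Prepper.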

Semantic Conventions

OpenTelemetry defines semantic conventions — standardized attribute names for common concepts. The stack relies on these conventions for its dashboards and visualizations:

Convention	Prefix	Example Attributes
HTTP	`http.*`, `url.*`	`http.request.method`, `url.path`, `http.response.status_code`
Database	`db.*`	`db.system.name`, `db.query.text`, `db.operation.name`
RPC	`rpc.*`	`rpc.system`, `rpc.service`, `rpc.method`
Gen-AI	`gen_ai.*`	`gen_ai.system`, `gen_ai.request.model`, `gen_ai.usage.input_tokens`
Service	`service.*`	`service.name`, `service.version`, `service.namespace`

Using standard attribute names means the APM dashboards, service maps, and agent trace views work out of the box.
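
As a small sketch of what this looks like in practice, the example below sets HTTP span attributes using names from the table and flags keys that fall outside the listed prefixes. The attribute values and the `nonstandard_keys` helper are assumptions for illustration; the check is deliberately coarse and only compares prefixes, not full attribute names.

```python
# Attribute names come from the table above; the values are made up.
http_span_attributes = {
    "http.request.method": "GET",
    "url.path": "/checkout",
    "http.response.status_code": 200,
}

# Prefixes of the conventions listed in the table.
KNOWN_PREFIXES = ("http.", "url.", "db.", "rpc.", "gen_ai.", "service.")


def nonstandard_keys(attributes):
    """Return attribute keys that fall outside the convention prefixes listed above."""
    return [key for key in attributes if not key.startswith(KNOWN_PREFIXES)]


# An ad hoc key such as "httpMethod" would not match any convention prefix.
custom = nonstandard_keys({"httpMethod": "GET", "service.name": "checkout"})
```

A check like this could run in a unit test to catch ad hoc attribute names before they reach the dashboards.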

In This Section

Collector Configuration — Full walkthrough of the OTel Collector pipeline config
Auto-Instrumentation — Zero-code instrumentation for popular languages
Manual Instrumentation — Custom spans, metrics, and logs with the OTel SDK
Sampling Strategies — Control data volume with head-based and tail-based sampling

Related Links

Send Data Overview — Full ingestion architecture
Agent Traces — AI agent trace visualization
APM Services — Service health monitoring
What is OpenTelemetry? — Official overview of the OpenTelemetry project
OpenTelemetry Semantic Conventions — Standard attribute naming reference