grok

The grok command parses a text field using grok pattern syntax and appends the extracted fields to the search results. Grok provides over 200 predefined patterns (%{IP}, %{NUMBER}, %{HOSTNAME}, etc.) that wrap common regular expressions, making extraction more readable and less error-prone than writing raw regex.

grok <field> <grok-pattern>

| Argument | Required | Description |
|---|---|---|
| `<field>` | Yes | The text field to parse. |
| `<grok-pattern>` | Yes | A grok pattern using %{PATTERN:fieldname} syntax. Each %{PATTERN:fieldname} creates a new string field. If a field with the same name already exists, it is overwritten. Raw regex can be mixed with grok patterns. |

  • Grok patterns are built on top of regular expressions but provide a more readable, reusable syntax.
  • Use the %{PATTERN:fieldname} syntax to extract a named field. If you omit :fieldname, the match is consumed but no field is created.
  • The grok pattern must match the entire string from start to end for extraction to succeed. Use %{GREEDYDATA} or %{GREEDYDATA:name} at the end of your pattern to consume any remaining text (including trailing newlines via [\s\S]).
  • When parsing a null field, the result is an empty string.
  • Each unnamed %{PATTERN} must be unique within a single grok expression, or you will get a “Duplicate key” error. Give each pattern a unique field name to avoid this.
  • Grok shares the same limitations as the parse command.
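
The notes above can be illustrated with a small Python emulation. This is only a sketch of the semantics, not how the engine implements grok: the `PATTERNS` table holds simplified stand-in regexes for four patterns, the helper names `grok_to_regex` and `grok` are hypothetical, and returning empty strings when the pattern does not match is an assumption inferred from the `where length(...) > 0` filters used in the examples on this page.

```python
import re

# Simplified stand-ins for a few predefined patterns (assumed definitions,
# not the full library that ships with the engine).
PATTERNS = {
    "WORD": r"\b\w+\b",
    "NUMBER": r"[+-]?\d+(?:\.\d+)?",
    "DATA": r".*?",
    "GREEDYDATA": r".*",
}

# Matches %{NAME} and %{NAME:field}.
TOKEN = re.compile(r"%\{(\w+)(?::(\w+))?\}")


def grok_to_regex(pattern):
    """Expand grok tokens; text outside the tokens is kept as raw regex."""
    def expand(match):
        name, field = match.group(1), match.group(2)
        body = PATTERNS[name]
        # Named token -> named capture group; unnamed token -> consume only.
        return f"(?P<{field}>{body})" if field else f"(?:{body})"
    return TOKEN.sub(expand, pattern)


def grok(text, pattern):
    """Return extracted fields as strings; every field is "" on no match."""
    regex = re.compile(grok_to_regex(pattern))
    match = regex.fullmatch(text or "")  # whole-string match; null -> ""
    if match is None:
        return {field: "" for field in regex.groupindex}
    return match.groupdict()


line = "[Broker id=1] Creating new partition"
named = grok(line, r"\[%{DATA:component} id=%{NUMBER:brokerId}\] %{GREEDYDATA:message}")
unnamed = grok(line, r"\[%{DATA}\] %{GREEDYDATA:body}")
partial = grok(line, r"\[%{DATA:prefix}\]")  # no trailing GREEDYDATA

print(named)    # {'component': 'Broker', 'brokerId': '1', 'message': 'Creating new partition'}
print(unnamed)  # {'body': 'Creating new partition'}
print(partial)  # {'prefix': ''}
```

The third call fails to extract anything because the pattern stops at the closing bracket while the input continues, which is why the examples on this page end their patterns with %{GREEDYDATA:...}.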

| Pattern | Matches | Example |
|---|---|---|
| %{IP:ip} | IPv4 or IPv6 address | 192.168.1.1 |
| %{NUMBER:num} | Integer or floating-point number | 42, 3.14 |
| %{WORD:word} | Single word (no whitespace) | ERROR |
| %{HOSTNAME:host} | Hostname or FQDN | api.example.com |
| %{GREEDYDATA:msg} | Everything (greedy match) | any remaining text |
| %{IPORHOST:server} | IP address or hostname | 10.0.0.1 or web01 |
| %{URI:url} | Full URI | https://example.com/path?q=1 |
| %{URIPATH:path} | URI path component | /api/v1/agents |
| %{POSINT:code} | Positive integer | 200, 404 |
| %{DATA:val} | Non-greedy match (minimal) | short text segments |
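
The difference between %{DATA} and %{GREEDYDATA} matters most when a delimiter appears more than once in the input. The Python sketch below assumes the usual grok definitions (DATA as the lazy regex `.*?`, GREEDYDATA as the greedy `.*`) and uses a made-up input string purely for illustration.

```python
import re

text = "[a] [b] rest"  # hypothetical input with two bracketed segments

# DATA-style (lazy) capture stops at the first "] " it can.
lazy = re.fullmatch(r"\[(?P<val>.*?)\] (?P<rest>.*)", text)

# GREEDYDATA-style (greedy) capture runs to the last "] " it can.
greedy = re.fullmatch(r"\[(?P<val>.*)\] (?P<rest>.*)", text)

print(lazy.groupdict())    # {'val': 'a', 'rest': '[b] rest'}
print(greedy.groupdict())  # {'val': 'a] [b', 'rest': 'rest'}
```

This is why the examples on this page use %{DATA} for delimited segments in the middle of a line and reserve %{GREEDYDATA} for the tail.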

Extract HTTP method, path, and status from Envoy access logs

The frontend-proxy service emits Envoy access logs in the body field. Use grok patterns to parse the timestamp, HTTP method, request path, and response status:

source=logs-otel-v1*
| where like(body, '%HTTP/1.1"%')
| grok body '\[%{DATA:ts}\] "%{WORD:method} %{DATA:path} HTTP/%{DATA:ver}" %{POSINT:status} %{GREEDYDATA:rest}'
| head 20

| body | method | path | status |
|---|---|---|---|
| [2026-02-26T18:04:21.634Z] "GET /api/data HTTP/1.1" 200 - via_upstream … | GET | /api/data | 200 |
| [2026-02-26T18:04:23.059Z] "POST /api/product-ask-ai-assistant/0PUK6V6EV0 HTTP/1.1" 200 … | POST | /api/product-ask-ai-assistant/0PUK6V6EV0 | 200 |
| [2026-02-26T18:04:21.629Z] "GET /api/data/ HTTP/1.1" 308 - via_upstream … | GET | /api/data/ | 308 |

Strip the Kafka broker prefix from log bodies, keeping only the message content:

source=logs-otel-v1*
| where `resource.attributes.service.name` = 'kafka'
| where like(body, '%Broker%Creating%')
| grok body '\[%{DATA}\] %{GREEDYDATA:body}'
| head 20

| body |
|---|
| Creating new partition __consumer_offsets-33 with topic id _xZjVwc_TO2HCCnHkcNIDg. |
| Creating new partition __consumer_offsets-15 with topic id _xZjVwc_TO2HCCnHkcNIDg. |
| Creating new partition __consumer_offsets-48 with topic id _xZjVwc_TO2HCCnHkcNIDg. |

Extract component name and broker ID from Kafka logs

Use grok to parse the [Component id=N] prefix from Kafka broker log bodies:

source=logs-otel-v1*
| where `resource.attributes.service.name` = 'kafka'
| grok body '\[%{DATA:component} id=%{NUMBER:brokerId}\] %{GREEDYDATA:message}'
| where length(component) > 0
| head 20

| body | component | brokerId | message |
|---|---|---|---|
| [Broker id=1] Creating new partition __consumer_offsets-33 … | Broker | 1 | Creating new partition __consumer_offsets-33 … |
| [RaftManager id=1] Completed transition to Leader … | RaftManager | 1 | Completed transition to Leader … |
| [QuorumController id=1] The request from broker 1 … | QuorumController | 1 | The request from broker 1 … |

Aggregate HTTP requests by method and status

Parse Envoy access logs and count requests grouped by HTTP method and status code:

source=logs-otel-v1*
| where `resource.attributes.service.name` = 'frontend-proxy'
| head 1000
| grok body '\[%{DATA:ts}\] "%{WORD:method} %{DATA:path} HTTP/%{DATA:ver}" %{POSINT:status} %{GREEDYDATA:rest}'
| where length(method) > 0
| stats count() as requests by method, status
| sort - requests
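
For readers who want to check the pattern offline, the same parse, filter, and count flow can be approximated in Python. This is a rough sketch rather than an equivalent of the engine: the regex is a hand expansion of the grok pattern under assumed definitions of WORD, POSINT, DATA, and GREEDYDATA, and the input list reuses the truncated sample bodies shown earlier on this page (ellipses kept as-is) plus one Kafka line that should not match.

```python
import re
from collections import Counter

# Hand-expanded regex for the grok pattern above (assumed pattern definitions).
ACCESS_LOG = re.compile(
    r'\[(?P<ts>.*?)\] "(?P<method>\b\w+\b) (?P<path>.*?) HTTP/(?P<ver>.*?)" '
    r'(?P<status>\b[1-9][0-9]*\b) (?P<rest>.*)'
)

bodies = [
    '[2026-02-26T18:04:21.634Z] "GET /api/data HTTP/1.1" 200 - via_upstream …',
    '[2026-02-26T18:04:23.059Z] "POST /api/product-ask-ai-assistant/0PUK6V6EV0 HTTP/1.1" 200 …',
    '[2026-02-26T18:04:21.629Z] "GET /api/data/ HTTP/1.1" 308 - via_upstream …',
    '[Broker id=1] Creating new partition __consumer_offsets-33 …',  # not an access log
]

requests = Counter()
for body in bodies:
    match = ACCESS_LOG.fullmatch(body)
    if match is None:  # plays the role of: where length(method) > 0
        continue
    # status stays a string, as grok-extracted fields are strings
    requests[(match["method"], match["status"])] += 1

# Rough counterpart of: stats count() as requests by method, status | sort - requests
for (method, status), count in requests.most_common():
    print(method, status, count)
```

Each of the three access-log lines lands in its own (method, status) group with a count of 1, and the Kafka line is skipped.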

Extract the first word from OTel log bodies

OpenTelemetry log bodies often start with a keyword that indicates the log type. Use grok to extract the first word and aggregate:

source=logs-otel-v1*
| head 1000
| grok body '%{WORD:first} %{GREEDYDATA:rest}'
| where length(first) > 0
| stats count() as occurrences by first
| sort - occurrences
| head 20

This extracts the first word from each log body, then counts occurrences to identify the most common log message prefixes across all services.

Identify top endpoints from Envoy access logs

Parse Envoy access log bodies and aggregate by HTTP method and request path to find the busiest endpoints:

source=logs-otel-v1*
| where `resource.attributes.service.name` = 'frontend-proxy'
| head 1000
| grok body '\[%{DATA:ts}\] "%{WORD:method} %{DATA:path} HTTP/%{DATA:ver}" %{POSINT:status} %{GREEDYDATA:rest}'
| where length(method) > 0
| stats count() as requests by method, path
| sort - requests
| head 20

  • parse — extract fields using raw Java regex (more control, less readability)
  • rex — regex extraction with sed-mode text replacement and multiple matches
  • patterns — automatically discover log patterns without writing any patterns