References PPL syntax, commands (source, where, fields), functions, API endpoints, and bash/curl examples for OpenSearch observability queries on OTel traces and logs.
How this skill is triggered — by the user, by Claude, or both
Slash command
/opensearch@observability:ppl-referenceThis skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
This is a comprehensive reference for the Piped Processing Language (PPL) used by OpenSearch. PPL queries follow a pipe-delimited syntax starting with `source=<index>` and chaining commands with `|`. This reference covers all commands, functions, API endpoints, and usage patterns needed to construct observability queries against trace and log indices.
This is a comprehensive reference for the Piped Processing Language (PPL) used by OpenSearch. PPL queries follow a pipe-delimited syntax starting with source=<index> and chaining commands with |. This reference covers all commands, functions, API endpoints, and usage patterns needed to construct observability queries against trace and log indices.
Grammar sourced from the opensearch-project/sql repository's docs/user/ppl/ directory:
https://github.com/opensearch-project/sql
| Variable | Default | Description |
|---|---|---|
OPENSEARCH_ENDPOINT | https://localhost:9200 | OpenSearch base URL |
OPENSEARCH_USER | admin | OpenSearch username |
OPENSEARCH_PASSWORD | My_password_123!@# | OpenSearch password |
Field names containing dots must be enclosed in backticks to avoid parsing errors:
`attributes.gen_ai.operation.name`
`attributes.gen_ai.usage.input_tokens`
`status.code`
`events.attributes.exception.type`
`@timestamp`
This is critical for OTel attribute fields which use dotted naming conventions.
Execute a PPL query:
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() by serviceName"}'
Request body: {"query": "<ppl_query>"}
Retrieve the query execution plan (useful for debugging and profiling):
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl/_explain" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where `status.code` = 2 | stats count() by serviceName"}'
Start a query by specifying the data source index pattern.
Syntax: search source=<index-pattern> or source=<index-pattern>
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | head 10"}'
Filter results based on a condition.
Syntax: where <condition>
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where `status.code` = 2 | head 10"}'
Supports: =, !=, <, >, <=, >=, AND, OR, NOT, LIKE, IN, BETWEEN, IS NULL, IS NOT NULL.
Select specific fields to return.
Syntax: fields [+|-] <field-list>
Use + to include or - to exclude fields.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | fields traceId, spanId, serviceName, durationInNanos | head 10"}'
Aggregate data using statistical functions.
Syntax: stats <aggregation>... [by <field-list>]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() as span_count, avg(durationInNanos) as avg_duration by serviceName"}'
Supports: count(), sum(), avg(), max(), min(), var_samp(), var_pop(), stddev_samp(), stddev_pop(), distinct_count(), percentile(), earliest(), latest(), list(), values(), first(), last().
Order results by one or more fields.
Syntax: sort [+|-] <field> [, ...]
Use + for ascending (default), - for descending.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | sort - durationInNanos | head 10"}'
Limit the number of results returned.
Syntax: head [N] (default N=10)
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | head 5"}'
Compute new fields from expressions.
Syntax: eval <new-field> = <expression> [, ...]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eval duration_ms = durationInNanos / 1000000 | fields traceId, serviceName, duration_ms | sort - duration_ms | head 10"}'
Remove duplicate results based on field values.
Syntax: dedup [N] <field-list> [keepempty=<bool>] [consecutive=<bool>]
Caveat:
dedupmay throw a ClassCastException on fields with mixed types (e.g., a field that contains both strings and numbers across documents). Ensure the dedup field has a consistent type.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | dedup serviceName | fields serviceName"}'
Rename one or more fields.
Syntax: rename <old-field> AS <new-field> [, ...]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | rename serviceName as service, durationInNanos as duration | fields traceId, service, duration | head 10"}'
Find the most frequent values for a field.
Syntax: top [N] <field> [by <group-field>]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | top 5 serviceName"}'
Find the least frequent values for a field.
Syntax: rare <field> [by <group-field>]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | rare `attributes.gen_ai.operation.name`"}'
Display results in tabular format (alias for fields in some contexts).
Syntax: table <field-list>
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where `status.code` = 2 | table traceId, spanId, serviceName, name | head 10"}'
Aggregate data into time buckets for time-series visualization.
Syntax: timechart span=<interval> <aggregation>... [by <field>]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | timechart span=5m count() as span_count by serviceName"}'
Rate functions for timechart: per_second(), per_minute(), per_hour(), per_day() — compute rate of an aggregation per time unit.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | timechart span=1h per_minute(count()) as spans_per_min by serviceName"}'
General charting command for aggregation over arbitrary fields.
Syntax: chart <aggregation>... by <field> [span(<field>, <interval>)]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | chart avg(durationInNanos) by serviceName"}'
Bucket numeric or date values into intervals.
Syntax: eval <new-field> = bin(<field>, <interval>) or used within stats ... by span(<field>, <interval>)
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() by span(durationInNanos, 1000000000)"}'
Calculate moving averages over sorted data.
Syntax: trendline [sort <field>] sma(<period>, <field>) [as <alias>]
SMA = Simple Moving Average.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | trendline sort startTime sma(10, durationInNanos) as avg_duration | fields startTime, durationInNanos, avg_duration | head 50"}'
Compute running (cumulative) statistics over ordered results.
Syntax: streamstats <aggregation>... [by <field>]
Caveat:
streamstatsprocesses all matching rows in memory. On large indices, this will fail with "insufficient resources" errors. Always add| head Nbeforestreamstatsto limit data volume.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | sort startTime | streamstats count() as running_count, sum(`attributes.gen_ai.usage.input_tokens`) as cumulative_tokens | fields startTime, running_count, cumulative_tokens | head 50"}'
Add aggregation results as new fields to each row without collapsing rows (unlike stats).
Syntax: eventstats <aggregation>... [by <field>]
Caveat:
eventstatsprocesses all matching rows in memory. On large indices, this will fail with "insufficient resources" errors. Always add| head Nbeforeeventstatsto limit data volume.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eventstats avg(durationInNanos) as avg_svc_duration by serviceName | eval deviation = durationInNanos - avg_svc_duration | fields traceId, serviceName, durationInNanos, avg_svc_duration, deviation | sort - deviation | head 20"}'
Extract fields from text using a regular expression with named groups.
Syntax: parse <field> '<regex-with-named-groups>'
Caveat:
parsemay silently drop extracted fields on some OpenSearch versions. If extracted fields are missing from results, usegrokorrexas more reliable alternatives.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=logs-otel-v1-* | parse body '\''(?P<level>\\w+): (?P<msg>.+)'\'' | fields level, msg | head 10"}'
Extract fields using Grok patterns (predefined regex patterns).
Syntax: grok <field> '<grok-pattern>'
Caveat:
grokprocesses all matching rows in memory. On large indices, this will fail with "insufficient resources" errors. Always add| head Nbeforegrokto limit data volume.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=logs-otel-v1-* | grok body '\''%{LOGLEVEL:level} %{GREEDYDATA:message}'\'' | fields level, message | head 10"}'
Extract fields using named capture groups (similar to parse but with Splunk-compatible syntax).
Syntax: rex field=<field> '<regex-with-named-groups>'
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=logs-otel-v1-* | rex field=body '\''(?<statuscode>\\d{3})'\'' | fields statuscode, body | head 10"}'
Filter results using a regular expression match on a field.
Syntax: <field> = regex '<pattern>' (used within where)
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=logs-otel-v1-* | where body like '\''%error%'\'' | fields traceId, body, severityText | head 10"}'
Auto-discover log patterns by clustering similar log messages.
Syntax: patterns <field> [pattern='<regex>'] [new_field=<name>]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=logs-otel-v1-* | patterns body | fields body, patterns_field | head 20"}'
Extract values from structured data (JSON, XML) using path expressions.
Syntax: spath input=<field> [path=<path>] [output=<field>]
Note: Verify the target field exists in your index before using
spath. Rundescribe <index-pattern>first to confirm the field name. The example below uses a representative field; adjust to match your actual schema.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where isnotnull(`attributes.gen_ai.tool.name`) | spath input=`attributes.gen_ai.tool.name` | head 10"}'
Join results from two indices.
Syntax: join left=<alias> right=<alias> ON <condition> <right-source> or join <right-source> on <field>
Types: inner, left, right, cross.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | join left=s right=l ON s.traceId = l.traceId logs-otel-v1-* | fields s.spanId, s.name, l.severityText, l.body | head 10"}'
Enrich results by looking up values from another index.
Syntax: lookup <lookup-index> <match-field> [AS <alias>] [OUTPUT <field-list>]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | lookup otel-v2-apm-service-map serviceName AS `sourceNode` | fields serviceName, `targetNode`, durationInNanos | head 10"}'
Note: The service map index (
otel-v2-apm-service-map) uses nested fieldssourceNodeandtargetNode, notserviceName. Match accordingly when joining or looking up against this index.
Perform graph traversal lookups for hierarchical or connected data.
Syntax: graphlookup <index> connectFromField=<field> connectToField=<field> [maxDepth=<N>] [as <alias>]
Caveat:
graphlookuphas limited support in OpenSearch 3.x PPL and may not work as expected. Test carefully against your OpenSearch version before relying on this command in production queries.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v2-apm-service-map | graphlookup otel-v2-apm-service-map connectFromField=`destination.domain` connectToField=serviceName maxDepth=3 as dependencies | head 10"}'
Use a nested query as a data source or filter.
Syntax: where <field> IN [ source=<index> | ... | fields <field> ]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where traceId IN [ source=otel-v1-apm-span-* | where `status.code` = 2 | fields traceId ] | fields traceId, spanId, serviceName, name | head 20"}'
Append results from another query to the current result set.
Syntax: append [ source=<index> | ... ]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() as cnt by serviceName | append [ source=logs-otel-v1-* | stats count() as cnt by `resource.attributes.service.name` ] | head 20"}'
Append columns from another query to the current result set.
Syntax: appendcol [ <commands> ]
Caveat:
source=is not valid insideappendcol []subqueries. The subquery insideappendcoloperates on the current result set, not a new index. Useappendfollowed by reshaping if you need to bring in data from another index.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() as span_count | appendcol [ source=logs-otel-v1-* | stats count() as log_count ]"}'
Append the results of a sub-pipeline to the current results.
Syntax: appendpipe [ <commands> ]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() as cnt by serviceName | appendpipe [ stats sum(cnt) as total ]"}'
Replace null values with a specified value.
Syntax: fillnull [with <value>] [<field-list>]
Caveat: Backtick-quoted field names are not supported in the
fillnullfield list. Useevalto rename dotted fields to simple names before applyingfillnull, or applyfillnullwithout a field list to fill all null fields.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | fillnull with 0 `attributes.gen_ai.usage.input_tokens`, `attributes.gen_ai.usage.output_tokens` | fields traceId, `attributes.gen_ai.usage.input_tokens`, `attributes.gen_ai.usage.output_tokens` | head 10"}'
Flatten nested fields into top-level fields.
Syntax: flatten <field>
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | flatten events | head 10"}'
Expand multi-value or array fields into separate rows.
Syntax: expand <field>
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | expand events | fields traceId, spanId, events | head 20"}'
Pivot rows into columns.
Syntax: transpose [<N>] [include_null=<bool>]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() as cnt by serviceName | transpose"}'
Convert field types (e.g., string to number).
Syntax: convert <function>(<field>) [as <alias>]
Functions: auto(), num(), ip(), ctime(), dur2sec(), mktime(), mstime(), rmcomma(), rmunit(), memk(), none().
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eval duration_str = CAST(durationInNanos AS STRING) | convert num(duration_str) as duration_num | fields traceId, duration_num | head 10"}'
Replace values in a field using the replace() string function inside eval.
Syntax: eval <field> = replace(<field>, '<old>', '<new>')
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=logs-otel-v1-* | eval severityText = replace(severityText, '\''ERROR'\'', '\''ERR'\'') | fields severityText, body | head 10"}'
Reverse the order of results.
Syntax: reverse
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | sort startTime | head 20 | reverse"}'
Expand a multi-value field into separate rows (one row per value).
Syntax: mvexpand <field>
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | mvexpand events | fields traceId, spanId, events | head 20"}'
Combine multiple rows with the same key into a single row with a multi-value field.
Syntax: mvcombine <field>
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | fields traceId, serviceName | mvcombine serviceName | head 10"}'
Convert a multi-value field to a single-value field (takes the first value or joins with a delimiter).
Syntax: nomv <field>
Caveat:
nomvonly works on string arrays, not nested object arrays. If the field contains nested objects (e.g.,eventswith sub-fields),nomvwill fail or produce unexpected results. Useflattenorexpandfor nested object arrays instead.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | nomv events | fields traceId, events | head 10"}'
Add a summary row at the bottom with column totals.
Syntax: addcoltotals [<field-list>]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() as cnt by serviceName | addcoltotals"}'
Add a new field to each row containing the sum of specified numeric fields.
Syntax: addtotals [row=<bool>] [col=<bool>] [<field-list>]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats sum(`attributes.gen_ai.usage.input_tokens`) as input_tok, sum(`attributes.gen_ai.usage.output_tokens`) as output_tok by serviceName | addtotals"}'
Anomaly detection — identify anomalous values in numeric fields using built-in ML algorithms.
Syntax: ad [time_field=<field>] [number_of_trees=<N>] [shingle_size=<N>] [time_zone=<tz>]
Note: The
adcommand does not take a positional field argument. It auto-detects the input field(s) from the precedingstatsorevaloutput in the pipeline.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where durationInNanos > 0 | ad time_field=startTime number_of_trees=100 time_zone=\"UTC\" | head 50"}'
Cluster data points using the k-means algorithm.
Syntax: kmeans [centroids=<N>] [iterations=<N>] [distance_type=<type>]
Note: The
kmeanscommand does not take positional field arguments. It operates on all numeric fields from the preceding pipeline output. Usefieldsorevalbeforekmeansto control which numeric fields are used for clustering.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where durationInNanos > 0 | fields traceId, serviceName, durationInNanos | kmeans centroids=3 | fields traceId, serviceName, durationInNanos, ClusterID | head 30"}'
General ML command for running machine learning algorithms on query results.
Syntax: ml action=<algorithm> [parameters...]
Supported algorithms include: kmeans, ad (anomaly detection).
Note:
ml action=rcfis not a valid action in OpenSearch 3.x PPL. Random Cut Forest anomaly detection is accessed via theadcommand directly (see theadsection above), not throughml action=rcf.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where durationInNanos > 0 | ml action=kmeans centroids=3 | head 50"}'
Inspect the index mapping and field types for an index.
Syntax: describe <index-pattern>
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "describe otel-v1-apm-span-*"}'
Show the query execution plan (used via the _explain API endpoint rather than as an inline command).
Syntax: Use the /_plugins/_ppl/_explain endpoint with the query body.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl/_explain" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where `status.code` = 2 | stats count() by serviceName"}'
List all configured data sources available for PPL queries.
Syntax: show datasources
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "show datasources"}'
Execute multiple PPL queries in a single request. Each query is independent.
Syntax: Use the /_plugins/_ppl endpoint with multiple queries separated by newlines (NDJSON format), or execute sequentially.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() as total_spans"}'
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=logs-otel-v1-* | stats count() as total_logs"}'
Format the display of a field's values without changing the underlying data.
Syntax: fieldformat <field> = <format-expression>
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eval duration_ms = durationInNanos / 1000000 | fieldformat duration_ms = CONCAT(CAST(duration_ms AS STRING), '\'' ms'\'') | fields traceId, serviceName, duration_ms | head 10"}'
The span() function buckets numeric or datetime values into intervals. Used with stats, timechart, and chart.
Syntax: span(<field>, <interval>)
| Unit | Description | Example |
|---|---|---|
ms | Milliseconds | span(startTime, 500ms) |
s | Seconds | span(startTime, 30s) |
m | Minutes | span(startTime, 5m) |
h | Hours | span(startTime, 1h) |
d | Days | span(startTime, 1d) |
w | Weeks | span(startTime, 1w) |
M | Months | span(startTime, 1M) |
q | Quarters | span(startTime, 1q) |
y | Years | span(startTime, 1y) |
For numeric fields, the interval is a plain number:
stats count() by span(durationInNanos, 1000000000)
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() as span_count, avg(durationInNanos) as avg_duration by span(startTime, 1h)"}'
Rate functions normalize aggregation values to a per-time-unit rate within timechart:
| Function | Description |
|---|---|
per_second(<agg>) | Aggregation value per second |
per_minute(<agg>) | Aggregation value per minute |
per_hour(<agg>) | Aggregation value per hour |
per_day(<agg>) | Aggregation value per day |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | timechart span=5m per_second(count()) as requests_per_sec"}'
Used with stats, eventstats, streamstats, timechart, and chart commands.
| Function | Syntax | Description |
|---|---|---|
COUNT | count() | Count of events |
SUM | sum(field) | Sum of numeric values |
AVG | avg(field) | Arithmetic mean |
MAX | max(field) | Maximum value |
MIN | min(field) | Minimum value |
VAR_SAMP | var_samp(field) | Sample variance |
VAR_POP | var_pop(field) | Population variance |
STDDEV_SAMP | stddev_samp(field) | Sample standard deviation |
STDDEV_POP | stddev_pop(field) | Population standard deviation |
DISTINCT_COUNT | distinct_count(field) | Count of distinct values |
PERCENTILE | percentile(field, pct) | Value at the given percentile |
EARLIEST | earliest(field) | Earliest (first chronological) value |
LATEST | latest(field) | Latest (most recent) value |
LIST | list(field) | All values as a list |
VALUES | values(field) | Distinct values as a list |
FIRST | first(field) | First value encountered |
LAST | last(field) | Last value encountered |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() as total, avg(durationInNanos) as avg_ns, percentile(durationInNanos, 95) as p95_ns, distinct_count(serviceName) as services"}'
Functions for working with multi-value fields and arrays.
| Function | Syntax | Description |
|---|---|---|
ARRAY | array(val1, val2, ...) | Create an array from values |
SPLIT | split(field, delimiter) | Split a string into an array |
MVJOIN | mvjoin(field, delimiter) | Join multi-value field into a string |
MVCOUNT | mvcount(field) | Count of values in a multi-value field |
MVINDEX | mvindex(field, index) | Get value at index from multi-value field |
MVFIRST | mvfirst(field) | First value of a multi-value field |
MVLAST | mvlast(field) | Last value of a multi-value field |
MVAPPEND | mvappend(field1, field2) | Append two multi-value fields |
MVDEDUP | mvdedup(field) | Remove duplicates from multi-value field |
MVSORT | mvsort(field) | Sort values in a multi-value field |
MVZIP | mvzip(field1, field2, delim) | Zip two multi-value fields together |
MVRANGE | mvrange(start, end, step) | Generate a range of numeric values |
MVFILTER | mvfilter(expression) | Filter values in a multi-value field |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eval tokens = array(`attributes.gen_ai.usage.input_tokens`, `attributes.gen_ai.usage.output_tokens`) | fields traceId, tokens | head 10"}'
Functions for conditional logic and null handling.
| Function | Syntax | Description |
|---|---|---|
ISNULL | isnull(field) | Returns true if field is null |
ISNOTNULL | isnotnull(field) | Returns true if field is not null |
IF | if(cond, true_val, false_val) | Conditional expression |
IFNULL | ifnull(field, default) | Return default if field is null |
NULLIF | nullif(val1, val2) | Return null if val1 equals val2 |
CASE | case(cond1, val1, cond2, val2, ..., else_val) | Multi-branch conditional |
COALESCE | coalesce(val1, val2, ...) | First non-null value |
LIKE | field LIKE 'pattern' | Wildcard pattern match (% and _) |
IN | field IN (val1, val2, ...) | Check membership in a set |
BETWEEN | field BETWEEN val1 AND val2 | Range check (inclusive) |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eval status_label = case(`status.code` = 0, '\''UNSET'\'', `status.code` = 1, '\''OK'\'', `status.code` = 2, '\''ERROR'\'') | stats count() by status_label"}'
Functions for type casting and conversion.
| Function | Syntax | Description |
|---|---|---|
CAST | cast(field AS type) | Cast to a specified type (STRING, INT, LONG, FLOAT, DOUBLE, BOOLEAN, DATE, TIMESTAMP) |
TOSTRING | tostring(field) | Convert to string |
TONUMBER | tonumber(field) | Convert to number |
TOINT | toint(field) | Convert to integer |
TOLONG | tolong(field) | Convert to long |
TOFLOAT | tofloat(field) | Convert to float |
TODOUBLE | todouble(field) | Convert to double |
TOBOOLEAN | toboolean(field) | Convert to boolean |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eval duration_ms = CAST(durationInNanos AS DOUBLE) / 1000000.0 | fields traceId, serviceName, duration_ms | sort - duration_ms | head 10"}'
Functions for computing hash digests.
| Function | Syntax | Description |
|---|---|---|
MD5 | md5(field) | MD5 hash of the value |
SHA1 | sha1(field) | SHA-1 hash of the value |
SHA2 | sha2(field, numBits) | SHA-2 hash (numBits: 224, 256, 384, 512) |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eval trace_hash = md5(traceId) | fields traceId, trace_hash | head 5"}'
Functions for date and time manipulation.
| Function | Syntax | Description |
|---|---|---|
NOW | now() | Current timestamp |
CURDATE | curdate() | Current date |
CURTIME | curtime() | Current time |
DATE_FORMAT | date_format(date, fmt) | Format a date (%Y-%m-%d %H:%i:%s) |
DATE_ADD | date_add(date, INTERVAL n unit) | Add interval to date |
DATE_SUB | date_sub(date, INTERVAL n unit) | Subtract interval from date |
DATEDIFF | datediff(date1, date2) | Difference in days between two dates |
DAY | day(date) | Day of month (1–31) |
MONTH | month(date) | Month (1–12) |
YEAR | year(date) | Year |
HOUR | hour(time) | Hour (0–23) |
MINUTE | minute(time) | Minute (0–59) |
SECOND | second(time) | Second (0–59) |
DAYOFWEEK | dayofweek(date) | Day of week (1=Sun, 7=Sat) |
DAYOFYEAR | dayofyear(date) | Day of year (1–366) |
WEEK | week(date) | Week number of the year |
UNIX_TIMESTAMP | unix_timestamp(date) | Convert to Unix epoch seconds |
FROM_UNIXTIME | from_unixtime(epoch) | Convert Unix epoch to timestamp |
TIMESTAMPADD | timestampadd(unit, n, ts) | Add interval to timestamp |
TIMESTAMPDIFF | timestampdiff(unit, ts1, ts2) | Difference between timestamps in given unit |
PERIOD_ADD | period_add(period, n) | Add months to a period (YYMM/YYYYMM) |
PERIOD_DIFF | period_diff(p1, p2) | Difference in months between periods |
MAKETIME | maketime(h, m, s) | Create a time value |
MAKEDATE | makedate(year, dayofyear) | Create a date from year and day-of-year |
ADDDATE | adddate(date, INTERVAL n unit) | Alias for DATE_ADD |
SUBDATE | subdate(date, INTERVAL n unit) | Alias for DATE_SUB |
SYSDATE | sysdate() | Current date and time (evaluated at execution) |
UTC_DATE | utc_date() | Current UTC date |
UTC_TIME | utc_time() | Current UTC time |
UTC_TIMESTAMP | utc_timestamp() | Current UTC timestamp |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where startTime > DATE_SUB(NOW(), INTERVAL 1 HOUR) | stats count() as recent_spans by serviceName"}'
Operators for arithmetic, comparison, and logical expressions used in eval, where, and other commands.
| Operator | Description | Example |
|---|---|---|
+ | Addition | eval total = input_tokens + output_tokens |
- | Subtraction | eval gap = endTime - startTime |
* | Multiplication | eval cost = tokens * price_per_token |
/ | Division | eval duration_ms = durationInNanos / 1000000 |
| Operator | Description |
|---|---|
= | Equal to |
!= or <> | Not equal to |
< | Less than |
> | Greater than |
<= | Less than or equal to |
>= | Greater than or equal to |
| Operator | Description |
|---|---|
AND | Logical AND |
OR | Logical OR |
NOT | Logical NOT |
XOR | Logical exclusive OR |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eval duration_ms = durationInNanos / 1000000, total_tokens = `attributes.gen_ai.usage.input_tokens` + `attributes.gen_ai.usage.output_tokens` | where duration_ms > 1000 AND total_tokens > 0 | fields traceId, serviceName, duration_ms, total_tokens | head 10"}'
Functions for IP address operations.
| Function | Syntax | Description |
|---|---|---|
CIDRMATCH | cidrmatch(ip_field, 'cidr') | Check if IP is within a CIDR range |
GEOIP | geoip(ip_field) | Geo-locate an IP address (returns country, region, city, lat/lon) |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where isnotnull(`attributes.net.peer.ip`) | where cidrmatch(`attributes.net.peer.ip`, '\''10.0.0.0/8'\'') | fields traceId, `attributes.net.peer.ip`, serviceName | head 10"}'
Functions for working with JSON data.
| Function | Syntax | Description |
|---|---|---|
JSON_EXTRACT | json_extract(field, path) | Extract value at JSON path |
JSON_KEYS | json_keys(field) | Get all keys from a JSON object |
JSON_VALID | json_valid(field) | Check if value is valid JSON |
JSON_ARRAY | json_array(val1, val2, ...) | Create a JSON array |
JSON_OBJECT | json_object(key1, val1, ...) | Create a JSON object |
JSON_ARRAY_LENGTH | json_array_length(field) | Length of a JSON array |
JSON_EXTRACT_PATH_TEXT | json_extract_path_text(field, path) | Extract value as text from JSON path |
TO_JSON_STRING | to_json_string(field) | Convert value to JSON string |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where json_valid(`attributes.gen_ai.tool.call.arguments`) | eval tool_args = json_extract(`attributes.gen_ai.tool.call.arguments`, '\''$'\'') | fields traceId, `attributes.gen_ai.tool.name`, tool_args | head 10"}'
Functions for mathematical operations.
| Function | Syntax | Description |
|---|---|---|
ABS | abs(val) | Absolute value |
CEIL | ceil(val) | Round up to nearest integer |
FLOOR | floor(val) | Round down to nearest integer |
ROUND | round(val [, decimals]) | Round to N decimal places |
SQRT | sqrt(val) | Square root |
POW | pow(base, exp) | Exponentiation |
MOD | mod(a, b) | Modulo (remainder) |
LOG | log(val) | Natural logarithm |
LOG2 | log2(val) | Base-2 logarithm |
LOG10 | log10(val) | Base-10 logarithm |
LN | ln(val) | Natural logarithm (alias for LOG) |
EXP | exp(val) | e raised to the power of val |
SIGN | sign(val) | Sign of value (-1, 0, 1) |
TRUNCATE | truncate(val, decimals) | Truncate to N decimal places |
PI | pi() | Value of π |
E | e() | Value of Euler's number |
RAND | rand([seed]) | Random float between 0 and 1 |
ACOS | acos(val) | Arc cosine |
ASIN | asin(val) | Arc sine |
ATAN | atan(val) | Arc tangent |
ATAN2 | atan2(y, x) | Two-argument arc tangent |
COS | cos(val) | Cosine |
SIN | sin(val) | Sine |
TAN | tan(val) | Tangent |
COT | cot(val) | Cotangent |
DEGREES | degrees(radians) | Convert radians to degrees |
RADIANS | radians(degrees) | Convert degrees to radians |
CONV | conv(val, from_base, to_base) | Convert between number bases |
CRC32 | crc32(val) | CRC-32 checksum |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eval duration_ms = round(durationInNanos / 1000000.0, 2) | where duration_ms > 0 | fields traceId, serviceName, duration_ms | sort - duration_ms | head 10"}'
Full-text search functions for relevance-based querying.
| Function | Syntax | Description |
|---|---|---|
MATCH | match(field, query) | Full-text match on a single field |
MATCH_PHRASE | match_phrase(field, phrase) | Exact phrase match |
MATCH_BOOL_PREFIX | match_bool_prefix(field, query) | Boolean prefix match |
MATCH_PHRASE_PREFIX | match_phrase_prefix(field, prefix) | Phrase prefix match |
MULTI_MATCH | multi_match([field1, field2], query) | Match across multiple fields |
QUERY_STRING | query_string([field1, field2], query) | Lucene query string syntax |
SIMPLE_QUERY_STRING | simple_query_string([field1, field2], query) | Simplified query string |
HIGHLIGHT | highlight(field) | Return highlighted matching fragments |
SCORE | score(relevance_func) | Return relevance score |
SCOREQUERY | scorequery(relevance_func) | Filter by relevance score |
MATCH_QUERY | match_query(field, query) | Alias for MATCH |
WILDCARD_QUERY | wildcard_query(field, pattern) | Wildcard pattern match (* and ?) |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=logs-otel-v1-* | where match(body, '\''timeout error'\'') | fields traceId, severityText, body | head 10"}'
Functions for computing statistical correlations and covariances.
| Function | Syntax | Description |
|---|---|---|
COVAR_POP | covar_pop(field1, field2) | Population covariance |
COVAR_SAMP | covar_samp(field1, field2) | Sample covariance |
Note:
corr()is not a recognized stats aggregation function in OpenSearch 3.x PPL. To approximate Pearson correlation, useevalwith manual calculation or compute covariance and standard deviations separately. Thecovar_sampandcovar_popfunctions are supported.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where `attributes.gen_ai.usage.input_tokens` > 0 | stats covar_samp(`attributes.gen_ai.usage.input_tokens`, durationInNanos) as token_duration_covar"}'
Functions for string manipulation.
| Function | Syntax | Description |
|---|---|---|
CONCAT | concat(str1, str2, ...) | Concatenate strings |
LENGTH | length(str) | String length in bytes |
LOWER | lower(str) | Convert to lowercase |
UPPER | upper(str) | Convert to uppercase |
TRIM | trim(str) | Remove leading/trailing whitespace |
LTRIM | ltrim(str) | Remove leading whitespace |
RTRIM | rtrim(str) | Remove trailing whitespace |
SUBSTRING | substring(str, start [, len]) | Extract substring |
LEFT | left(str, len) | Leftmost N characters |
RIGHT | right(str, len) | Rightmost N characters |
REPLACE | replace(str, from, to) | Replace occurrences |
REVERSE | reverse(str) | Reverse a string |
LOCATE | locate(substr, str [, pos]) | Position of substring |
POSITION | position(substr IN str) | Position of substring |
ASCII | ascii(str) | ASCII code of first character |
CHAR_LENGTH | char_length(str) | Character count |
CHARACTER_LENGTH | character_length(str) | Alias for CHAR_LENGTH |
OCTET_LENGTH | octet_length(str) | Byte count |
BIT_LENGTH | bit_length(str) | Bit count |
LPAD | lpad(str, len, pad) | Left-pad to length |
RPAD | rpad(str, len, pad) | Right-pad to length |
SPACE | space(n) | String of N spaces |
REPEAT | repeat(str, n) | Repeat string N times |
STRCMP | strcmp(str1, str2) | Compare strings (-1, 0, 1) |
SUBSTR | substr(str, start [, len]) | Alias for SUBSTRING |
MID | mid(str, start, len) | Alias for SUBSTRING |
FIELD | field(str, val1, val2, ...) | Index of str in value list |
FIND_IN_SET | find_in_set(str, strlist) | Position in comma-separated list |
FORMAT | format(val, decimals) | Format number with commas and decimals |
INSERT | insert(str, pos, len, newstr) | Insert string at position |
INSTR | instr(str, substr) | Position of first occurrence |
REGEXP | regexp(str, pattern) | Regex match (returns 1 or 0) |
REGEXP_EXTRACT | regexp_extract(str, pattern [, group]) | Extract regex capture group |
REGEXP_REPLACE | regexp_replace(str, pattern, replacement) | Replace regex matches |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=logs-otel-v1-* | eval body_lower = lower(body) | where body_lower like '\''%exception%'\'' | eval short_body = left(body, 200) | fields traceId, severityText, short_body | head 10"}'
| Function | Syntax | Description |
|---|---|---|
TYPEOF | typeof(field) | Returns the data type of a field value |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eval type_of_duration = typeof(durationInNanos) | fields traceId, durationInNanos, type_of_duration | head 5"}'
This PPL reference is sourced from the opensearch-project/sql repository's docs/user/ppl/ directory.
Repository: https://github.com/opensearch-project/sql
The PPL grammar is maintained as part of the OpenSearch SQL plugin. For the latest syntax additions and changes, consult the repository documentation directly.
npx claudepluginhub opensearch-project/observability-stack --plugin observabilityQueries OpenSearch logs using PPL for severity filtering, trace correlation, error patterns, and volume analysis in OTEL indices.
Guides Honeycomb queries on trace/event datasets: percentiles over AVG, HEATMAP distributions, relational fields (root.,any.,none.), calculated fields, query math, result interpretation (P99/P50, heatmaps). For latency, errors, outliers, slow requests.
Interact with Elasticsearch and Kibana via curl REST API for querying (Query DSL), indexing, CRUD, index management, mappings, aggregations, cluster health, ILM, ES|QL, dashboards, OpenTelemetry patterns, and troubleshooting.