32 changes: 32 additions & 0 deletions config/logstash.yml
@@ -282,6 +282,38 @@
#
# path.dead_letter_queue:
#
# ------------ OpenTelemetry Metrics Settings --------------
# Export Logstash metrics to an OpenTelemetry-compatible backend via OTLP.
# This allows you to send Logstash metrics to any OTLP-compatible collector
# or observability platform (e.g., Jaeger, Prometheus via OTel Collector, Grafana Cloud).
Member @perk perk Apr 7, 2026

Suggested change
# or observability platform (e.g., Jaeger, Prometheus via OTel Collector, Grafana Cloud).
# or monitoring platform (e.g., Prometheus via OTLP receiver, Elastic, Grafana Cloud).

I propose we change "observability" to "monitoring", because we talk about metrics only here, and also remove Jaeger because, while it can technically ingest metrics, it is not a metrics backend.

Moreover, Prometheus now has a native OTLP receiver built in. :)
I also propose adding Elastic, to make sure our users know they can now send OTLP metrics directly to Elastic.

#
# Enable OpenTelemetry metrics export (default: false)
#
# otel.metrics.enabled: false
#
# OTLP endpoint URL. For gRPC, typically port 4317. For HTTP, typically port 4318.
#
# otel.metrics.endpoint: "http://localhost:4317"
#
# Export interval in seconds (default: 10)
#
# otel.metrics.interval: 10
#
# Protocol to use for OTLP export: "grpc" (default) or "http"
#
# otel.metrics.protocol: "grpc"
#
# Authorization header for authenticated OTLP endpoints.
# Examples: "ApiKey xxx" or "Bearer xxx"
#
# otel.metrics.authorization_header:
#
# Additional resource attributes as comma-separated key=value pairs.
# These are attached to all exported metrics to identify this Logstash instance.
# Example: "environment=production,cluster=us-west,team=platform"
#
# otel.resource.attributes:
#
# ------------ Debugging Settings --------------
#
# Options for log.level:
6 changes: 6 additions & 0 deletions docs/reference/logstash-settings-file.md
@@ -99,4 +99,10 @@
| `path.plugins` | Where to find custom plugins. You can specify this setting multiple times to include multiple paths. Plugins are expected to be in a specific directory hierarchy: `PATH/logstash/TYPE/NAME.rb` where `TYPE` is `inputs`, `filters`, `outputs`, or `codecs`, and `NAME` is the name of the plugin. | Platform-specific. See [Logstash Directory Layout](/reference/dir-layout.md). |
| `allow_superuser` | Setting to `true` to allow or `false` to block running Logstash as a superuser. | `false` |
| `pipeline.buffer.type` | Determines where to allocate memory buffers, for plugins that leverage them. Defaults to `heap`, but can be switched to `direct` to instruct Logstash to prefer allocating buffers in direct memory. Check out [Buffer Allocation types](/reference/jvm-settings.md#off-heap-buffers-allocation) for more info. | `heap` |
| `otel.metrics.enabled` | Enable or disable OpenTelemetry metrics export. See [Monitoring with OpenTelemetry](/reference/monitoring-with-opentelemetry.md). | `false` |

Check notice on line 102 in docs/reference/logstash-settings-file.md (GitHub Actions / build / vale):
Elastic.WordChoice: Consider using 'deactivate, deselect, hide, turn off' instead of 'disable', unless the term is in the UI.
| `otel.metrics.endpoint` | The OTLP endpoint URL for metrics export. For gRPC, typically port 4317. For HTTP, typically port 4318. | `http://localhost:4317` |
| `otel.metrics.interval` | Export interval in seconds. Controls how frequently metrics are sent to the OTLP endpoint. | `10` |
| `otel.metrics.protocol` | Protocol to use for OTLP export. Valid values are `grpc` or `http`. | `grpc` |
| `otel.metrics.authorization_header` | Authorization header for authenticated OTLP endpoints. Examples: `ApiKey xxx` or `Bearer xxx`. | *N/A* |
Member

It looks like both OtlpHttpMetricExporterBuilder and OtlpGrpcMetricExporterBuilder have a setTrustedCertificates method that takes a byte array with PEM-formatted contents; should we add otel.metrics.ssl.certificate_authority to allow a user to configure trust?

Only the OtlpGrpcMetricExporterBuilder has a way of providing self-identity; should we add otel.metrics.ssl.certificate and otel.metrics.ssl.key here, and warn if the protocol isn't grpc?
| `otel.resource.attributes` | Additional OpenTelemetry resource attributes as comma-separated key=value pairs. Example: `environment=production,cluster=us-west`. | *N/A* |

6 changes: 4 additions & 2 deletions docs/reference/monitoring-logstash.md
@@ -18,8 +18,10 @@

You can use monitoring APIs provided by Logstash to retrieve these metrics. These APIs are available by default without requiring any extra configuration.

Alternatively, you can [configure Elastic Stack monitoring features](monitoring-logstash-legacy.md) to send
data to a monitoring cluster.
Alternatively, you can:

* [Export metrics via OpenTelemetry](monitoring-with-opentelemetry.md) to send metrics to any OTLP-compatible backend, including Elastic Cloud's native OTLP endpoint.

Check warning on line 23 in docs/reference/monitoring-logstash.md (GitHub Actions / build / vale):
Elastic.Latinisms: Latin terms and abbreviations are a common source of confusion. Use 'using' instead of 'via'.
* [Configure Elastic Stack monitoring features](monitoring-logstash-legacy.md) to send data to a monitoring cluster.

## APIs for monitoring Logstash [monitoring]

212 changes: 212 additions & 0 deletions docs/reference/monitoring-with-opentelemetry.md
@@ -0,0 +1,212 @@
---
mapped_pages:
- https://www.elastic.co/guide/en/logstash/current/monitoring-with-opentelemetry.html
applies_to:
stack: preview
---

# Monitoring Logstash with OpenTelemetry

Logstash can export metrics to any OpenTelemetry Protocol (OTLP) compatible backend, enabling integration with observability platforms like Elastic, Prometheus, etc.

Check warning on line 10 in docs/reference/monitoring-with-opentelemetry.md (GitHub Actions / build / vale):
Elastic.Latinisms: Latin terms and abbreviations are a common source of confusion. Use 'and so on' instead of 'etc'.

## Overview

The OpenTelemetry metrics exporter sends Logstash runtime metrics directly via OTLP (OpenTelemetry Protocol). This provides a standardized way to collect and export metrics without requiring an intermediate collector, though you can also route metrics through an OpenTelemetry Collector if needed.

Check warning on line 14 in docs/reference/monitoring-with-opentelemetry.md (GitHub Actions / build / vale):
Elastic.Latinisms: Latin terms and abbreviations are a common source of confusion. Use 'using' instead of 'via'.

## Configuration

To enable OpenTelemetry metrics export, add the following settings to your `logstash.yml` file:

```yaml
otel.metrics.enabled: true
otel.metrics.endpoint: "http://localhost:4317"
otel.metrics.interval: 10
otel.metrics.protocol: "grpc"
```

### Settings

| Setting | Description | Default |
| --- | --- | --- |
| `otel.metrics.enabled` | Enable or disable OpenTelemetry metrics export. | `false` |

Check notice on line 31 in docs/reference/monitoring-with-opentelemetry.md (GitHub Actions / build / vale):
Elastic.WordChoice: Consider using 'deactivate, deselect, hide, turn off' instead of 'disable', unless the term is in the UI.
| `otel.metrics.endpoint` | The OTLP endpoint URL. For gRPC, typically port 4317. For HTTP, typically port 4318. | `http://localhost:4317` |
| `otel.metrics.interval` | Export interval in seconds. Controls how frequently metrics are sent to the endpoint. | `10` |
| `otel.metrics.protocol` | Protocol to use for OTLP export. Valid values are `grpc` or `http`. | `grpc` |
| `otel.metrics.authorization_header` | Authorization header for authenticated endpoints. Examples: `ApiKey xxx` or `Bearer xxx`. | *N/A* |
| `otel.resource.attributes` | Additional resource attributes as comma-separated key=value pairs. Example: `environment=production,cluster=us-west`. | *N/A* |
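The `otel.resource.attributes` value is a plain comma-separated list of `key=value` pairs. As an illustration only (this helper is hypothetical, not part of Logstash), the format can be parsed like this:

```ruby
# Hypothetical illustration of the otel.resource.attributes format;
# not actual Logstash code.
def parse_resource_attributes(value)
  return {} if value.nil? || value.strip.empty?
  value.split(",").each_with_object({}) do |pair, attrs|
    key, val = pair.split("=", 2)     # split on the first "=" only
    attrs[key.strip] = val.to_s.strip # tolerate stray whitespace
  end
end

parse_resource_attributes("environment=production,cluster=us-west")
# => {"environment"=>"production", "cluster"=>"us-west"}
```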

## Sending metrics to Elastic Cloud

To send metrics directly to Elastic Cloud's native OTLP endpoint:

1. Get your Elastic Cloud OTLP endpoint from your deployment's APM integration settings
2. Create an API key with appropriate permissions
3. Configure Logstash:

```yaml
otel.metrics.enabled: true
otel.metrics.endpoint: "https://your-deployment.apm.us-central1.gcp.cloud.es.io:443"
otel.metrics.protocol: "http"
otel.metrics.authorization_header: "ApiKey your-base64-encoded-api-key"
```
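The value after `ApiKey` is typically the `encoded` field returned by the Elasticsearch create-API-key API, which is the Base64 encoding of `id:api_key`. If you only have the two parts, you can encode them yourself (the id and key below are placeholders):

```ruby
require "base64"

# Placeholder credentials -- substitute the id and api_key values
# returned by the Elasticsearch create-API-key API.
id      = "myid"
api_key = "mykey"

encoded = Base64.strict_encode64("#{id}:#{api_key}")
puts "otel.metrics.authorization_header: \"ApiKey #{encoded}\""
# => otel.metrics.authorization_header: "ApiKey bXlpZDpteWtleQ=="
```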

## Sending metrics to an OpenTelemetry Collector

You can also send metrics to an OpenTelemetry Collector, which can then forward them to multiple backends:

```yaml
otel.metrics.enabled: true
otel.metrics.endpoint: "http://otel-collector:4317"
otel.metrics.protocol: "grpc"
```

Example OpenTelemetry Collector configuration to forward to Elasticsearch:

```yaml
receivers:
otlp:
protocols:
grpc:
endpoint: 0.0.0.0:4317
http:
endpoint: 0.0.0.0:4318

processors:
batch:

exporters:
elasticsearch:
endpoints: ["https://your-elasticsearch-host:9200"]
api_key: "your-api-key"
mapping:
mode: otel

service:
pipelines:
metrics:
receivers: [otlp]
processors: [batch]
exporters: [elasticsearch]
```

## Exported metrics

Logstash exports the following metrics via OpenTelemetry:

Check warning on line 94 in docs/reference/monitoring-with-opentelemetry.md (GitHub Actions / build / vale):
Elastic.Latinisms: Latin terms and abbreviations are a common source of confusion. Use 'using' instead of 'via'.

### Global metrics

| Metric name | Type | Unit | Description |
| --- | --- | --- | --- |
| `logstash.events.in` | Counter | `{event}` | Total events received across all pipelines |
| `logstash.events.out` | Counter | `{event}` | Total events output across all pipelines |
| `logstash.events.filtered` | Counter | `{event}` | Total events filtered across all pipelines |
| `logstash.queue.events` | Gauge | `{event}` | Total events currently in queues |

### Pipeline metrics

Pipeline metrics include a `pipeline.id` attribute to identify the pipeline.

| Metric name | Type | Unit | Description |
| --- | --- | --- | --- |
| `logstash.pipeline.events.in` | Counter | `{event}` | Events received by pipeline |
| `logstash.pipeline.events.out` | Counter | `{event}` | Events output by pipeline |
| `logstash.pipeline.events.filtered` | Counter | `{event}` | Events filtered by pipeline |
| `logstash.pipeline.queue.events` | Gauge | `{event}` | Events in pipeline queue |

### Persistent queue metrics

These metrics are available when using persistent queues (`queue.type: persisted`).

| Metric name | Type | Unit | Description |
| --- | --- | --- | --- |
| `logstash.pipeline.queue.capacity.page_capacity` | Gauge | `By` | Size of each queue page in bytes |
| `logstash.pipeline.queue.capacity.max_size` | Gauge | `By` | Maximum queue size limit in bytes |
| `logstash.pipeline.queue.capacity.max_unread_events` | Gauge | `{event}` | Maximum unread events allowed |
| `logstash.pipeline.queue.capacity.size` | Gauge | `By` | Current persisted queue size in bytes |
| `logstash.pipeline.queue.data.free_space` | Gauge | `By` | Free disk space where queue is stored |

### Dead letter queue metrics

| Metric name | Type | Unit | Description |
| --- | --- | --- | --- |
| `logstash.pipeline.dlq.queue_size` | Gauge | `By` | Current dead letter queue size in bytes |
| `logstash.pipeline.dlq.max_queue_size` | Gauge | `By` | Maximum DLQ size limit in bytes |
| `logstash.pipeline.dlq.dropped_events` | Gauge | `{event}` | Events dropped when DLQ is full |
| `logstash.pipeline.dlq.expired_events` | Gauge | `{event}` | Events expired and removed from DLQ |

### Plugin metrics

Plugin metrics include `pipeline.id`, `plugin.type`, and `plugin.id` attributes.

| Metric name | Type | Unit | Description |
| --- | --- | --- | --- |
| `logstash.plugin.events.in` | Counter | `{event}` | Events received by plugin |
| `logstash.plugin.events.out` | Counter | `{event}` | Events output by plugin |
| `logstash.plugin.events.duration` | Counter | `ms` | Time spent processing events |

### Cgroup metrics (Linux only)

These metrics are available when running on Linux with cgroups enabled (e.g., in containers).

Check warning on line 149 in docs/reference/monitoring-with-opentelemetry.md (GitHub Actions / build / vale):
Elastic.Latinisms: Latin terms and abbreviations are a common source of confusion. Use 'for example' instead of 'e.g'.

| Metric name | Type | Unit | Description |
| --- | --- | --- | --- |
| `logstash.os.cgroup.cpuacct.usage` | Counter | `ns` | Total CPU time consumed |
| `logstash.os.cgroup.cpu.cfs_period` | Gauge | `us` | CFS scheduling period |
| `logstash.os.cgroup.cpu.cfs_quota` | Gauge | `us` | CFS scheduling quota |
| `logstash.os.cgroup.cpu.stat.elapsed_periods` | Counter | `{period}` | Number of elapsed CFS periods |
| `logstash.os.cgroup.cpu.stat.nr_times_throttled` | Counter | `{occurrence}` | Number of times throttled |
| `logstash.os.cgroup.cpu.stat.time_throttled` | Counter | `ns` | Total time throttled |

## Resource attributes

The following resource attributes are automatically added to all metrics:

| Attribute | Description |
| --- | --- |
| `service.name` | Always set to `logstash` |
| `service.instance.id` | The Logstash node ID |
| `service.version` | The Logstash version |
| `host.name` | The configured node name |

Additional resource attributes can be added using the `otel.resource.attributes` setting.
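For example, in `logstash.yml` (values are illustrative):

```yaml
otel.resource.attributes: "environment=production,cluster=us-west,team=platform"
```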

## Viewing metrics in Kibana

When sending metrics to Elastic Cloud via the native OTLP endpoint, metrics are stored in APM data streams (`.ds-metrics-apm.app.logstash-*`). You can view them in:

Check warning on line 175 in docs/reference/monitoring-with-opentelemetry.md (GitHub Actions / build / vale):
Elastic.Latinisms: Latin terms and abbreviations are a common source of confusion. Use 'using' instead of 'via'.

1. **Observability > APM > Services** - Find your Logstash service
2. **Observability > Metrics Explorer** - Query metrics directly
3. **Discover** - Search the `metrics-apm.app.logstash-*` data view

When using an OpenTelemetry Collector with the Elasticsearch exporter, create a data view matching your configured index pattern (e.g., `metrics-otel-*`).

Check warning on line 181 in docs/reference/monitoring-with-opentelemetry.md (GitHub Actions / build / vale):
Elastic.Latinisms: Latin terms and abbreviations are a common source of confusion. Use 'for example' instead of 'e.g'.

## Troubleshooting

### Enable debug logging

To see detailed OpenTelemetry SDK logs, add the following to `config/log4j2.properties`:

```properties
logger.otel.name = io.opentelemetry
logger.otel.level = debug
```

### Common issues

**Connection refused errors**

Verify the endpoint is accessible:
- For gRPC (default): Port 4317
- For HTTP: Port 4318 with `/v1/metrics` path automatically appended
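As a sketch of the behavior described above (hypothetical helper, not Logstash code): for the `http` protocol the exporter appends `/v1/metrics` to the configured endpoint, while for `grpc` the endpoint is used as-is:

```ruby
# Hypothetical sketch of how the effective metrics URL is derived
# from otel.metrics.endpoint and otel.metrics.protocol.
def effective_metrics_url(endpoint, protocol)
  return endpoint unless protocol == "http"
  endpoint.chomp("/") + "/v1/metrics" # avoid a double slash
end

effective_metrics_url("http://localhost:4318", "http")
# => "http://localhost:4318/v1/metrics"
effective_metrics_url("http://localhost:4317", "grpc")
# => "http://localhost:4317"
```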

**Authentication errors**

Ensure the `otel.metrics.authorization_header` is correctly formatted:
- For API keys: `ApiKey base64-encoded-key`
- For Bearer tokens: `Bearer your-token`

**Metrics not appearing**

- Check that `otel.metrics.enabled` is set to `true`
- Verify the export interval hasn't been set too high
- Check Logstash logs for export errors
1 change: 1 addition & 0 deletions docs/reference/toc.yml
@@ -108,6 +108,7 @@ toc:
- file: logstash-pipeline-viewer.md
- file: monitoring-troubleshooting.md
- file: monitoring-logstash.md
- file: monitoring-with-opentelemetry.md
- file: working-with-plugins.md
children:
- file: plugin-concepts.md
8 changes: 8 additions & 0 deletions logstash-core/build.gradle
@@ -259,4 +259,12 @@ dependencies {
api group: 'org.apache.httpcomponents', name: 'httpclient', version: '4.5.14'
api group: 'commons-codec', name: 'commons-codec', version: '1.17.0'
api group: 'org.apache.httpcomponents', name: 'httpcore', version: '4.4.16'

// OpenTelemetry SDK for metrics export
def otelVersion = '1.59.0'
implementation platform("io.opentelemetry:opentelemetry-bom:${otelVersion}")
implementation 'io.opentelemetry:opentelemetry-api'
implementation 'io.opentelemetry:opentelemetry-sdk'
implementation 'io.opentelemetry:opentelemetry-sdk-metrics'
implementation 'io.opentelemetry:opentelemetry-exporter-otlp'
}
4 changes: 2 additions & 2 deletions logstash-core/lib/logstash/agent.rb
@@ -49,7 +49,7 @@ class LogStash::Agent
include LogStash::Util::Loggable
STARTED_AT = Time.now.freeze

attr_reader :metric, :name, :settings, :dispatcher, :ephemeral_id, :pipeline_bus
attr_reader :metric, :name, :settings, :dispatcher, :ephemeral_id, :pipeline_bus, :pipelines_registry
Contributor

If we use Agent#running_pipelines, exposing the registry is not necessary.

attr_accessor :logger

attr_reader :health_observer
@@ -518,7 +518,7 @@ def configure_metrics_collectors
LogStash::Instrument::NullMetric.new(@collector)
end

@periodic_pollers = LogStash::Instrument::PeriodicPollers.new(@metric, settings.get("queue.type"), self)
@periodic_pollers = LogStash::Instrument::PeriodicPollers.new(@metric, settings, self)
@periodic_pollers.start
end

8 changes: 7 additions & 1 deletion logstash-core/lib/logstash/environment.rb
@@ -110,7 +110,13 @@ def self.as_java_range(r)
Setting::StringSetting.new("keystore.classname", "org.logstash.secret.store.backend.JavaKeyStore"),
Setting::StringSetting.new("keystore.file", ::File.join(::File.join(LogStash::Environment::LOGSTASH_HOME, "config"), "logstash.keystore"), false), # will be populated on
Setting::NullableStringSetting.new("monitoring.cluster_uuid"),
Setting::StringSetting.new("pipeline.buffer.type", "heap", true, ["direct", "heap"])
Setting::StringSetting.new("pipeline.buffer.type", "heap", true, ["direct", "heap"]),
Setting::BooleanSetting.new("otel.metrics.enabled", false),
Setting::StringSetting.new("otel.metrics.endpoint", "http://localhost:4317"),
Member

Can we infer the port if it isn't specified here, using the value of otel.metrics.protocol to provide a reasonable default?

Can we validate the shape of this early, since we know that downstream accepts only http or https?

Why doesn't this include the /v1/metrics part that is hard-coded for the http protocol?

Setting::NumericSetting.new("otel.metrics.interval", 10), # seconds
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have Setting::TimeValueSetting -- should we use this (with a minimum value to ensure users don't do something silly like having an interval of 1ns).

Setting::StringSetting.new("otel.metrics.protocol", "grpc", true, ["grpc", "http"]),
Setting::NullableStringSetting.new("otel.metrics.authorization_header", nil, false), # e.g., "ApiKey xxx" or "Bearer xxx"
Member

Should this be a password type, to prevent the value from leaking into logs and similar outputs?

Setting::NullableStringSetting.new("otel.resource.attributes", nil, false) # key=value,key2=value2 format
# post_process
].each {|setting| SETTINGS.register(setting) }

29 changes: 15 additions & 14 deletions logstash-core/lib/logstash/instrument/periodic_poller/os.rb
@@ -25,25 +25,26 @@ def initialize(metric, options = {})
end

def collect
collect_cgroup
self.class.collect_cgroup(metric)
end

def collect_cgroup
if stats = Cgroup.get
save_metric([:os], :cgroup, stats)
class << self
def collect_cgroup(metric)
if stats = Cgroup.get
save_metric(metric, [:os], :cgroup, stats)
end
end
end

# Recursive function to create the Cgroups values from the created hash
def save_metric(namespace, k, v)
if v.is_a?(Hash)
v.each do |new_key, new_value|
n = namespace.dup
n << k.to_sym
save_metric(n, new_key, new_value)
def save_metric(metric, namespace, k, v)
if v.is_a?(Hash)
v.each do |new_key, new_value|
n = namespace.dup
n << k.to_sym
save_metric(metric, n, new_key, new_value)
end
else
metric.gauge(namespace, k.to_sym, v)
end
else
metric.gauge(namespace, k.to_sym, v)
end
end
end