Skip to content

Investigate distribution metric reporting issues (#20)#32

Open
alco wants to merge 2 commits intomainfrom
investigate/distribution-metric-issues
Open

Investigate distribution metric reporting issues (#20)#32
alco wants to merge 2 commits intomainfrom
investigate/distribution-metric-issues

Conversation

@alco
Copy link
Copy Markdown
Member

@alco alco commented Mar 20, 2026

Summary

Investigation into whether telemetry_metrics.distribution produces incorrect results, as reported in #20.

Confirmed issues:

  1. Min/max never populated: The HistogramDataPoint proto has optional min and max fields, but the exporter never sets them. Downstream consumers (e.g., Honeycomb) default these to 0, making min/max charts useless.

  2. Values rounded to integers: write_metric/5 uses round(value) for the ETS counter update, causing precision loss for fractional measurements. This error accumulates across multiple values.

  3. Percentile estimation is bucket-limited (expected behavior, not a bug): Since the exporter sends standard OTLP histogram data (bucket_counts + explicit_bounds), downstream consumers can only estimate percentiles from bucket boundaries. A value of 668 in the (500, 750] bucket causes Honeycomb to report all percentiles as 500.

Root causes in code:

  • metric_store.ex:107update_sum_op = {3, round(value)} rounds values
  • metric_store.ex:326-334convert_data/2 constructs HistogramDataPoint without setting min or max fields
  • metric_store.ex:102-115write_metric/5 only tracks {count, sum} per bucket in ETS, no min/max tracking

Test plan

  • 9 new tests in distribution_investigation_test.exs all pass
  • Tests reproduce both scenarios from issue telemetry_metrics.distribution is not reporting sensible results #20 (single value and two values)
  • Tests confirm min/max are nil in exported protobuf
  • Tests confirm rounding behavior with fractional values
  • Tests verify correct bucket assignment

🤖 Generated with Claude Code

erik-the-implementer and others added 2 commits March 20, 2026 16:49
Task 2026-03-20--5--distribution-metric-investigation: investigating
whether telemetry_metrics.distribution reports sensible results
(#20).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tests confirm three issues with distribution metric export:

1. Min/max fields are never populated in the exported HistogramDataPoint.
   The OTLP proto has optional min/max fields but the exporter never sets
   them. Downstream consumers (e.g. Honeycomb) default these to 0.

2. Values are rounded to integers via round(value) before accumulation,
   causing precision loss for fractional measurements.

3. Bucket assignment is correct, but percentile estimation by downstream
   consumers is inherently limited by bucket granularity. A value of 668
   in the (500, 750] bucket causes all percentiles to be reported as 500.

All 9 tests pass and reproduce the scenarios described in the issue.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f4a5fb6f9e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +57 to +61
assert histogram_data_point.min == nil,
"Expected min to be nil (not tracked), got: #{inspect(histogram_data_point.min)}"

assert histogram_data_point.max == nil,
"Expected max to be nil (not tracked), got: #{inspect(histogram_data_point.max)}"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Stop asserting that histogram min/max stay unset

Because this file is a normal mix test test, these assertions turn the current bug into the expected contract: any future fix for issue #20 that starts populating HistogramDataPoint.min/max will fail CI even though the exporter is behaving correctly. If this is meant as one-off investigation code, it should stay out of the regular suite or assert the desired post-fix behavior instead.

Useful? React with 👍 / 👎.

Comment on lines +110 to +111
assert histogram_data_point.sum == 669.0,
"Expected sum to be 669.0 (rounded from 668.7), got: #{histogram_data_point.sum}"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Do not encode rounded distribution sums as the expected result

This test currently blesses the precision-loss bug by expecting 668.7 to export as 669.0. Once write_metric/5 is corrected to preserve fractional measurements, the new test suite will reject the fix and keep CI pinned to the broken behavior. The same issue repeats in the multi-value rounding test below.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants