Skip to content

OpenTelemetry: metrics for WebFinger and actor discovery#772

Merged
dahlia merged 16 commits into
fedify-dev:mainfrom
dahlia:otel/webfinger-actor-discovery
May 20, 2026
Merged

OpenTelemetry: metrics for WebFinger and actor discovery#772
dahlia merged 16 commits into
fedify-dev:mainfrom
dahlia:otel/webfinger-actor-discovery

Conversation

@dahlia
Copy link
Copy Markdown
Member

@dahlia dahlia commented May 20, 2026

Closes #739. Part of #316 (umbrella metrics issue) and builds on the meter-provider plumbing landed via #619.

What

Three new counter and histogram pairs covering the WebFinger and actor handle discovery paths:

Metric pair Covers
webfinger.lookup and webfinger.lookup.duration outgoing lookupWebFinger()
webfinger.handle and webfinger.handle.duration inbound WebFinger requests handled by Federation.fetch()
activitypub.actor.discovery and activitypub.actor.discovery.duration getActorHandle() actor handle discovery

Each family carries a bounded result attribute so dashboards can slice discovery failures by terminal outcome: found/not_found/invalid/network_error/error for outgoing lookups; resolved/invalid/not_found/tombstoned/error for incoming requests; resolved/not_found/error for actor discovery. webfinger.resource.scheme is bucketed to acct, http, https, mailto, or other so an attacker-controlled query string cannot inflate metric cardinality. activitypub.remote.host records the URL hostname only. Full resource URIs, lookup URLs, and handle strings stay on the corresponding spans for trace-level investigation.

lookupWebFinger() and getActorHandle() follow the opt-in lookupObject() pattern: omitting meterProvider emits no measurement. Passing meterProvider to createFederation() wires up webfinger.handle and the federation-bound Context.lookupWebFinger() automatically. Direct getActorHandle() calls remain opt-in through GetActorHandleOptions.meterProvider, and the option is forwarded into the nested WebFinger lookups so a single discovery emits both the discovery measurement and the underlying webfinger.lookup measurement(s). lookupObject() now also forwards meterProvider into its WebFinger fallback so the same nested coverage applies to handle-driven object lookups.

Why

Discovery failures are hard to diagnose from user reports alone. A server can appear broken when WebFinger lookups are slow, blocked by a remote firewall, or returning malformed resource descriptors. Spans capture individual lookups well, but operators need aggregate counters and histograms to graph reliability and latency without sampling.

Notable design decisions

getActorHandle() previously threw a plain TypeError from two places: the explicit "actor lacks enough information" sentinel, and from malformed remote data flowing through new URL(...) and normalizeActorHandle(). Classifying every TypeError as not_found would mismark malformed-data failures.

The sentinel is now a private ActorHandleNotFoundError extends TypeError subclass. It keeps the documented throws {TypeError} API contract intact while letting the metric distinguish "real not found" from "other failure" precisely.

The outgoing lookupWebFingerInternal was refactored to return a typed { resource, result, statusCode?, remoteHost? } outcome rather than just ResourceDescriptor | null. The refactor lets each failure branch return the right bounded result, and a try/catch around the Location header URL constructor preserves result=invalid plus http.response.status_code even when a redirect target is unparseable.

Both webfinger.handle and webfinger.lookup bucket the scheme attribute through the same allow list. The corresponding spans keep the raw scheme value for trace-level investigation; only the metric attribute is bucketed.

Test plan

  • packages/webfinger/src/lookup.test.ts: 13 new steps covering each webfinger.lookup.result bucket, redirect-path classifications, scheme bucketing, and meterProvider opt-in behavior.
  • packages/vocab/src/actor.test.ts: 6 new steps covering each activitypub.actor.discovery.result bucket, nested webfinger.lookup propagation, the malformed-alias regression, and meterProvider opt-in behavior.
  • packages/vocab/src/lookup.test.ts: 1 new step asserting lookupObject() forwards meterProvider into its WebFinger fallback.
  • packages/fedify/src/federation/metrics.test.ts: 3 new unit tests for recordWebFingerHandle().
  • packages/fedify/src/federation/webfinger.test.ts: 5 new steps covering each webfinger.handle.result bucket, scheme bucketing, and meterProvider opt-in behavior.
  • mise run check passes.
  • mise run test:deno and mise run test:node pass.
  • mise run docs:build passes.

Docs and changelog

docs/manual/opentelemetry.md gains rows in the metrics table, per-metric attribute subsections (mirroring the existing activitypub.key.lookup/activitypub.object.lookup style), and three new rows in the semantic attributes table. CHANGES.md gets one new entry under the in-progress Version 2.3.0 @fedify/fedify section.

@dahlia dahlia added this to the Fedify 2.3 milestone May 20, 2026
@dahlia dahlia self-assigned this May 20, 2026
@dahlia dahlia added component/federation Federation object related component/actor Actor dispatcher related component/webfinger WebFinger related component/otel OpenTelemetry integration labels May 20, 2026
@dahlia
Copy link
Copy Markdown
Member Author

dahlia commented May 20, 2026

@codex review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 20, 2026

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR adds opt-in OpenTelemetry counters and duration histograms for outgoing WebFinger lookups, incoming WebFinger handling, and actor discovery; it threads a MeterProvider through call paths, introduces bounded result/scheme enums, per-MeterProvider instrument caching, recording helpers, tests, and docs/changelog updates.

Changes

OpenTelemetry Metrics for WebFinger and Actor Discovery

Layer / File(s) Summary
Docs and changelog updates
CHANGES.md, docs/manual/opentelemetry.md
Adds changelog/manual entries documenting new webfinger.lookup, webfinger.handle, and activitypub.actor.discovery metrics, their result enums, scheme bucketing, and semantic attributes.
WebFinger lookup metrics (outgoing)
packages/webfinger/src/lookup.ts, packages/webfinger/src/lookup.test.ts
Refactors outgoing lookup to return structured WebFingerLookupOutcome, adds WebFingerLookupResult, scheme bucketing, per-MeterProvider cached counter + histogram, and records webfinger.lookup/webfinger.lookup.duration. Tests cover success, errors, redirects, invalid inputs, and omission when meterProvider is absent.
Federation metrics infrastructure (incoming)
packages/fedify/src/federation/metrics.ts, packages/fedify/src/federation/metrics.test.ts
Adds WebFingerHandleResult and WebFingerResourceScheme types, WebFingerHandleAttributes, new webfinger.handle counter and webfinger.handle.duration histogram to FederationMetrics, recordWebFingerHandle implementation, and unit tests for recording behavior.
Federation middleware meterProvider wiring
packages/fedify/src/federation/middleware.ts
Propagates the federation instance meterProvider into handleWebFinger and lookupWebFinger call paths.
WebFinger handle metrics integration
packages/fedify/src/federation/webfinger.ts, packages/fedify/src/federation/webfinger.test.ts
handleWebFinger() measures duration, buckets resource scheme, classifies outcome by response status, optionally wraps in OTEL span, and records webfinger.handle metrics in finally; tests validate outcomes, scheme bucketing, and suppression without meterProvider.
Actor discovery metrics
packages/vocab/src/actor.ts, packages/vocab/src/actor.test.ts
Adds ActorDiscoveryResult, ActorHandleNotFoundError, discovery counter + duration histogram cached per MeterProvider, threads meterProvider through discovery/WebFinger calls, records activitypub.actor.discovery metrics, and tests discovery+propagation behaviors.
Object lookup meterProvider propagation
packages/vocab/src/lookup.ts, packages/vocab/src/lookup.test.ts
Forwards meterProvider through lookupObjectInternal so nested WebFinger lookups can emit metrics; test asserts webfinger.lookup measurement and attributes.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • fedify-dev/fedify#755: Shares meterProvider plumbing and related federation middleware changes used by this PR.
  • fedify-dev/fedify#771: Related propagation of MeterProvider into lookupObject/lookupWebFinger paths and complementary lookup instrumentation.
  • fedify-dev/fedify#680: Related handling of tombstone→410 behavior used by the incoming WebFinger metrics.

Suggested labels

type/enhancement

Suggested reviewers

  • sij411
  • 2chanhaeng
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 23.81% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'OpenTelemetry: metrics for WebFinger and actor discovery' clearly and concisely summarizes the main change in the changeset.
Description check ✅ Passed The description is comprehensive and directly related to the changeset, documenting the three metric pairs, design decisions, and test coverage.
Linked Issues check ✅ Passed The PR comprehensively addresses all coding requirements from #739: instruments WebFinger lookups, incoming WebFinger handling, and actor discovery with bounded result attributes, buckets resource schemes and records host-only attributes, includes http.response.status_code when available, covers success and failure paths with extensive tests, and updates documentation.
Out of Scope Changes check ✅ Passed All changes are directly in scope: metric implementations for WebFinger lookup/handle/actor discovery, documentation updates for opentelemetry.md and CHANGES.md, and comprehensive test coverage across all affected modules.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Another round soon, please!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/webfinger/src/lookup.ts (1)

274-283: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Reject malformed acct: authorities before building the lookup URL.

acct: parsing currently accepts authorities that still contain path/query/fragment characters, which can produce unintended outbound paths and classify malformed input as not_found/error instead of invalid.

Suggested fix
   if (resource.protocol === "acct:") {
     const atPos = resource.pathname.lastIndexOf("@");
     if (atPos < 0) return { resource: null, result: "invalid" };
     server = resource.pathname.substring(atPos + 1);
-    if (server === "") return { resource: null, result: "invalid" };
+    if (server === "" || /[/?#\s]/.test(server)) {
+      return { resource: null, result: "invalid" };
+    }
   } else {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/webfinger/src/lookup.ts` around lines 274 - 283, When parsing acct:
authorities (the atPos/server extraction using resource.pathname.lastIndexOf and
server = resource.pathname.substring(atPos + 1)), validate the extracted server
string does not contain any path/query/fragment characters (e.g. '/', '?', '#')
and is non-empty; if it contains any of those characters return { resource:
null, result: "invalid" } instead of proceeding to build the lookup URL. Apply
this check before constructing the URL (the new
URL(`${protocol}//${server}/.well-known/webfinger`) and setting
url.searchParams) so malformed acct: inputs are rejected as invalid.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/fedify/src/federation/metrics.ts`:
- Around line 376-387: WebFingerHandleAttributes currently exposes scheme as an
unconstrained string and recordWebFingerHandle forwards it directly; change the
metric API boundary to enforce a bounded set of schemes (e.g., a string union or
enum like "http" | "https" | "acct" | "mailto" | "other") and update
recordWebFingerHandle to validate/sanitize webfinger.resource.scheme against
that set, mapping unknown values to a single fallback bucket (e.g., "other" or
"unknown") before recording the metric to preserve bounded cardinality; apply
the same validation/sanitization wherever WebFingerHandleAttributes.scheme is
forwarded.

---

Outside diff comments:
In `@packages/webfinger/src/lookup.ts`:
- Around line 274-283: When parsing acct: authorities (the atPos/server
extraction using resource.pathname.lastIndexOf and server =
resource.pathname.substring(atPos + 1)), validate the extracted server string
does not contain any path/query/fragment characters (e.g. '/', '?', '#') and is
non-empty; if it contains any of those characters return { resource: null,
result: "invalid" } instead of proceeding to build the lookup URL. Apply this
check before constructing the URL (the new
URL(`${protocol}//${server}/.well-known/webfinger`) and setting
url.searchParams) so malformed acct: inputs are rejected as invalid.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: c716e5e8-305e-4c77-9a82-508064d8ac2a

📥 Commits

Reviewing files that changed from the base of the PR and between 553b59b and 3ca1413.

📒 Files selected for processing (13)
  • CHANGES.md
  • docs/manual/opentelemetry.md
  • packages/fedify/src/federation/metrics.test.ts
  • packages/fedify/src/federation/metrics.ts
  • packages/fedify/src/federation/middleware.ts
  • packages/fedify/src/federation/webfinger.test.ts
  • packages/fedify/src/federation/webfinger.ts
  • packages/vocab/src/actor.test.ts
  • packages/vocab/src/actor.ts
  • packages/vocab/src/lookup.test.ts
  • packages/vocab/src/lookup.ts
  • packages/webfinger/src/lookup.test.ts
  • packages/webfinger/src/lookup.ts

Comment thread packages/fedify/src/federation/metrics.ts
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces OpenTelemetry metrics for WebFinger and actor handle discovery paths. It adds counters and histograms for outgoing WebFinger lookups (webfinger.lookup), incoming WebFinger requests (webfinger.handle), and actor handle discovery attempts (activitypub.actor.discovery). The implementation includes bucketing of resource schemes to maintain bounded cardinality and introduces specific result taxonomies for each metric family. Documentation and comprehensive unit tests for these new metrics have also been added. I have no feedback to provide.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 20, 2026

Codecov Report

❌ Patch coverage is 95.85492% with 16 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
packages/fedify/src/federation/webfinger.ts 86.36% 8 Missing and 4 partials ⚠️
packages/vocab/src/actor.ts 97.72% 0 Missing and 2 partials ⚠️
packages/webfinger/src/lookup.ts 98.72% 0 Missing and 2 partials ⚠️
Files with missing lines Coverage Δ
packages/fedify/src/federation/metrics.ts 98.97% <100.00%> (+0.08%) ⬆️
packages/fedify/src/federation/middleware.ts 96.26% <100.00%> (+<0.01%) ⬆️
packages/vocab/src/lookup.ts 96.39% <100.00%> (+0.01%) ⬆️
packages/vocab/src/actor.ts 94.57% <97.72%> (+1.81%) ⬆️
packages/webfinger/src/lookup.ts 94.53% <98.72%> (+4.15%) ⬆️
packages/fedify/src/federation/webfinger.ts 91.11% <86.36%> (+0.96%) ⬆️

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

dahlia added a commit to dahlia/fedify-fork that referenced this pull request May 20, 2026
WebFingerHandleAttributes.scheme was an unconstrained string.  The
only producer (computeResourceScheme() in webfinger.ts) already
buckets the scheme to a 5-value allow list, so the runtime metric
was cardinality-safe today.  But every sibling helper in metrics.ts
(recordKeyLookup, recordSignatureVerificationDuration, and so on)
constrains its bounded metric attributes via literal-union types,
and `scheme: string` was the only outlier.  A future caller could
have regressed the cardinality guarantee without the compiler
flagging it.

Adds an exported WebFingerResourceScheme literal union of "acct" |
"http" | "https" | "mailto" | "other", narrows
WebFingerHandleAttributes.scheme to it, narrows
computeResourceScheme()'s return type, and uses a type predicate
(isAllowedResourceScheme) so TypeScript infers the bounded type
without runtime casts.  No runtime re-sanitization inside
recordWebFingerHandle(); the type signature matches the contract
the other Record* helpers in this file already use.

fedify-dev#772 (comment)

Assisted-by: Claude Code:claude-opus-4-7
@dahlia
Copy link
Copy Markdown
Member Author

dahlia commented May 20, 2026

/gemini review

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/webfinger/src/lookup.ts`:
- Around line 277-282: The current validation only checks the extracted server
string (variable server) for path/query/fragment chars but misses query/hash on
the parsed resource; update the guard in the lookup routine that returns {
resource: null, result: "invalid" } to also check resource.search and
resource.hash (or equivalent parsed URL properties) and reject when either is
non-empty, and add regression tests alongside the existing path-component test
covering acct:user@host?x=1 and acct:user@host#frag to assert they are rejected.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: da968b05-627c-4cda-91c4-37c069d32c69

📥 Commits

Reviewing files that changed from the base of the PR and between 3ca1413 and 1916e70.

📒 Files selected for processing (4)
  • packages/fedify/src/federation/metrics.ts
  • packages/fedify/src/federation/webfinger.ts
  • packages/webfinger/src/lookup.test.ts
  • packages/webfinger/src/lookup.ts

Comment thread packages/webfinger/src/lookup.ts Outdated
@dahlia
Copy link
Copy Markdown
Member Author

dahlia commented May 20, 2026

@codex review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements OpenTelemetry metrics for WebFinger and actor handle discovery, adding counters and histograms for webfinger.lookup, webfinger.handle, and activitypub.actor.discovery. The implementation ensures metric cardinality safety by bucketing resource schemes and provides detailed outcome classifications. Additionally, it strengthens acct: URI validation to comply with RFC 7565 by rejecting authorities containing path, query, or fragment components. I have no feedback to provide.

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Keep it up!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

dahlia added a commit to dahlia/fedify-fork that referenced this pull request May 20, 2026
An earlier commit on this branch added a regex check over the
substring extracted from `pathname` to reject `acct:user@host/path`,
but missed query and fragment cases.  The WHATWG URL parser routes
`?…` into `resource.search` and `#…` into `resource.hash`, not into
`pathname`, so `acct:user@host?x=1` and `acct:user@host#frag` still
passed the guard and would forward to a remote WebFinger lookup
despite being RFC 7565-malformed (the acct: authority is bare host:
no path, query, or fragment allowed).

The fix gates on `resource.search !== ""` and `resource.hash !== ""`
in addition to the existing extracted-server regex.  The three
RFC-disallowed components have to be checked independently because
the URL parser lands them in different URL fields; the comment above
the guard now spells this out.  Adds two regression cases in the
existing "invalid resource" step.

fedify-dev#772 (comment)

Assisted-by: Claude Code:claude-opus-4-7
@dahlia
Copy link
Copy Markdown
Member Author

dahlia commented May 20, 2026

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces comprehensive OpenTelemetry metrics for WebFinger and actor handle discovery paths. It adds new counters and histograms for outgoing WebFinger lookups, incoming WebFinger requests, and actor handle discovery attempts, featuring detailed result classifications and cardinality-safe resource scheme bucketing. The implementation includes updates to the core federation logic, vocabulary handling, and the WebFinger client, including the introduction of a specific ActorHandleNotFoundError for better error classification. These changes are supported by updated documentation and extensive unit tests. I have no feedback to provide.

@dahlia
Copy link
Copy Markdown
Member Author

dahlia commented May 20, 2026

@codex review

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. 👍

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

dahlia added 6 commits May 20, 2026 15:57
Adds `meterProvider` option to `lookupWebFinger()` so each call can
emit one `webfinger.lookup` counter and one `webfinger.lookup.duration`
histogram measurement with bounded attributes.

The motivation is operational visibility: discovery failures are
operationally noisy, and operators today have to grep span events to
see aggregate WebFinger reliability and latency.  A counter plus
histogram pair lets operators graph rate, latency, and outcome mix
without sampling spans.

The new `webfinger.lookup.result` attribute is one of `found`,
`not_found`, `invalid`, `network_error`, or `error`, derived from a
typed outcome returned by the refactored internal helper.  `404` and
`410` map to `not_found`; other non-2xx HTTP responses map to `error`;
JSON parse failures, max-redirection overruns, cross-protocol
redirects, malformed Location headers, and malformed `acct:` resources
map to `invalid`; `validatePublicUrl()` rejection or a thrown `fetch()`
maps to `network_error`.  `webfinger.resource.scheme` is always
present; `http.response.status_code` is recorded only when an HTTP
response was observed; `activitypub.remote.host` reflects the latest
hostname Fedify attempted, so an operator can see who actually
returned the failure even after one or more redirects.

The helper is opt-in: omitting the option emits no measurement.  This
mirrors `lookupObject()`'s pattern (packages/vocab/src/lookup.ts), so
callers that do not use OpenTelemetry are unaffected.  Wiring the
federation's `MeterProvider` through to inner `lookupWebFinger` calls
lands in later commits on this branch.

fedify-dev#316
fedify-dev#619
fedify-dev#739

Assisted-by: Claude Code:claude-opus-4-7
Assisted-by: Codex:gpt-5.1-codex
Adds `meterProvider` option to `getActorHandle()` so each actor handle
discovery emits one `activitypub.actor.discovery` counter and one
`activitypub.actor.discovery.duration` histogram measurement.

The motivation matches the umbrella WebFinger metrics work: a server
appears broken to users when handle resolution is slow, blocked, or
returns malformed alias data, but operators today have no aggregate
view of the actor-handle discovery path beyond per-call spans.  The
new metric covers the actor → handle resolution `getActorHandle()`
performs.  Handle/URL → object resolution is already covered by
`activitypub.object.lookup` on `lookupObject()`.

The `activitypub.actor.discovery.result` attribute is one of
`resolved`, `not_found`, or `error`.  WebFinger-level failure detail
(HTTP status, parse failure, network failure, etc.) stays on
`webfinger.lookup`; the discovery counter intentionally collapses
those into a coarser actor-level view.  The "Actor does not have
enough information…" `TypeError` is thrown via a private
`ActorHandleNotFoundError` subclass so the metric distinguishes the
real not-found terminal path from other `TypeError`-shaped failures
that come from malformed remote data (an alias that does not parse
as a URL, an invalid preferred username that breaks
`normalizeActorHandle()`, etc.); those classify as `error`.  The
subclass extends `TypeError` so the documented `throws {TypeError}`
contract is preserved.

`activitypub.remote.host` records `actor.id.hostname` when known,
omitted otherwise.  Actor IDs and handle strings are deliberately
excluded so attacker-controlled actor data cannot inflate metric
cardinality.

The helper is opt-in: omitting `meterProvider` emits no measurement,
mirroring the `lookupObject()` and `lookupWebFinger()` patterns.
When the meter provider is set, it is also forwarded to the nested
`lookupWebFinger` and `verifyCrossOriginActorHandle` calls so each
discovery emits both the actor-discovery measurements and the one
or two underlying `webfinger.lookup` measurements; the shared
`activitypub.remote.host` attribute lets operators correlate them.

Wiring the federation's `MeterProvider` through to context-level
`getActorHandle` callers lands in later commits on this branch.

fedify-dev#316
fedify-dev#619
fedify-dev#739

Assisted-by: Claude Code:claude-opus-4-7
Assisted-by: Codex:gpt-5.1-codex
Adds the `webfinger.handle` counter and `webfinger.handle.duration`
histogram for incoming WebFinger requests handled by Fedify, plus the
WebFinger-resource-scheme bucketing that keeps both the inbound and
the existing outbound `webfinger.lookup` metric cardinality bounded.

The new metric carries `webfinger.handle.result` (one of `resolved`,
`invalid`, `not_found`, `tombstoned`, or `error`), `webfinger.resource.scheme`
(when extractable), and `http.response.status_code` (when a response
was produced).  The result is derived from the response status code
that `handleWebFinger()` returns: `200 → resolved`, `400 → invalid`,
`404 → not_found`, `410 → tombstoned`, every other status (or a
thrown exception) → `error`.  The queried resource string is
deliberately excluded from metric attributes; it remains on the
existing `webfinger.handle` span for trace-level investigation.

Because the `resource=` query parameter is attacker-controlled, the
`webfinger.resource.scheme` attribute is bucketed to a small allow
list (`acct`, `http`, `https`, `mailto` from RFC 7565), and anything
else is recorded as `other`.  The same bucketing is applied to the
outbound `webfinger.lookup` metric attribute at the recording site,
keeping cardinality stable when a remote returns a redirect to an
unusual scheme; the `webfinger.resource.scheme` span attribute is
unchanged so traces still carry the raw value for investigation.

`Federation.fetch()` now passes the federation's `MeterProvider` to
`handleWebFinger()`, and `Context.lookupWebFinger()` defaults the
new `meterProvider` option from the federation's provider, mirroring
the `lookupObject()` plumbing.  Applications that pass a
`meterProvider` to `createFederation()` therefore get both the inbound
`webfinger.handle*` and outbound `webfinger.lookup*` metrics
automatically.

fedify-dev#316
fedify-dev#619
fedify-dev#739

Assisted-by: Claude Code:claude-opus-4-7
Assisted-by: Codex:gpt-5.1-codex
Adds the three new counter and histogram pairs from this branch to
the operator-facing OpenTelemetry documentation and to the in-progress
Fedify 2.3.0 changelog entry.

docs/manual/opentelemetry.md:

- Three new rows in the instrumented metrics table for
  `activitypub.actor.discovery[.duration]`,
  `webfinger.lookup[.duration]`, and `webfinger.handle[.duration]`.
- Three new metric attribute subsections describing the bounded
  result enumerations, the `webfinger.resource.scheme` allow list
  (and the fact that this bucketing applies to the metric attribute
  only, not the span attribute), the cardinality exclusions
  (resource strings, full lookup URLs, full handles, actor IDs),
  and the cross-origin caveat for nested WebFinger lookups during
  actor discovery.
- Three new rows in the semantic attributes table for the
  `*.result` attributes plus an updated `webfinger.resource.scheme`
  row noting the metric-side bucketing.

CHANGES.md:

- One new bullet under the Version 2.3.0 @fedify/fedify section
  summarising the six new instruments, the bounded attribute set,
  and the opt-in semantics.  The note draws the right line between
  federation-wired plumbing (which now covers the inbound
  `webfinger.handle` family and the federation-bound
  `Context.lookupWebFinger()`) and direct `getActorHandle()` calls,
  which remain opt-in via `GetActorHandleOptions.meterProvider`.
  The nested-WebFinger forwarding note flags the cross-origin
  verification case so operators do not expect a single shared
  remote host per discovery.

fedify-dev#316
fedify-dev#739

Assisted-by: Claude Code:claude-opus-4-7
Assisted-by: Codex:gpt-5.1-codex
When `lookupObject()` cannot resolve a handle through the document
loader, it falls back to `lookupWebFinger()` to find the actor URL.
This nested call previously dropped the caller's `meterProvider`
option, so an application that passed a meter provider to
`lookupObject()` (or via `Context.lookupObject()`) recorded an
`activitypub.object.lookup` measurement but no `webfinger.lookup`
measurement for the underlying probe.

Forwards the option into the nested call and adds a regression test
asserting that `lookupObject("@user@host", { meterProvider })` emits
both the `activitypub.object.lookup` counter and the
`webfinger.lookup` counter, matching the same forwarding behavior
that `getActorHandle()` already has.

fedify-dev#316
fedify-dev#739

Assisted-by: Claude Code:claude-opus-4-7
Assisted-by: Codex:gpt-5.1-codex
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/webfinger/src/lookup.ts (1)

340-353: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fix the redirect-limit off-by-one.

Line 342 rejects the Nth redirect, not the N+1st. With the current >= check, maxRedirection: 1 fails on the first 302, and the default 5 only follows 4 redirects.

Suggested fix
       redirected++;
       const maxRedirection = options.maxRedirection ?? DEFAULT_MAX_REDIRECTION;
-      if (redirected >= maxRedirection) {
+      if (redirected > maxRedirection) {
         logger.error(
           "Too many redirections ({redirections}) while fetching WebFinger " +
             "resource descriptor.",
           { redirections: redirected },
         );

As per coding guidelines, "Add regression tests that demonstrate the bug before fixing it."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/webfinger/src/lookup.ts` around lines 340 - 353, The redirect count
check in lookup.ts incorrectly rejects the Nth redirect because it uses
"redirected >= maxRedirection" (variables: redirected, options.maxRedirection,
DEFAULT_MAX_REDIRECTION) — change the comparison to "redirected >
maxRedirection" so that a maxRedirection value allows that many redirects before
failing; update the logger.error call context remains the same (logger.error)
and return path stays, and add regression tests that demonstrate the off-by-one
(e.g., set maxRedirection=1 and assert one 302 is allowed but two fail) before
making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/fedify/src/federation/middleware.ts`:
- Line 1674: The WebFinger metrics are being enabled via an implicit global
meter because calls pass this.meterProvider (which falls back to the global
provider); change the implementation so only an explicitly provided
federation-level provider is propagated: add and use a dedicated field (e.g.
explicitMeterProvider or meterProviderProvided) that is set only when a
meterProvider is passed into the constructor/config, preserve the existing
this.meterProvider getter for compatibility, and replace direct uses of
this.meterProvider in the WebFinger-related call sites (the occurrences you saw)
with that explicit field (or undefined when not set) so no global fallback is
used implicitly.

In `@packages/webfinger/src/lookup.ts`:
- Around line 273-289: The branch that special-cases resource.protocol ===
"acct:" must also handle "mailto:" so mailto: URIs with opaque paths are parsed
like acct: (extract the host from resource.pathname after the last "@", validate
server !== "" and that it contains no / ? # and resource.search/resource.hash
are empty) instead of falling through to the generic opaque-path behavior;
update the condition to include resource.protocol === "mailto:" (use the same
extraction/validation logic that references resource.pathname, resource.search,
resource.hash and the server variable) and add a regression test demonstrating
mailto:juliet@example.com resolves to querying
https://example.com/.well-known/webfinger?resource=mailto:... rather than
producing a network_error.

---

Outside diff comments:
In `@packages/webfinger/src/lookup.ts`:
- Around line 340-353: The redirect count check in lookup.ts incorrectly rejects
the Nth redirect because it uses "redirected >= maxRedirection" (variables:
redirected, options.maxRedirection, DEFAULT_MAX_REDIRECTION) — change the
comparison to "redirected > maxRedirection" so that a maxRedirection value
allows that many redirects before failing; update the logger.error call context
remains the same (logger.error) and return path stays, and add regression tests
that demonstrate the off-by-one (e.g., set maxRedirection=1 and assert one 302
is allowed but two fail) before making the change.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 110c0153-c4da-41f4-9019-b2d8d6d02c78

📥 Commits

Reviewing files that changed from the base of the PR and between 357f74d and fde3028.

📒 Files selected for processing (13)
  • CHANGES.md
  • docs/manual/opentelemetry.md
  • packages/fedify/src/federation/metrics.test.ts
  • packages/fedify/src/federation/metrics.ts
  • packages/fedify/src/federation/middleware.ts
  • packages/fedify/src/federation/webfinger.test.ts
  • packages/fedify/src/federation/webfinger.ts
  • packages/vocab/src/actor.test.ts
  • packages/vocab/src/actor.ts
  • packages/vocab/src/lookup.test.ts
  • packages/vocab/src/lookup.ts
  • packages/webfinger/src/lookup.test.ts
  • packages/webfinger/src/lookup.ts

Comment thread packages/fedify/src/federation/middleware.ts Outdated
Comment thread packages/webfinger/src/lookup.ts Outdated
`maxRedirection`'s JSDoc reads "The maximum number of redirections to
follow", but `lookupWebFingerInternal` incremented `redirected`
before the `redirected >= maxRedirection` check.  So a caller passing
`maxRedirection: N` actually followed N-1 redirects and rejected on
the Nth, drifting by one from the documented contract.

Sibling code in *packages/vocab-runtime/src/docloader.ts* and
*packages/fedify/src/sig/http.ts* uses `>=` too, but increments the
counter on recursion *after* the check, so those paths correctly
allow N redirects.  The WebFinger lookup was the outlier.

The minimal fix flips `>=` to `>`.  After the change,
`maxRedirection: N` follows the Nth redirect and rejects on the
(N+1)th, matching both the JSDoc and the sibling code.

Adds a regression sub-step "maxRedirection: 1 follows exactly one
redirect" that fails under the old code (a single-redirect mock
returns null) and passes under the fix.  Updates the existing
"custom maxRedirection" step so its lower-bound case still
exercises the failure path under the new semantics
(`maxRedirection: 1` against a 2-redirect mock instead of
`maxRedirection: 2`).

Assisted-by: Claude Code:claude-opus-4-7
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
packages/webfinger/src/lookup.ts (1)

273-293: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle mailto: resources in the opaque-path branch too.

WEBFINGER_LOOKUP_SCHEME_WHITELIST now treats mailto as a supported resource scheme, but this parser still only extracts the lookup host from acct: URIs. mailto:juliet@example.com falls through the generic branch, builds mailto:///.well-known/webfinger, and gets reported as network_error instead of querying https://example.com/.well-known/webfinger?resource=mailto:....

Suggested fix
-  if (resource.protocol === "acct:") {
+  if (resource.protocol === "acct:" || resource.protocol === "mailto:") {
     const atPos = resource.pathname.lastIndexOf("@");
     if (atPos < 0) return { resource: null, result: "invalid" };
     server = resource.pathname.substring(atPos + 1);
-    // Per RFC 7565, an `acct:` URI's authority is bare `host`: no path,
-    // query, or fragment.  The WHATWG URL parser routes a path into
+    // Per RFC 7565, `acct:` / `mailto:` resources derive the WebFinger host
+    // from the opaque `user@host` path; no extra path, query, or fragment
+    // component is allowed.  The WHATWG URL parser routes a path into
     // `pathname` (after the `user@host` authority) and routes `?…` /
     // `#…` into `search` / `hash`, so the three components live in
     // different places and must be checked independently.  Reject any
-    // acct URI that carries an extraneous component so a malformed
+    // resource URI that carries an extraneous component so a malformed
     // input cannot pass through to a remote WebFinger lookup.

As per coding guidelines, "Add regression tests that demonstrate the bug before fixing it."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/webfinger/src/lookup.ts` around lines 273 - 293, The parser
currently only extracts the lookup host for `acct:` URIs; update the opaque-path
branch in lookup.ts that checks resource.protocol to also handle `mailto:` by
extracting the mailbox host from resource.pathname (like the `acct:` logic: find
last '@', validate server is non-empty and contains no `/?#`, and ensure
resource.search and resource.hash are empty), then set protocol and server
appropriately so mailto resources build the correct WebFinger host and query;
keep the existing else branch unchanged, and add a regression test that uses
`mailto:juliet@example.com` to assert it queries
https://example.com/.well-known/webfinger?resource=mailto:... and not produce a
network_error.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/webfinger/src/lookup.test.ts`:
- Around line 217-234: The new subtests mutate global fetchMock state inside the
outer lookupWebFinger() test but only call fetchMock.hardReset() at the end of
fn(), so a failing assertion can skip teardown; wrap the outer test body (the
fn() that sets redirectCount and calls lookupWebFinger with maxRedirection: 1
and 2) in a try/finally block and move fetchMock.hardReset() (and any other
global cleanup like resetting redirectCount if needed) into the finally so
cleanup always runs even if a subtest/assertion fails—follow the same pattern
used by the metrics test.

---

Duplicate comments:
In `@packages/webfinger/src/lookup.ts`:
- Around line 273-293: The parser currently only extracts the lookup host for
`acct:` URIs; update the opaque-path branch in lookup.ts that checks
resource.protocol to also handle `mailto:` by extracting the mailbox host from
resource.pathname (like the `acct:` logic: find last '@', validate server is
non-empty and contains no `/?#`, and ensure resource.search and resource.hash
are empty), then set protocol and server appropriately so mailto resources build
the correct WebFinger host and query; keep the existing else branch unchanged,
and add a regression test that uses `mailto:juliet@example.com` to assert it
queries https://example.com/.well-known/webfinger?resource=mailto:... and not
produce a network_error.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 84501e2a-3a92-41c4-b1b2-6d3f763e88a0

📥 Commits

Reviewing files that changed from the base of the PR and between fde3028 and 62cbaf3.

📒 Files selected for processing (2)
  • packages/webfinger/src/lookup.test.ts
  • packages/webfinger/src/lookup.ts

Comment thread packages/webfinger/src/lookup.test.ts Outdated
dahlia added 2 commits May 20, 2026 16:52
RFC 7033 §4.5 lists `mailto:` alongside `acct:` as a permissible
WebFinger resource form, and RFC 6068 makes `mailto:` an opaque-path
scheme: the `user@host` authority lives in `pathname`, not in
`host`.

The existing `else` branch in `lookupWebFingerInternal()` builds
`${protocol}//${server}/.well-known/webfinger` from `resource.host`,
but `new URL("mailto:juliet@example.com").host === ""`.  The WHATWG
URL parser accepts the resulting `https:///.well-known/webfinger`
and re-interprets it as `https://.well-known/webfinger`, so the
lookup fails with `network_error` against a bogus DNS name instead
of querying the user's host.  The PR also lists `mailto` in the
`webfinger.resource.scheme` metric allow list, so misclassifying a
valid `mailto:` query as `network_error` was documentation drift.

Extends the existing opaque-path branch to match both schemes,
sharing the host-extraction and the same RFC-conformance guards
(reject path, query, fragment, or empty server).  Adds a "mailto"
regression step alongside the existing "acct" and "https" tests:
it fails on the prior code and passes after the fix.

fedify-dev#772 (comment)

Assisted-by: Claude Code:claude-opus-4-7
The PR documents that passing `meterProvider` to `createFederation()`
is what enables the new WebFinger metric families (`webfinger.handle`,
`webfinger.lookup` via federation-bound `Context.lookupWebFinger`).

But the two call sites in *middleware.ts* (the `handleWebFinger`
dispatch and `ContextImpl.lookupWebFinger`) defaulted through
`this.meterProvider`, the getter that falls back to
`metrics.getMeterProvider()` when the federation-level field is
unset.  An operator with a globally-installed meter provider, who
deliberately omits `meterProvider` from `createFederation`, would
therefore still get the WebFinger metrics emitted; that contradicts
the documented opt-in contract.

The metric helpers in @fedify/webfinger and @fedify/fedify already
gate emission on `meterProvider != null`, so the fix is to thread
the raw `this._meterProvider` field (which is `undefined` when
unset) instead of the getter.  `ContextImpl.lookupWebFinger` defaults
from `options.meterProvider ?? this.federation._meterProvider`,
matching the same chain.

Inline comments at both call sites explain why we bypass the getter,
so future readers do not "fix" it back to the global-fallback shape.
`ContextImpl.lookupObject` has the same `this.meterProvider` default
but predates this PR; that one is intentionally out of scope here.

fedify-dev#772 (comment)

Assisted-by: Claude Code:claude-opus-4-7
@dahlia
Copy link
Copy Markdown
Member Author

dahlia commented May 20, 2026

/gemini review

The outer `lookupWebFinger()` test installs the global fetch spy with
`fetchMock.spyGlobal()` and cleans up at the bottom with
`fetchMock.hardReset()`, with no try/finally between them.  If any
`await t.step(...)` propagates a rejection, cleanup is skipped and
the global fetch spy leaks into the metric test below as well as any
other test files that run later in the same process.

The same pattern was already fixed for the *metrics* test in this
file via an earlier commit.  The recent additions to the outer test
(the new "mailto" step, the redirect / scheme metric setups, the
`maxRedirection: 1` regression step) made the leak risk concrete.

Wraps the body from `fetchMock.spyGlobal()` to `fetchMock.hardReset()`
in `try { ... } finally { fetchMock.removeRoutes(); fetchMock.hardReset(); }`,
matching the metric test's pattern.

The diff looks large because the body re-indents one level, but the
semantic change is just the try/finally wrapper and a
`removeRoutes()` call in the finally clause.

fedify-dev#772 (comment)

Assisted-by: Claude Code:claude-opus-4-7
@dahlia
Copy link
Copy Markdown
Member Author

dahlia commented May 20, 2026

/gemini review

@dahlia
Copy link
Copy Markdown
Member Author

dahlia commented May 20, 2026

@codex review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces OpenTelemetry metrics for WebFinger and actor handle discovery paths, adding counters and histograms for outgoing lookups, incoming requests, and discovery attempts. The implementation includes attribute bucketing for resource schemes to prevent metric cardinality inflation and refines the maxRedirection logic to correctly follow the specified number of redirects. Comprehensive documentation and tests have been added across the affected packages. I have no feedback to provide.

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Breezy!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces OpenTelemetry metrics for WebFinger and actor handle discovery, enabling operators to monitor discovery rates, latency, and outcomes. The changes span core federation logic, documentation, and test suites, implementing an opt-in meter provider pattern to maintain cardinality safety. A review comment correctly identified that the current validation for 'mailto:' URIs is overly strict regarding query parameters and provided a code suggestion to align the implementation with RFC 6068.

Comment thread packages/webfinger/src/lookup.ts
A previous commit on this branch added a shared `acct: || mailto:`
branch in `lookupWebFingerInternal()` for opaque-path host
extraction, but kept the same `resource.search !== "" ||
resource.hash !== ""` rejection for both schemes.

That rejection is overly strict for `mailto:`.  RFC 6068 §2
explicitly allows `mailto:` URIs to carry `?hfields=…` header
fields (and fragment identifiers), so for example
`mailto:user@example.com?subject=Hi` is a syntactically valid
mailto URI and a legitimate WebFinger resource per RFC 7033 §4.5.
The current code returned `result=invalid` for those inputs.

RFC 7565 §3 explicitly forbids path, query, or fragment on `acct:`
URIs, so the asymmetry matters: the search/hash rejection stays on
`acct:` and is dropped for `mailto:`.

Splits the validation accordingly.  The bare-authority character
check (no /, ?, or # in the extracted server; non-empty) still
applies to both schemes.  The strict search/hash rejection now
applies only when `resource.protocol === "acct:"`.

Adds a regression step "mailto with hfields" with a `?subject=Hi`
resource, asserting the lookup forwards the full resource URI to
the WebFinger endpoint and resolves successfully.  The new step
fails on the prior code and passes after the fix.

Inline comments cite the RFCs (6068 §2, 7565 §3) so future readers
see why the two schemes diverge.

fedify-dev#772 (comment)

Assisted-by: Claude Code:claude-opus-4-7
@dahlia
Copy link
Copy Markdown
Member Author

dahlia commented May 20, 2026

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements OpenTelemetry metrics for WebFinger and actor handle discovery. It introduces new metrics families—webfinger.lookup, webfinger.handle, and activitypub.actor.discovery—along with corresponding duration histograms. The changes include result classification logic, attribute bucketing for cardinality safety, and integration into the federation middleware and vocabulary lookups. Extensive tests and documentation updates are also provided. I have no feedback to provide.

Copy link
Copy Markdown
Contributor

@2chanhaeng 2chanhaeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A little niptick!

Comment thread packages/webfinger/src/lookup.ts Outdated
The catch block in `lookupWebFinger`'s span wrapper used to reassign
`outcome = { resource: null, result: "error" }` before rethrowing.
That literal is identical to the initial assignment three lines
above, and the only path that can reach the catch with `outcome`
holding a different value would require either
`await lookupWebFingerInternal()` to resolve and then
`span.setStatus()` (or the simple property access that follows it)
to throw, which is not a realistic scenario.

The reassignment was therefore dead code.  Removed it, and added an
inline comment explaining that `outcome` is initialised as the
error placeholder specifically so the `finally` block always has a
valid value to record when the internal call rejects; the catch no
longer has to touch it.

fedify-dev#772 (comment)

Assisted-by: Claude Code:claude-opus-4-7
@dahlia
Copy link
Copy Markdown
Member Author

dahlia commented May 20, 2026

/gemini review

@dahlia
Copy link
Copy Markdown
Member Author

dahlia commented May 20, 2026

@codex review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements OpenTelemetry metrics for WebFinger and actor handle discovery across the fedify, vocab, and webfinger packages. It introduces new recording functions, updates internal logic to propagate the meterProvider, and adds extensive tests to verify the new metrics. I have no feedback to provide on these changes.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a976bcfbef

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/fedify/src/federation/webfinger.ts
`classifyWebFingerHandleResult()` mapped any HTTP 200 to
`webfinger.handle.result=resolved`, but `handleWebFingerInternal()`
delegates five different miss paths to the caller-provided
`onNotFound(request)` callback: no actor dispatcher, alias mapper
miss, identifier resolution miss, dispatcher returned null, and
mismatched `acct:` host.  The callback can legally return any status
code, including a framework-style 200 fallback page.  In that setup
unresolved WebFinger requests were counted as successful discoveries
and skewed the metric's success/error ratios and latency slices.

Wraps `options.onNotFound` once in the outer `handleWebFinger()` so
we know which response object came from the callback.
`classifyWebFingerHandleResult()` now takes that wrapped response as
a second argument and short-circuits to `not_found` when the
returned `Response` is identity-equal to the callback's, regardless
of the status code.  The wrapping uses object spread, so the rest of
`WebFingerHandlerParameters` (including the meterProvider option
itself) is forwarded untouched.

Adds a regression step "records result=not_found when onNotFound
returns 200" with a callback returning
`new Response("custom fallback", { status: 200 })`.  It asserts
`webfinger.handle.result=not_found` and
`http.response.status_code=200`.  The step fails on the prior code
(which records `resolved`) and passes after the fix.

fedify-dev#772 (comment)

Assisted-by: Claude Code:claude-opus-4-7
@dahlia
Copy link
Copy Markdown
Member Author

dahlia commented May 20, 2026

@codex review

@dahlia
Copy link
Copy Markdown
Member Author

dahlia commented May 20, 2026

/gemini review

@dahlia dahlia requested a review from 2chanhaeng May 20, 2026 09:48
@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Already looking forward to the next diff.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements OpenTelemetry metrics for WebFinger and actor handle discovery paths across the Fedify codebase. It introduces new instrumentation in @fedify/fedify, @fedify/vocab, and @fedify/webfinger to track discovery rates, latency, and terminal outcomes. The implementation follows an opt-in pattern using a meterProvider to ensure metrics are only recorded when explicitly configured, and includes comprehensive updates to documentation and regression tests. I have no feedback to provide as there were no review comments.

@dahlia dahlia merged commit 3fe7802 into fedify-dev:main May 20, 2026
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

component/actor Actor dispatcher related component/federation Federation object related component/otel OpenTelemetry integration component/webfinger WebFinger related

Development

Successfully merging this pull request may close these issues.

OpenTelemetry metrics for WebFinger and actor discovery

2 participants