docs: add public API architecture plan and ADRs by epipav · Pull Request #1879 · linuxfoundation/insights

epipav · 2026-05-04T16:42:57Z

No description provided.

Signed-off-by: anilb <epipav@gmail.com>

Copilot

Pull request overview

Adds a documentation set for the planned LFX Insights Public API, including a detailed project plan, an architecture review packet, and a set of ADRs intended to lock down key contract and infrastructure decisions for a standalone /api service.

Changes:

Introduces a comprehensive public API project plan covering architecture, epics, rollout stages, and observability strategy.
Adds an “architecture review” doc set (overview, decisions, domain context) to present the proposal for approval.
Adds ADRs formalizing major decisions (framework, versioning contract, auth model, caching, docs stack, etc.).

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 11 comments.


| File | Description |
|---|---|
| docs/PUBLIC_API_PLAN.md | Full project plan for the standalone public API service (architecture, epics, rollout, metrics). |
| docs/CONTEXT.md | Canonical domain language + wire-format conventions for the public API. |
| docs/architecture-review/01-overview.md | Architecture review overview document for stakeholders/approvers. |
| docs/architecture-review/02-decisions.md | Summary of “ADR-bar” decisions with links to ADRs. |
| docs/architecture-review/03-context.md | Architecture-review version of the canonical domain context. |
| docs/adr/0001-fastify-over-nestjs.md | ADR selecting Fastify over NestJS/Express/Hono. |
| docs/adr/0002-api-at-repo-root.md | ADR placing the service at repo-root `api/` (not under workers). |
| docs/adr/0003-tolerant-reader-versioning.md | ADR defining `/v1-alpha` → `/v1` stability and additive-only contract. |
| docs/adr/0004-server-to-server-cors-deny.md | ADR for server-to-server only (CORS) posture in v1. |
| docs/adr/0005-tiers-control-rate-limits-only.md | ADR for tier impact limited to rate limits in v1. |
| docs/adr/0006-long-lived-api-keys.md | ADR for long-lived, manually-rotated API keys. |
| docs/adr/0007-collections-only-permission-check.md | ADR limiting permission checks to Collections endpoints only. |
| docs/adr/0008-typebox-code-first-openapi.md | ADR choosing TypeBox code-first schemas as OpenAPI source. |
| docs/adr/0009-api-key-required-for-all-requests.md | ADR requiring API key auth for every request (no anonymous access). |
| docs/adr/0010-billing-bundled-with-lfx-membership.md | ADR bundling API access with existing LFX membership. |
| docs/adr/0011-pagination-page-pagesize-zero-indexed.md | ADR standardizing zero-indexed `page`/`pageSize` pagination. |
| docs/adr/0012-url-port-strategy-hybrid.md | ADR describing hybrid URL port/rename strategy. |
| docs/adr/0013-origin-cache-only-private-cache-control.md | ADR specifying origin-only Redis caching + `Cache-Control: private`. |
| docs/adr/0014-camelcase-json-iso8601-dates.md | ADR standardizing camelCase JSON + ISO-8601 UTC timestamps. |
| docs/adr/0015-api-keys-stored-in-auth0.md | ADR storing/managing API keys in Auth0 via Management API. |
| docs/adr/0016-vitepress-scalar-api-docs.md | ADR choosing VitePress + Scalar for API docs in `api/docs/`. |
| docs/adr/0017-collections-queries-not-shared.md | ADR keeping Collections SQL read queries in `/api` (not shared). |



emsearcy

Left some comments throughout. A couple high-level notes:

Are you familiar with https://github.com/linuxfoundation/lfx-architecture-decisions/tree/main/decisions ? In particular, I didn't see anything here about a logging framework.

(Caveat: ADR-0003 requires sending both OTEL and Datadog-formatted trace and span IDs (both 64-bit unsigned ints, as dd.trace_id and dd.span_id). This is out of date; Datadog now handles proper OTEL trace/span IDs natively.)

Also, while I get that you may all be JavaScript devs, and I think overall your choices are really good, we oftentimes prefer Go when possible for backend services, because it tends to be easier to maintain long term (forward portability guarantees). But I get it if JS is deemed easier by y'all.

emsearcy · 2026-05-07T16:42:14Z

+
+Keys do not expire automatically. Multiple active keys per user are supported for zero-downtime rotation (mint new → switch → revoke old). Revocation is enforced by deleting the key from Auth0 — the next request using it fails JWKS verification instantly. No deny-list needed.
+
+Refresh tokens and expiring tokens can be introduced in v2 — long-lived keys are a simplicity decision for v1.


This is in conflict with ADR 15. Auth0 cannot provide long-lived API keys. It can either provide client_credentials grants (exchange a long-lived client ID and secret for a short-lived access token) or authorization_code + refresh_token grants (log in to get a refresh token, exchange the refresh token for a short-lived access token).

So either we need to change ADR 15 to adopt Postgres storage of long-lived keys, or change this one and adopt OAuth2 patterns earlier than v2.

@jonathimer imo this would also be a business decision as it will affect user experience:

Long-lived keys (change ADR 15, store in Postgres):

- User generates a key once in the dashboard, pastes it into their app/script, and it just works forever (or until they revoke it).
- No login flow, no token refresh logic on their side.
- Same key works across machines, CI, and scripts, so it is easy to share within a team (which is also the security downside).
- If the key leaks, it's valid until someone manually revokes it.
- Not tied up with OAuth2.

OAuth2 now (change this ADR):

Two sub-flavors depending on the use case:

- Machine-to-machine (client credentials): user gets a client ID + secret, their code exchanges them for a short-lived token and refreshes it when it expires. Extra code on their side, but standard.
- User-facing (auth code + refresh): user goes through a login/consent screen in the browser, app stores a refresh token. Better for apps acting on behalf of a user, but means no headless "paste and go."

In both cases:

- Tokens expire, so leaks have a limited blast radius.
- More upfront integration work for the user, but it's a known pattern.

You're right that this was internally inconsistent. We've adopted the OAuth2 refresh token + short-lived access token model rather than trying to force long-lived tokens through Auth0.

What changed:

- ADR-0006 is renamed and rewritten as 0006-refresh-and-short-lived-access-tokens.md. The "API key" a customer receives is now explicitly a refresh token issued by LFX Self-Serve at app.lfx.dev/settings.
- Customers exchange it at POST api.insights.linuxfoundation.org/v1/auth/token, a thin proxy Insights exposes to Self-Serve's token endpoint, to get a short-lived access token (~15 min). The Bearer value sent to the Insights API is always the short-lived access token.
- ADR-0015 updated to align: keys are refresh tokens from LFX Self-Serve, not Auth0-managed credentials.

emsearcy · 2026-05-07T16:43:31Z

+
+### Pagination: `page` + `pageSize`, zero-indexed — [docs/adr/0011](../adr/0011-pagination-page-pagesize-zero-indexed.md)
+
+All paginated endpoints use `page` (zero-indexed) + `pageSize` query params, returning `{ data, page, pageSize, total }`. The existing Nuxt codebase already uses this convention as the dominant pattern — preserving it avoids off-by-one translation bugs during the port. A handful of Nuxt endpoints use `limit`/`offset` instead; those are normalized to `page`/`pageSize` at port time so the public API stays consistent. External developers used to 1-based pagination will need to start at `page=0` — this is called out prominently in the docs quickstart.


This is OK, but I would argue that opaque pagination tokens/cursors are a better design choice when you're starting from scratch. I also recommend that any pagination standard define the sorting mechanism (hard-coded vs. user-selected, but it needs to be defined).

The user cannot "cheat" by grabbing the first page to get a total count and then grabbing all remaining pages in parallel: requests are forced to be serialized.

Cursors can (and should!) be implemented to provide atomicity, which page offset numbers cannot: the cursor acts as a sort of session to ensure consistency while paging. Offset-based pagination can introduce repeats or missed entries. That is, if an item is added or removed between fetching page N and page N+1 and, based on your sort semantics, it lands in pages 0 to N, the results shift by one either way. A lost entry occurs when a removal pulls the item that would have been the first item of page N+1 back into the last slot of page N, which we never see because we already fetched page N. A duplicate occurs when an insertion pushes the last item of page N (already fetched) into the first slot of page N+1.
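The skipped-entry case can be reproduced with a small in-memory sketch. Everything in it (the `Item` shape, `offsetPage`, `cursorPage`, and the id-based cursor encoding) is hypothetical and not taken from the ADR. It uses a keyset-style cursor, which avoids the row shift but is not a full snapshot; a real implementation might carry more state in the cursor.

```typescript
// Hypothetical data model: items are always served sorted by ascending id.
interface Item { id: number; name: string }

// Offset pagination: page N is whatever currently sits at rows [N*size, (N+1)*size).
function offsetPage(items: Item[], page: number, pageSize: number): Item[] {
  const sorted = [...items].sort((a, b) => a.id - b.id);
  return sorted.slice(page * pageSize, (page + 1) * pageSize);
}

interface CursorPage { data: Item[]; nextCursor: string | null }

// The cursor is opaque to the caller; here it simply wraps the last id returned.
function encodeCursor(lastId: number): string {
  return encodeURIComponent(JSON.stringify({ lastId }));
}

function decodeCursor(cursor: string): number {
  const parsed = JSON.parse(decodeURIComponent(cursor)) as { lastId: number };
  return parsed.lastId;
}

// Keyset cursor pagination: the next page is defined relative to the last
// item already returned, not to a row offset.
function cursorPage(items: Item[], cursor: string | null, pageSize: number): CursorPage {
  const after = cursor === null ? -Infinity : decodeCursor(cursor);
  const rest = [...items].sort((a, b) => a.id - b.id).filter((i) => i.id > after);
  const data = rest.slice(0, pageSize);
  const hasMore = rest.length > pageSize && data.length > 0;
  return { data, nextCursor: hasMore ? encodeCursor(data[data.length - 1].id) : null };
}

function makeItems(): Item[] {
  return [1, 2, 3, 4, 5, 6].map((id) => ({ id, name: `item-${id}` }));
}

// Fetch page 0, remove an already-seen item, then fetch page 1.
function simulateOffset(): number[] {
  let items = makeItems();
  const first = offsetPage(items, 0, 2);     // ids 1, 2
  items = items.filter((i) => i.id !== 1);   // item 1 removed between requests
  const second = offsetPage(items, 1, 2);    // rows shifted: ids 4, 5 (3 is never seen)
  return [...first, ...second].map((i) => i.id);
}

// Same scenario with the cursor: the second page starts after id 2.
function simulateCursor(): number[] {
  let items = makeItems();
  const first = cursorPage(items, null, 2);  // ids 1, 2
  items = items.filter((i) => i.id !== 1);
  const second = cursorPage(items, first.nextCursor, 2); // ids 3, 4
  return [...first.data, ...second.data].map((i) => i.id);
}
```

In this sketch `simulateOffset()` returns ids 1, 2, 4, 5 while `simulateCursor()` returns 1, 2, 3, 4.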

Agreed on both points. Two changes:

Cursor-based pagination is now the standard: ADR-0011 is renamed and rewritten as 0011-pagination-cursor-based.md. We dropped the zero-indexed page/pageSize offset model. Your atomicity example (insert between page N and N+1 → first item of N+1 falls back to last of N, silently skipped) is incorporated verbatim into the ADR as reason #1.

Sorting is now defined: each paginated endpoint declares a closed allow-list of accepted sort values in its TypeBox schema. Wire format: ?sort=field_direction (e.g. name_asc, commits_desc) — same convention as the existing Nuxt layer. Every accepted value must be index-backed. Removing an allowed value or changing an endpoint's default sort is a breaking change under ADR-0003. Defined in the Sort order section of ADR-0011 and in the wire-format conventions in CONTEXT.md.
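A minimal sketch of the closed allow-list idea, written without TypeBox so it stays self-contained. The allowed values, the default, and the `parseSort` helper are hypothetical; in the plan the allow-list is declared in each endpoint's TypeBox schema.

```typescript
// Hypothetical allow-list for one endpoint. Each value is assumed to be index-backed.
const PROJECT_SORTS = ["name_asc", "name_desc", "commits_desc"] as const;
type ProjectSort = (typeof PROJECT_SORTS)[number];

// Hypothetical default; changing it would be a breaking change under ADR-0003.
const DEFAULT_PROJECT_SORT: ProjectSort = "name_asc";

interface ParsedSort { field: string; direction: "asc" | "desc" }

type SortResult =
  | { ok: true; sort: ParsedSort }
  | { ok: false; error: string };

// Validates ?sort=field_direction against the allow-list and splits it
// into a field name and a direction.
function parseSort(raw: string | undefined): SortResult {
  const value = raw ?? DEFAULT_PROJECT_SORT;
  if (!(PROJECT_SORTS as readonly string[]).includes(value)) {
    return { ok: false, error: `sort must be one of: ${PROJECT_SORTS.join(", ")}` };
  }
  // Split on the last underscore so field names may themselves contain underscores.
  const split = value.lastIndexOf("_");
  const field = value.slice(0, split);
  const direction = value.slice(split + 1) as "asc" | "desc";
  return { ok: true, sort: { field, direction } };
}
```

Rejecting unknown values outright, rather than falling back to the default, keeps a typo in `?sort=` from silently returning data in an unexpected order.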

emsearcy · 2026-05-07T16:50:37Z

+_Avoid:_ account, customer, client
+
+**Organization** (`org_id`)
+The LFX organization a User belongs to, extracted from the JWT. Used as the shared bucket for rate-limit quotas — all API keys belonging to users in the same org draw from one pool.


FWIW, Auth0 JWTs do not carry organization information at present. Also need to define "belongs to". There are org-admin authorized individuals (key contacts for memberships, mostly). If the scope of "belongs to" is intended to capture all employees, note that we leave the realm of authorized identities and have moved over to, essentially, self-attestations (I can say I work for anyone). Alignment of employee identity (and ongoing validation thereof) by known employer domains is not presently in scope for LFX as far as I know.

On the JWT org claim: the org claim is now explicitly attributed to the LFX Self-Serve access token, not an Auth0 JWT. The auth model no longer involves Auth0 at the Insights API layer — customers get a refresh token from app.lfx.dev/settings, exchange it at POST /v1/auth/token (an Insights-proxied endpoint to Self-Serve), and receive a short-lived access token that Self-Serve signs and that Insights JWKS-verifies.
On "belongs to": narrowed to "authorized Key Contact of an organization with an active LFX membership — not every employee or self-attested affiliate." The Key Contact check lives in Self-Serve (OpenFGA against v2_organization entities), but the precise moment depends on an open product question (ADR-0015 Q1):

If a new Insights-scoped refresh token is issued: check happens at issuance — non-Key-Contacts can't obtain the token at all.

If the existing PAT is reused: everyone holds a PAT already, so the check moves to POST /v1/auth/token exchange time — Self-Serve refuses to mint an Insights access token for non-Key-Contacts even if they possess the PAT.

Either way, the check is in Self-Serve, and the Insights API doesn't touch a membership system. Updated in CONTEXT.md, 03-context.md, ADR-0010, and the Relationships section.
One assumption we'd like your input on: the spec assumes the LFX Self-Serve access token carries org and tier claims that Insights can read at request time for rate limiting. If those claims can't be added, the enforcement model changes significantly. Can you confirm whether Self-Serve can include these?

emsearcy · 2026-05-07T16:53:10Z

+_Avoid:_ error body, error payload
+
+**Request ID**
+A ULID generated per-request, propagated as the `X-Request-Id` response header and attached to all log lines and OTel spans. Used by support for tracing a specific request across systems.


OTEL defines the standard for trace IDs and span IDs, AND how they are propagated. We should not invent our own. This is imperative for universal tracing.

👍 What changed (ADR-0019, ADR-0018, CONTEXT.md):

X-Request-Id response header dropped entirely.

W3C traceparent is now the sole HTTP propagation channel — honoured inbound, injected outbound automatically by the OTel SDK. No custom header.

The only customer-facing exposure of the trace ID is requestId inside the error envelope JSON — that's the value a customer quotes in a support ticket, and it lives in a schema we already own, not a new HTTP header.
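For reference, a sketch of the header shape involved. The OTel SDK parses and injects `traceparent` itself, so none of this would be hand-written in the service; `parseTraceparent`, `TraceContext`, and the envelope fields other than `requestId` are hypothetical and only illustrate the W3C Trace Context version `00` layout (`version-traceid-parentid-flags`).

```typescript
interface TraceContext { traceId: string; parentId: string; sampled: boolean }

// Version 00: 2 hex version, 32 hex trace id, 16 hex parent id, 2 hex flags, all lowercase.
const TRACEPARENT_V00 = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

function parseTraceparent(header: string): TraceContext | null {
  const match = TRACEPARENT_V00.exec(header.trim());
  if (match === null) return null;
  const [, traceId, parentId, flags] = match;
  // All-zero trace ids and parent ids are invalid per the spec.
  if (/^0+$/.test(traceId) || /^0+$/.test(parentId)) return null;
  // Bit 0 of the flags byte is the "sampled" flag.
  return { traceId, parentId, sampled: (parseInt(flags, 16) & 0x01) === 1 };
}

// Hypothetical error envelope: requestId carries the trace id a customer
// would quote in a support ticket. The code and message fields are assumptions.
function errorEnvelope(code: string, message: string, ctx: TraceContext) {
  return { error: { code, message, requestId: ctx.traceId } };
}
```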

emsearcy · 2026-05-07T16:58:50Z

+┌─────────────────────┐  1. create key   ┌─────────────────────────────────────┐
+│  User (browser)     │ ───────────────▶ │  LFX Insights frontend              │
+│                     │                  │  /settings/api-keys                 │
+│                     │ ◀─────────────── │  (membership check → Auth0 Mgmt API)│


Nowhere in the spec is it defined how the membership check is wired up (e.g., is this using OpenFGA relationships against v2_organization entities, as discussed in the F2F?). ADR 10 references this but points to the Public API plan, and I don't see anything in this file actually defining it.

Ideally our spec would be prescriptive about the implementation, or would at least include a statement that "this is out of scope" or explain the human contract needed to fulfill the behavior, so that an AI implementation doesn't go off the rails and try to build something "wrong".

Added a "Where membership is enforced" section to ADR-0010 that's explicit about the implementation.

joanagmaia · 2026-05-07T17:42:16Z

Hey @jonathimer can you review the main files of this architecture proposal to double check if it's aligned with the initial PRD spec? It would basically be all files outside of docs/adr.
To confirm:

Where users generate their API Key.
Tier access limitation -> On the proposal only rate limiting.
Endpoints access -> At the moment all endpoints available in v1.

Also left you a comment based on a comment from Eric to try to understand how we should define tiers and org mapping with the user.

joanagmaia · 2026-05-07T17:38:04Z

+### Key management UI
+
+API keys are created and managed by users inside LFX Insights (not a separate LFX platform). Key creation is gated on the user's Organization holding an active LFX membership. The UI ([E15](../PUBLIC_API_PLAN.md#epic-e15--api-key-management-ui-lfx-insights-frontend)) covers: membership check, create/list/revoke keys, one-time key display on creation, and closed-alpha access gating. This is a hard dependency for the closed-alpha launch.


@jonathimer we need some input on how users should create or have access to their API keys.
What we proposed in this document was: "users are able to get a long-lived token on Insights with their org entitlements". With this information we would be able to also have access to the tier.

@emsearcy mentioned that this is not possible as of today. In order for Eric to be able to advise us on how to get the token and the required info, we would need to understand:

If "belongs to an organization" implies all employees of an org

OR only allowlist-authorized individuals (key contacts or a similar manually curated list)

#1879 (comment)
From Eric:

FWIW, Auth0 JWTs do not carry organization information at present. Also need to define "belongs to". There are org-admin authorized individuals (key contacts for memberships, mostly). If the scope of "belongs to" is intended to capture all employees, note that we leave the realm of authorized identities and have moved over to, essentially, self-attestations (I can say I work for anyone). Alignment of employee identity (and ongoing validation thereof) by known employer domains is not presently in scope for LFX as far as I know.


Copilot

Pull request overview

Copilot reviewed 24 out of 24 changed files in this pull request and generated 6 comments.


Copilot

Pull request overview

Copilot reviewed 24 out of 24 changed files in this pull request and generated 9 comments.


epipav · 2026-05-12T11:10:04Z

@emsearcy, thanks for the review

Answered the individual inline comments in their respective threads.

About logging framework: Added ADR-0018 (structured JSON logging via pino), inheriting lfx-architecture-decisions/0002.
dd.trace_id / dd.span_id: Dropped — Datadog now ingests OTel-format trace/span IDs natively. ADR-0019 updated to reflect this, inheriting lfx-architecture-decisions/0003. W3C traceparent is the sole propagation channel.

emsearcy

I'm still concerned with the ambiguity of the API keys. Self Service (which is an Angular UI) should not be handling token exchange. Refresh tokens are probably not the best option for this, either. We probably would be better off with our own signed tokens, or possibly some kind of token service for long-lived tokens. I'll chat with the SSO team in DevOps this coming week.

emsearcy · 2026-05-18T01:25:25Z

+
+The customer-facing credential is a **refresh token** issued by the LFX Self-Serve App at `app.lfx.dev/settings`. It does not expire automatically. Rotation is encouraged (documented best practice) but never enforced — multiple active refresh tokens per User are supported so rotation is zero-downtime: mint new → switch integrations → revoke old.
+
+Customer code exchanges the refresh token for a short-lived **access token** (~15 min; exact lifetime confirmed at T-015) via `POST api.insights.linuxfoundation.org/v1/auth/token`. That endpoint is a thin proxy to LFX Self-Serve's `/token` endpoint (RFC 6749 §6 `grant_type=refresh_token`). The Insights API forwards the request and returns the response verbatim — it does not mint, validate, or store refresh tokens.


Self Service doesn't have a /token endpoint. Auth0 is our IdP, and it's the one that exchanges refresh tokens for access tokens.

Next, our refresh tokens typically are configured with auto-rotation, which means each time you turn in the refresh token, you get not only an access token, but also your next refresh token.

This is a protection against lost/stolen tokens: if somebody else uses your refresh token, and you attempt to exchange the same refresh token, both are automatically invalidated, forcing a new interactive login.

All of these flows are based on the expectation that refresh tokens are automatically managed in the application state of an application (like a mobile app or desktop app) that can both handle an interactive login and manage its own state.
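The rotation and reuse-detection behavior described above can be sketched in memory as follows. This is not Auth0's implementation; `RotatingTokenService`, its token strings, and the "family" bookkeeping are hypothetical and only illustrate why a replayed refresh token invalidates both parties.

```typescript
interface ExchangeResult { accessToken: string; refreshToken: string }

class RotatingTokenService {
  private counter = 0;
  // Currently valid refresh tokens, mapped to the login session ("family") they belong to.
  private active = new Map<string, string>();
  // Refresh tokens that were already exchanged once, mapped to their family.
  private used = new Map<string, string>();

  // Stands in for the interactive login that starts a new token family.
  login(userId: string): string {
    const family = `${userId}:family-${++this.counter}`;
    return this.issue(family);
  }

  // Each successful exchange returns an access token AND the next refresh token.
  exchange(refreshToken: string): ExchangeResult | null {
    const reusedFamily = this.used.get(refreshToken);
    if (reusedFamily !== undefined) {
      // Reuse detected: someone presented a token that was already turned in.
      // Revoke every live token in the family, forcing a new interactive login.
      this.revokeFamily(reusedFamily);
      return null;
    }
    const family = this.active.get(refreshToken);
    if (family === undefined) return null; // unknown or revoked token
    this.active.delete(refreshToken);
    this.used.set(refreshToken, family);
    return {
      accessToken: `access-${++this.counter}`,
      refreshToken: this.issue(family),
    };
  }

  private issue(family: string): string {
    const token = `refresh-${++this.counter}`;
    this.active.set(token, family);
    return token;
  }

  private revokeFamily(family: string): void {
    this.active.forEach((fam, token) => {
      if (fam === family) this.active.delete(token);
    });
  }
}
```

A caller that pastes one fixed credential into a script would break after the first exchange under this model, because the original refresh token is spent as soon as it is used.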

emsearcy · 2026-05-18T01:29:41Z

+
+1. **No natural expiry on a leaked token.** A long-lived JWT that escapes into logs, error reports, or a compromised system stays valid indefinitely. Revocation requires Insights to run a per-request introspection-with-cache check — extra infrastructure for a problem the refresh-token model solves natively (leaked access token expires in ~15 min; leaked refresh token can only mint new access tokens, which the legitimate owner can stop by revoking it).
+
+2. **Revocation is ambiguous.** "Revoking" a long-lived JWT that has been cryptographically signed means nothing to a verifier that only checks the signature — the token remains valid until the key is rotated. Rotating the JWKS key revokes all tokens at once, not just one user's. The refresh-token model avoids this entirely.


Correct, though I will add that revocation of a JWT is a thing. It's usually done by creating a denylist: a denied JWT (even an access token!) has its unique ID (jti) added to a revocation list. Since JWTs would be expected to expire, you only need to keep it on the revocation list for the token lifetime, which avoids unbounded revocation lists. Of course, that doesn't make sense with an "expiration-less" JWT.

Really, one would be more likely to simply use a database of API keys, like GitHub PATs, rather than JWTs for non-expiring credentials.
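A minimal sketch of the bounded denylist described above, assuming tokens carry `jti` and `exp` claims. The `JtiDenylist` class is hypothetical and in-memory; a shared store would be needed across instances.

```typescript
class JtiDenylist {
  // jti -> token expiry as epoch seconds
  private denied = new Map<string, number>();

  // Record a revoked token. The entry only needs to live until the token's own exp.
  revoke(jti: string, exp: number): void {
    this.denied.set(jti, exp);
  }

  // True while the revoked token could still pass signature and exp checks.
  // After exp the entry is dropped, because the verifier rejects the token
  // as expired anyway.
  isDenied(jti: string, now: number): boolean {
    this.prune(now);
    return this.denied.has(jti);
  }

  size(): number {
    return this.denied.size;
  }

  private prune(now: number): void {
    this.denied.forEach((exp, jti) => {
      if (exp <= now) this.denied.delete(jti);
    });
  }
}
```

With a non-expiring token there is no `exp` to prune on, so the map would grow without bound, which is the point made above.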

emsearcy · 2026-05-18T01:31:22Z

+
+API keys are refresh tokens issued by the LFX Self-Serve App at `app.lfx.dev/settings`. The Insights API exposes a proxied `/v1/auth/token` endpoint (forwarding to Self-Serve's `/token`) so customers configure only one host. Customer code exchanges refresh tokens for short-lived access tokens via that proxy (per ADR-0006); the Insights API receives only access tokens on actual API requests. The Insights API is a verifier only: it fetches the LFX Self-Serve JWKS endpoint, verifies the access token's signature, and reads identity + authorization claims from the verified payload. Insights stores no keys, runs no key-management UI, and has no dependency on the Auth0 Management API.
+
+> **Note:** this decision assumes LFX Self-Serve can support the required token model (Key Contact gating, `org`/`tier` claims, JWKS exposure, and the `/token` proxy endpoint). Coordination with the Self-Serve team is required at T-015 before implementation — the exact shape of the solution may change based on what Self-Serve can provide.


What does Key Contact gating mean in this context? If you are providing different tiers, for example, are you only expecting this to gate on Key Contacts, or also to provide a "tier" claim in the token?

emsearcy · 2026-05-18T01:32:00Z

+| `iss` | LFX Self-Serve issuer URL — used to select the right JWKS and reject foreign tokens. |
+| `sub` | User ID — used for revocation reference and as the `customer_id` span attribute in APM traces. |
+| `org` | LFX Organization ID — drives the rate-limit pool key (all Key Contacts in the same org share a pool). **Assumption:** Self-Serve includes this in the access token. Exact claim name and feasibility confirmed at T-015. |
+| `tier` | LFX membership tier (`silver` / `gold` / `platinum`) — drives rate-limit pool size and any future per-route tier gating. **Assumption:** Self-Serve includes this in the access token. Confirmed at T-015. |


This is a big assumption, and it is not trivial. Self Service does not create access tokens, so this is more of a need for some kind of Auth0 extensibility (like CDP roles are today). Also, projects have different tier names/structures (like Premier/Silver for ASWF): how do you want to handle that on the product side? Or is the claim only showing the organization's LF membership tier, e.g. I'm a Gold at CNCF but Silver at LF, so I get "Silver"?

emsearcy · 2026-05-18T01:46:30Z

+| Claim | Purpose |
+|---|---|
+| `iss` | LFX Self-Serve issuer URL — used to select the right JWKS and reject foreign tokens. |
+| `sub` | User ID — used for revocation reference and as the `customer_id` span attribute in APM traces. |


Hi, assuming we end up with JWTs in the final model, you might instead consider the claim http://lfx.dev/claims/username, which we often add to access tokens to carry an LFID. The ID you see with an auth0| prefix is NOT reliably an LF username. It is unique (and immutable) per user, so you can use it for internal access if you need to, but it is a violation of our LFID integration guide to attempt to extract an LFID by stripping the "auth0|" prefix.

You might consider using user.id or enduser.id for the span attribute, as both of these are semantic conventions (and are already indexed facets in Datadog!).

https://opentelemetry.io/docs/specs/semconv/registry/attributes/enduser/
https://opentelemetry.io/docs/specs/semconv/registry/attributes/user/

(I think "user" is more used in applications or webapp RUM, and "enduser" is more appropriate for APIs, but I could be mistaken)

docs: add public API architecture plan and ADRs

3e96de8

Signed-off-by: anilb <epipav@gmail.com>

epipav requested review from Copilot, joanagmaia and joanreyero May 4, 2026 16:42

Copilot started reviewing on behalf of epipav May 4, 2026 16:43

epipav requested review from emsearcy and jonathimer May 4, 2026 16:45

Copilot AI reviewed May 4, 2026

docs: fix review comments

875c179

Signed-off-by: anilb <epipav@gmail.com>

emsearcy requested changes May 7, 2026

joanagmaia reviewed May 7, 2026

docs: address auth, OTel, sort, and membership review comments

5671597

Signed-off-by: anilb <epipav@gmail.com>

Copilot AI review requested due to automatic review settings May 12, 2026 08:57

Copilot started reviewing on behalf of epipav May 12, 2026 08:58

Copilot AI reviewed May 12, 2026

epipav added 2 commits May 12, 2026 11:16

docs: clarify PAT model impact on membership check timing

da9e452

Signed-off-by: anilb <epipav@gmail.com>

docs: flag org/tier claim assumptions and fix copilot comments

98d0c45

Signed-off-by: anilb <epipav@gmail.com>

Copilot AI review requested due to automatic review settings May 12, 2026 10:24

Copilot started reviewing on behalf of epipav May 12, 2026 10:25

docs: trim open questions to rate limits and PAT model

564efe5

Signed-off-by: anilb <epipav@gmail.com>

Copilot AI reviewed May 12, 2026

docs: fix tier goals, customer_id, and lib count inconsistencies

33daec6

Signed-off-by: anilb <epipav@gmail.com>

epipav requested a review from emsearcy May 12, 2026 11:10

epipav self-assigned this May 12, 2026

emsearcy reviewed May 18, 2026
