RFC: Orthogonal Authorization

## Context

The current authorization graph is represented by `user_grants` and `role_grants`. Each row is an edge:

```text
user_grants: user_id      -> object_role
role_grants: subject_role -> object_role
```

The current `capability` column is a scalar `read/write/admin` value over that edge:

```text
read < write < admin
```

SQL/RLS/PostgREST authorization logic is largely composed of policies that compare `capability >= 'read'` or `capability >= 'admin'`. GraphQL and other control-plane API paths increasingly authorize through the Rust snapshot instead.

This proposal keeps the grant graph as-is (i.e., still uses `user_grants` and `role_grants`) and introduces a GraphQL-only capability set on each existing grant edge. The scalar `capability` column remains the representation used by SQL/RLS/PostgREST. The new capability set is read only by GraphQL authorization.

During migration, the GraphQL capability set is additive: GraphQL authorization combines capabilities derived from the scalar `capability` with explicit capabilities from `capability_set`, while SQL/RLS/PostgREST continue to ignore `capability_set`.

## Objectives

- Preserve existing authorization behavior for current SQL/RLS/PostgREST and existing GraphQL checks.
- Avoid duplicate grant tables or a second grant graph.
- Allow new GraphQL APIs to authorize billing, spec editing, grant management, and future powers independently.
- Avoid a flag day: the GraphQL authorization path can read the new column while existing callers continue to ask for the access they require.
- Preserve the current security invariant that read/write grants do not become transitive.

## Non-Objectives

- This does not replace SQL/RLS authorization in the initial migration.
- This does not make PostgREST understand the new capability set.
- This does not require backfilling every existing grant before use.

## Data Model

Add a GraphQL-only capability set to the existing grant rows.

```sql
CREATE TYPE internal.authz_capability AS ENUM (
  'catalog_read',
  'journal_append',
  'spec_edit',
  'billing',
  'manage_grants',
  'assume'
);

ALTER TABLE public.user_grants
  ADD COLUMN capability_set internal.authz_capability[] NOT NULL DEFAULT '{}';

ALTER TABLE public.role_grants
  ADD COLUMN capability_set internal.authz_capability[] NOT NULL DEFAULT '{}';
```

`capability_set` is the explicit set of orthogonal capabilities stored on the grant edge. This additive schema change allows new GraphQL APIs to authorize against explicit capabilities on rows that already have a current `capability`. For example, a user can keep current `read` access while GraphQL also authorizes `billing` from `capability_set`.

Some grants should carry authority only in `capability_set` and should have no SQL/RLS/PostgREST effect. A billing user, for example, may need `billing` without `catalog_read`, and therefore should not receive current `read`. Before writing grants of that shape, make the current `capability` nullable:

```sql
ALTER TABLE public.user_grants
  ALTER COLUMN capability DROP NOT NULL;

ALTER TABLE public.role_grants
  ALTER COLUMN capability DROP NOT NULL;
```

The existing `valid_capability` check constraints continue to allow only `read`, `write`, and `admin`. Making `capability` nullable allows a row to carry authority only in `capability_set` without adding a neutral enum value.

> **NOTE**: Do not use `x_00` as the `capability` value for a grant whose authority is carried only in `capability_set`. Some existing SQL calls `auth_roles()` with the default minimum. A row with `capability = 'x_00'` would be reachable by those callers. A row with `capability = NULL` is not reachable through `capability >= ...` comparisons.

The meanings are:

```text
capability = read/write/admin
  SQL/RLS/PostgREST capability for the row.

capability = NULL
  No SQL/RLS/PostgREST authorization effect.

capability_set = '{}'
  No GraphQL-visible capabilities beyond those derived from `capability`.

capability_set = non-empty array
  Explicit GraphQL-visible capabilities on this grant row, in addition to any
  capabilities derived from `capability`.
```

The GraphQL-visible capability set for a row is the union of the scalar `capability` projection and `capability_set`. This means new GraphQL-only capabilities can be added to existing grant rows without replacing or backfilling the current `capability`.

## PostgREST Isolation

The new column must not be exposed through PostgREST.

Today `authenticated` has broad table-level privileges on `user_grants` and `role_grants`. The migration must replace those table-level privileges with column-level privileges for the columns that PostgREST should continue to expose. The `capability_set` column should not be granted to `authenticated`.

Current RLS policies on `user_grants` and `role_grants` should also exclude rows that have no current `capability`. This prevents grants whose authority is carried only in `capability_set` from appearing in current grant-management views and prevents existing PostgREST mutation paths from updating or deleting them. Insert and update policies should also reject `capability IS NULL` rows for current PostgREST callers; otherwise a hidden GraphQL-only row could still be created through an old path and occupy the unique grant key. The service role and GraphQL server retain access to the full row.

## Capability Set Values

The new values are operation-specific capabilities, not product roles.

Initial values:

```text
catalog_read
journal_append
spec_edit
billing
manage_grants
assume
```

Checks are literal membership tests:

```text
read catalog/spec/data metadata  -> catalog_read
append through low-level paths    -> journal_append
create or modify specs            -> spec_edit
view or modify billing state      -> billing
modify grants                     -> manage_grants
traverse role_grants              -> assume
```

Product roles expand into these individual capabilities. They are not stored as roles in `capability_set`.

Example product profile expansions:

```text
Viewer  = {catalog_read}
Editor  = {catalog_read, spec_edit}
Billing = {billing}
Manager = {manage_grants, billing}
Owner   = {catalog_read, journal_append, spec_edit,
           billing, manage_grants, assume}
```

The exact profile names and expansions are product/API decisions. The storage model stores the expanded capability list.
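As a rough illustration of profile expansion, the sketch below maps the example profiles onto flat capability lists before storage. The `Profile` enum and the `expand_profile` function are hypothetical names introduced here; the expansions simply mirror the example table and are not a decided product contract.

```rust
use std::collections::BTreeSet;

#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Capability { CatalogRead, JournalAppend, SpecEdit, Billing, ManageGrants, Assume }

/// Hypothetical product profiles; names mirror the example expansions.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug)]
enum Profile { Viewer, Editor, Billing, Manager, Owner }

/// Expand a profile into the flat list that would be stored in `capability_set`.
/// The profile name itself is never stored.
fn expand_profile(profile: Profile) -> BTreeSet<Capability> {
    use Capability::*;
    let caps = match profile {
        Profile::Viewer => vec![CatalogRead],
        Profile::Editor => vec![CatalogRead, SpecEdit],
        Profile::Billing => vec![Billing],
        Profile::Manager => vec![ManageGrants, Billing],
        Profile::Owner => vec![
            CatalogRead, JournalAppend, SpecEdit, Billing, ManageGrants, Assume,
        ],
    };
    caps.into_iter().collect()
}
```

Because only the expanded set is stored, renaming or redefining a profile later does not change what existing grant rows authorize.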

`manage_grants` is an intentionally separate capability. A manager can change grants and may be able to grant additional capabilities to themselves, but the initial grant does not directly authorize catalog reads, journal appends, spec edits, etc. This distinction is useful for product presentation, audit trails, and APIs that want to check whether a user currently has direct catalog or billing authority without treating grant management as an implicit read/edit/billing capability.

Whether a grant-management API allows a manager to escalate themselves is a policy decision for that API. The important storage distinction is that `manage_grants` does not itself authorize non-grant operations.

## Canonical Explicit Capability Sets

Some capability combinations are not valid product states. For example, `spec_edit` without `catalog_read` is not a useful editor state for authoring derivations: testing a derivation normally requires `flowctl preview`, which must read source data. It also does not provide a meaningful data-access boundary, because the same user could publish a materialization that sends the data to a destination they control.

The design does not model these as read-time implications. It also does not use the scalar `capability` projection to repair an incomplete `capability_set` array. Instead, the explicit capability set must be canonical on its own.

Initial rule:

```text
spec_edit      requires catalog_read
```

These rules apply at write time. Authorization reads do not expand or imply capabilities. We should enforce these invariants at the API layer, and possibly also as a check constraint in the database. A non-canonical explicit set such as `{spec_edit}` is not inherently impossible to evaluate, but it is not a valid product state for this authorization model.
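A minimal sketch of what write-time validation at the API layer might look like, assuming the single initial rule above. The `REQUIRES` table and `validate_canonical` function are hypothetical names, not existing code.

```rust
use std::collections::BTreeSet;

#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Capability { CatalogRead, JournalAppend, SpecEdit, Billing, ManageGrants, Assume }

/// Write-time rules as (capability, prerequisite) pairs.
/// Only the initial rule from the text is listed.
const REQUIRES: &[(Capability, Capability)] =
    &[(Capability::SpecEdit, Capability::CatalogRead)];

/// Returns Ok(()) when the explicit set is canonical on its own, or the
/// first violated (capability, missing prerequisite) pair otherwise.
/// The scalar `capability` is deliberately not consulted.
fn validate_canonical(
    set: &BTreeSet<Capability>,
) -> Result<(), (Capability, Capability)> {
    for &(cap, prereq) in REQUIRES {
        if set.contains(&cap) && !set.contains(&prereq) {
            return Err((cap, prereq));
        }
    }
    Ok(())
}
```

Under this sketch `{catalog_read, spec_edit}` and `{billing}` are accepted, while `{spec_edit}` is rejected. Relaxing a rule later amounts to deleting its entry from the table.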

## Why normalize at write-time, not read-time

I wasn't able to come up with a hard correctness argument against read-time widening of capabilities: it *is* technically possible to define an authorization evaluator where `spec_edit` implies `catalog_read` at read time. On the other hand, it is certainly not required, and the arguments against it come down to the following:

### The data is not self-describing

With write-time validation, a grant row contains the complete set of explicit capabilities it authorizes. To answer "what does this grant authorize?", you only need the row itself.

With read-time implication, a grant row contains a partial set. The complete set is the stored capabilities *plus* whatever implication rules produce at evaluation time. To answer "what does this grant authorize?", you must read the row and apply the implication rules that are current for that evaluator. You cannot determine what the explicit grant authorizes by inspecting the grant alone, nor can you easily answer what it authorized at some point in the past.

### Separation of concerns

The traversal algorithm should only handle graph structure: which edges exist, how to follow them, and how to intersect capabilities. The API layer should handle product rules: which explicit capability sets are valid product states.

Read-time normalization breaks that boundary. Traversal must combine an incoming capability set with an edge capability set. If `spec_edit` implies `catalog_read` during evaluation, the traversal code must also decide when to apply that rule: to each side before intersection, to the intersection result, or only later when checking a requested operation. Those choices can be specified, but they are extra authorization semantics inside graph traversal. With write-time normalization, there is no phase choice: every edge already stores the capabilities it can carry, and traversal computes only `incoming ∩ edge`.

### Lock-in

Write-time validation is easy to relax later because it constrains what we store now without constraining what we can decide later. If we decide that `spec_edit` no longer requires `catalog_read`, we remove the write-time validation rule. Existing grants that already contain both capabilities are unaffected. New grants can omit `catalog_read` if the product allows it.

Read-time implication is hard to remove later. If `spec_edit` implies `catalog_read` at read time and we decide to stop implying it, every grant that relied on the implication to provide `catalog_read` silently loses that capability. The removal is a breaking change to every affected grant, and there is no way to know which grants were written with the expectation that the implication would supply `catalog_read` versus which ones happen to also have `catalog_read` explicitly. A migration would need to scan every grant, apply the old implication rules, and backfill the implied capabilities into the stored set before the implication can be safely removed.

## GraphQL Capability Set

GraphQL authorization reads each grant row as a set of capabilities. During the migration, that set is computed by combining the row's current scalar `capability` with the explicit `capability_set` stored on the same row.

Scalar capability projection:

```text
capability = read
  -> {catalog_read}

capability = write
  -> {catalog_read, journal_append}

capability = admin
  -> {catalog_read, journal_append, spec_edit,
      billing, manage_grants, assume}

capability = NULL
  -> {}
```

GraphQL row capabilities:

```text
graphql_capabilities(row) =
  scalar_projection(row.capability) ∪ row.capability_set
```

This is a compatibility adapter for existing rows plus an additive extension point for new GraphQL capabilities. It is not a read-time implication rule. After the row's GraphQL capability set has been computed, authorization uses that set exactly. It does not further expand `spec_edit` into `catalog_read`, or any other capability into another capability.

The union is what allows a new GraphQL API to add a capability to an existing grant row without migrating the row's scalar `capability`:

```text
capability = read
capability_set = {catalog_read, spec_edit}

SQL/RLS/PostgREST sees read.
GraphQL authorization sees catalog_read + spec_edit.
```

`catalog_read` appears in both inputs to the union in this example. The computed set de-duplicates it, while the explicit `capability_set` value remains canonical on its own.

As long as a row still has `capability = admin`, GraphQL authorization checks that read the new column will include the capabilities that `admin` maps to: `catalog_read`, `journal_append`, `spec_edit`, `billing`, `manage_grants`, and `assume`. Values in `capability_set` can add GraphQL-visible authority to the row, but they cannot remove authority that still comes from `capability`. Changing that behavior later requires either changing the row's scalar `capability` or first backfilling `capability_set` with the projected capabilities and then removing the scalar projection from GraphQL evaluation.

For a grant whose authority should be carried only in `capability_set`:

```text
capability = NULL
capability_set = {billing}

SQL/RLS/PostgREST sees no grant.
GraphQL authorization sees billing access.
```
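The row-level computation in this section can be sketched as follows. The `Scalar` enum and the function names are illustrative, not the actual snapshot types; the `admin` arm follows the six-capability list this section gives for rows with `capability = admin`.

```rust
use std::collections::BTreeSet;

#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Capability { CatalogRead, JournalAppend, SpecEdit, Billing, ManageGrants, Assume }

/// The scalar `capability` column; `None` models SQL NULL.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Scalar { Read, Write, Admin }

/// Compatibility adapter from the scalar column to a capability set.
fn scalar_projection(capability: Option<Scalar>) -> BTreeSet<Capability> {
    use Capability::*;
    let caps = match capability {
        None => vec![],
        Some(Scalar::Read) => vec![CatalogRead],
        Some(Scalar::Write) => vec![CatalogRead, JournalAppend],
        Some(Scalar::Admin) => vec![
            CatalogRead, JournalAppend, SpecEdit, Billing, ManageGrants, Assume,
        ],
    };
    caps.into_iter().collect()
}

/// graphql_capabilities(row) = scalar_projection(capability) ∪ capability_set.
/// No further implication is applied to the result.
fn graphql_capabilities(
    capability: Option<Scalar>,
    capability_set: &BTreeSet<Capability>,
) -> BTreeSet<Capability> {
    scalar_projection(capability)
        .union(capability_set)
        .copied()
        .collect()
}
```

For the two examples above, `graphql_capabilities(Some(Scalar::Read), {catalog_read, spec_edit})` yields `{catalog_read, spec_edit}`, and `graphql_capabilities(None, {billing})` yields `{billing}`.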

## Critical Path: Direct GraphQL checks first

The existing Rust authorization logic uses the existing transitivity rule: start from the user's direct `user_grants`, treat each reached grant as effective for names under its `object_role`, and continue through `role_grants` **only from reached grants whose scalar `capability` is `admin`**. Reached `read`, `write`, and `NULL` grants remain terminal results; they do not become traversal states.

The first part of this proposal "just" changes how capabilities are computed on each reached row. It computes `scalar_projection(capability) ∪ capability_set` and tests for the capability required by the caller. Existing callers can continue to ask whether the user has the access they require for a prefix; new callers can ask for orthogonal capabilities such as `billing` or `manage_grants`. A `capability = NULL` row can therefore authorize a GraphQL-only operation such as billing, while still having no SQL/RLS/PostgREST effect and no ability to carry traversal.
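A simplified sketch of this direct check, under stated assumptions: `Grant`, `row_capabilities`, and `is_authorized` are hypothetical names, the user grants are assumed to be pre-filtered to one user, and a role grant is assumed to be followed when its `subject_role` falls under a reached `admin` grant's `object_role`. The real snapshot logic may match edges differently. The `admin` projection uses the six capabilities listed for `admin` in the GraphQL Capability Set section.

```rust
use std::collections::{BTreeSet, HashSet};

#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Capability { CatalogRead, JournalAppend, SpecEdit, Billing, ManageGrants, Assume }

#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Scalar { Read, Write, Admin }

/// One grant edge. For user grants, `subject` is the user id and is not
/// consulted by the check.
struct Grant {
    subject: &'static str,
    object_role: &'static str,
    capability: Option<Scalar>,
    capability_set: Vec<Capability>,
}

fn grant(subject: &'static str, object_role: &'static str,
         capability: Option<Scalar>, capability_set: &[Capability]) -> Grant {
    Grant { subject, object_role, capability, capability_set: capability_set.to_vec() }
}

/// scalar_projection(capability) ∪ capability_set for one row.
fn row_capabilities(g: &Grant) -> BTreeSet<Capability> {
    use Capability::*;
    let projected = match g.capability {
        None => vec![],
        Some(Scalar::Read) => vec![CatalogRead],
        Some(Scalar::Write) => vec![CatalogRead, JournalAppend],
        Some(Scalar::Admin) => vec![
            CatalogRead, JournalAppend, SpecEdit, Billing, ManageGrants, Assume,
        ],
    };
    projected.into_iter().chain(g.capability_set.iter().copied()).collect()
}

/// Does the user hold `required` for `name`? Traversal continues only
/// from reached grants whose scalar capability is `admin`.
fn is_authorized(user_grants: &[Grant], role_grants: &[Grant],
                 name: &str, required: Capability) -> bool {
    let mut stack: Vec<&Grant> = user_grants.iter().collect();
    let mut seen: HashSet<usize> = HashSet::new();
    while let Some(reached) = stack.pop() {
        if name.starts_with(reached.object_role)
            && row_capabilities(reached).contains(&required) {
            return true;
        }
        if reached.capability != Some(Scalar::Admin) {
            continue; // read, write, and NULL rows are terminal results
        }
        for (i, edge) in role_grants.iter().enumerate() {
            if edge.subject.starts_with(reached.object_role) && seen.insert(i) {
                stack.push(edge);
            }
        }
    }
    false
}

/// Illustrative data: an admin grant, a billing-only grant, and three role edges.
fn example_grants() -> (Vec<Grant>, Vec<Grant>) {
    let user_grants = vec![
        grant("alice", "acme/", Some(Scalar::Admin), &[]),
        grant("alice", "billing-co/", None, &[Capability::Billing]),
    ];
    let role_grants = vec![
        grant("acme/", "shared/", Some(Scalar::Read), &[]),
        grant("shared/", "deep/", Some(Scalar::Admin), &[]),
        grant("billing-co/", "other/", Some(Scalar::Admin), &[]),
    ];
    (user_grants, role_grants)
}
```

With the example data, the user is authorized for `billing` under `billing-co/` but not for `catalog_read` there, and cannot reach `other/` because the `NULL` row never becomes a traversal state. The `read` grant reached on `shared/` is effective but terminal, so `deep/` stays unreachable.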

## Follow-On: Assume Traversal

This proposal reserves the capability `assume` for future traversal through `role_grants`. `assume` is the answer to a different problem: bounded delegation through roles without collapsing the delegated authority back into scalar `admin`. It is not needed to support direct `billing` or `manage_grants` checks.

`assume` should not mean "take all capabilities of the target role." That interpretation collapses back into the current `admin` bundle.

The proposed meaning is:

> `assume` permits the current bounded capability set to be carried through `role_grants`.

Traversal rule:

```text
if incoming capabilities do not contain assume:
  do not traverse role_grants from this state

next capabilities = incoming capabilities ∩ edge capabilities
```

Examples:

```text
user -> contractor/:  {catalog_read, assume}
contractor/ -> acme/: {catalog_read, spec_edit}

effective on acme/:   {catalog_read}
```

The user can act through `contractor/`, but only with the width already present on the incoming grant.

```text
user -> contractor/:  {catalog_read, spec_edit, assume}
contractor/ -> acme/: {catalog_read, spec_edit}

effective on acme/:   {catalog_read, spec_edit}
```

An editor-shaped assumption can carry editor-shaped authority.

```text
user -> acme/:        current admin projection, including assume
acme/ -> shared/:     {catalog_read}

effective on shared/: {catalog_read}
traversal stops because assume was not on the shared edge
```

This matches the current behavior where traversal continues only through `admin` edges but terminal read/write grants can still be effective results.
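The traversal rule can be sketched as a single step function. `next_state` is a hypothetical name; it returns `None` when the incoming state may not traverse, and otherwise the intersection carried to the next prefix.

```rust
use std::collections::BTreeSet;

#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Capability { CatalogRead, JournalAppend, SpecEdit, Billing, ManageGrants, Assume }

/// One traversal step across a role grant edge.
/// - Without `assume` on the incoming state, the edge is not followed.
/// - Otherwise the next state is `incoming ∩ edge`.
fn next_state(
    incoming: &BTreeSet<Capability>,
    edge: &BTreeSet<Capability>,
) -> Option<BTreeSet<Capability>> {
    if !incoming.contains(&Capability::Assume) {
        return None;
    }
    Some(incoming.intersection(edge).copied().collect())
}
```

For the first example, `incoming = {catalog_read, assume}` and `edge = {catalog_read, spec_edit}` give `Some({catalog_read})`. Feeding that result back in as `incoming` returns `None` for any edge, since `assume` did not survive the intersection.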

## Traversal Is Per Capability State

Once `assume` traversal exists, the traversal state must include both the prefix and the capabilities carried to that prefix:

```text
(prefix, capabilities)
```

The evaluator must not union capabilities from multiple paths to the same prefix before deciding whether traversal can continue.

Otherwise two independent paths could combine depth and width:

```text
path A to prefix: {assume}
path B to prefix: {spec_edit}
```

If these are unioned before traversal, the algorithm creates `{assume, spec_edit}` even though no single path granted that combination.

After traversal, effective capabilities for answering "what can the user do at this prefix?" can be unioned across all reached states. During traversal, a state may be pruned only when another state at the same prefix has a superset of its capabilities.
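A sketch of per-state traversal with superset pruning, under simplifying assumptions: `Edge`, `traverse`, and `effective` are hypothetical names, and role edges are matched by exact equality on the subject prefix, which is a simplification of real prefix matching. The example function reproduces the two-path situation described above.

```rust
use std::collections::{BTreeMap, BTreeSet};

#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Capability { CatalogRead, JournalAppend, SpecEdit, Billing, ManageGrants, Assume }

type Caps = BTreeSet<Capability>;

/// A role grant edge; `subject` is matched by equality (simplification).
struct Edge { subject: &'static str, object: &'static str, caps: Caps }

/// Explore (prefix, capabilities) states without merging paths.
/// Returns the surviving states per prefix.
fn traverse(
    start: Vec<(&'static str, Caps)>,
    edges: &[Edge],
) -> BTreeMap<&'static str, Vec<Caps>> {
    let mut reached: BTreeMap<&'static str, Vec<Caps>> = BTreeMap::new();
    let mut queue = start;
    while let Some((prefix, caps)) = queue.pop() {
        let states = reached.entry(prefix).or_default();
        // Prune only when an existing state at this prefix already covers the new one.
        if states.iter().any(|s| s.is_superset(&caps)) {
            continue;
        }
        // Drop older states that the new state covers.
        states.retain(|s| !caps.is_superset(s));
        states.push(caps.clone());
        if !caps.contains(&Capability::Assume) {
            continue; // terminal: effective here, but not a traversal state
        }
        for edge in edges.iter().filter(|e| e.subject == prefix) {
            let next: Caps = caps.intersection(&edge.caps).copied().collect();
            queue.push((edge.object, next));
        }
    }
    reached
}

/// Union across states, used only after traversal has finished.
fn effective(states: &[Caps]) -> Caps {
    states.iter().flatten().copied().collect()
}

/// Two paths reach `p/`: one carrying {assume}, one carrying {spec_edit}.
fn multi_path_example() -> BTreeMap<&'static str, Vec<Caps>> {
    use Capability::*;
    let edges = [
        Edge { subject: "q/", object: "p/", caps: Caps::from([SpecEdit]) },
        Edge { subject: "p/", object: "target/", caps: Caps::from([SpecEdit, Assume]) },
    ];
    let start = vec![
        ("p/", Caps::from([Assume])),
        ("q/", Caps::from([SpecEdit, Assume])),
    ];
    traverse(start, &edges)
}
```

In the example, `effective` at `p/` is `{spec_edit, assume}` once traversal is done, but `target/` only receives `{assume}`: neither path to `p/` carried `spec_edit` together with `assume`, so the combination is never formed during traversal.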

## Role Edges Narrow Capability Width

When `assume` traversal follows a role edge, the next state carries only the capabilities present on both the incoming state and the edge:

```text
incoming: {catalog_read, assume}
edge:     {catalog_read, spec_edit, billing}
next:     {catalog_read}
```

This rule prevents a role edge from widening authority that was not already present on the incoming path. Future capability additions to a role edge should not automatically affect every existing assumer of that edge. `assume` controls depth; the remaining capabilities control width.

## Migration Plan

### Phase 1: Database migration

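- Add the `capability_set` column to `user_grants` and `role_grants`.
- Replace the table-level privileges of `authenticated` on both tables with column-level privileges that omit `capability_set`, keeping the column invisible to PostgREST.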

No existing authorization behavior changes in this phase.

### Phase 2: Direct GraphQL Authorization

- Extend snapshot loading to include `capability_set`.
- Update GraphQL authorization to compute `scalar_projection(capability) ∪ capability_set`.
- Preserve existing GraphQL check behavior for rows with empty `capability_set`.
- Add parity tests showing rows with empty `capability_set` produce the GraphQL capabilities expected from their scalar `capability`.
- Make `capability` nullable on both grant tables.
- Write GraphQL-only grants, such as billing-only grants, as `capability = NULL` and `capability_set = {billing}`.
- Keep existing billing/PostgREST surfaces on current `admin` until migrated or removed.

After this phase, product/API work can migrate independently onto the new authorization path. Billing APIs can ask for `billing`, publication APIs can ask for `spec_edit`, and grant-management APIs can ask for `manage_grants` and write normalized `capability_set` arrays. Those are consumers of the authorization model, not separate authorization migration seams.

### Phase 3: GraphQL-Owned Traversal

- Add `assume` traversal with per-state intersection semantics when GraphQL is ready to own grant traversal end-to-end, or earlier if another API needs bounded role delegation.
- The scalar projection of `admin` includes `assume`.
- The scalar projections of `read` and `write` do not include `assume`.
- Add graph tests covering current admin-chain behavior, read/write terminal behavior, multi-path non-composition, and edge narrowing.

