Post-extraction trust scoring #14

## Observation

`Proposition.confidence` reflects the LLM's self-assessed certainty during extraction. But confidence alone is insufficient for trust decisions:

- A high-confidence extraction from an untrusted source may need skepticism
- A high-confidence proposition that contradicts well-established knowledge may need review
- A proposition corroborated by multiple independent sources (`grounding`, `sourceIds`) is more trustworthy than one from a single source

DICE extracts and stores propositions with `confidence`, `importance`, `grounding`, `sourceIds`, and `metadata`, but nothing between extraction and persistence evaluates whether a proposition should be trusted. Any post-extraction quality gating is left entirely to the consumer.

## What DICE already has

- **`confidence: ZeroToOne`** on `Proposition` — LLM self-assessment, set at extraction time
- **`grounding: List<String>`** — chunk IDs supporting the proposition (corroboration signal)
- **`sourceIds: List<String>`** — for abstracted propositions, the IDs of source propositions
- **`metadata: Map<String, Any>`** — could carry source authority signals
- **`PropositionPipeline`** — extract → revise → persist, but no trust gate between revise and persist
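
To make "gating is on the consumer" concrete, here is a minimal sketch of the kind of filter a consumer has to write today using only the fields listed above. The `Proposition` class is a simplified stand-in rather than DICE's actual class (the `text` field and the plain `Double` confidence are assumptions), and `passesConsumerGate` and its default thresholds are hypothetical.

```kotlin
// Simplified stand-in for DICE's Proposition, limited to the fields discussed here.
// Assumption: the real class has more members and types `confidence` as ZeroToOne.
data class Proposition(
    val text: String,                              // hypothetical field, for readability
    val confidence: Double,                        // LLM self-assessment
    val grounding: List<String> = emptyList(),     // supporting chunk IDs
    val sourceIds: List<String> = emptyList(),     // source proposition IDs
    val metadata: Map<String, Any> = emptyMap(),
)

// Hypothetical consumer-side gate: keep a proposition only if the extractor was
// reasonably confident AND at least `minChunks` distinct chunks support it.
fun passesConsumerGate(
    p: Proposition,
    minConfidence: Double = 0.7,
    minChunks: Int = 2,
): Boolean =
    p.confidence >= minConfidence && p.grounding.distinct().size >= minChunks
```

Every consumer that wants this behaviour currently has to reinvent it, which is the gap the options below try to close.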

## The question

Should DICE have a trust evaluation layer between extraction and persistence?

Some possibilities:

1. **Metadata convention** — extraction prompts populate `metadata["trustSignals"]` with source/corroboration hints. Consumers interpret them. No code changes.

2. **TrustSignal SPI** — a `fun interface TrustSignal { fun evaluate(proposition: Proposition): Double }` that scores propositions on a 0-1 scale. Built-in signals could consume fields DICE already produces:
   - Extraction confidence passthrough
   - Corroboration count (how many distinct chunk IDs appear in `grounding`)
   - Source authority (from `metadata`, if provenance tracking (#17) is adopted)

3. **TrustEvaluator in the pipeline** — `PropositionPipeline` accepts an optional evaluator that scores propositions before persistence. Propositions below a threshold are dropped or routed to a review status.

4. **Trust as a Proposition field** — add a `trustScore: Double?` field alongside `confidence`. Confidence answers "how certain is the extraction?"; trust answers "how much should we believe it given external signals?" The two dimensions are orthogonal.
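
Options 2 and 3 could fit together roughly as follows. This is a sketch under assumptions, not a proposed final API: `Proposition` is again a simplified stand-in, the `"sourceAuthority"` metadata key is hypothetical (DICE does not define it), and the weighted average in `TrustEvaluator` is just one possible way to combine signals.

```kotlin
// Simplified stand-in for DICE's Proposition (assumption, not the real class).
data class Proposition(
    val text: String,
    val confidence: Double,
    val grounding: List<String> = emptyList(),
    val metadata: Map<String, Any> = emptyMap(),
)

// Option 2: the SPI as worded above. Each signal returns a score in 0..1.
fun interface TrustSignal {
    fun evaluate(proposition: Proposition): Double
}

// Extraction confidence passthrough.
val confidenceSignal = TrustSignal { it.confidence }

// Corroboration: number of distinct grounding chunks, saturating at `saturation`.
fun corroborationSignal(saturation: Int = 3) = TrustSignal { p ->
    minOf(p.grounding.distinct().size, saturation).toDouble() / saturation
}

// Source authority read from a HYPOTHETICAL metadata key; neutral 0.5 when absent.
val sourceAuthoritySignal = TrustSignal { p ->
    (p.metadata["sourceAuthority"] as? Number)?.toDouble()?.coerceIn(0.0, 1.0) ?: 0.5
}

// Option 3: combine weighted signals and route by threshold.
enum class Verdict { ACCEPT, REVIEW }

class TrustEvaluator(
    private val signals: List<Pair<TrustSignal, Double>>, // signal paired with its weight
    private val threshold: Double,
) {
    init {
        require(signals.sumOf { it.second } > 0.0) { "total weight must be positive" }
    }

    // Weighted average of all signal scores.
    fun score(p: Proposition): Double {
        val totalWeight = signals.sumOf { it.second }
        return signals.sumOf { (signal, weight) -> signal.evaluate(p) * weight } / totalWeight
    }

    fun verdict(p: Proposition): Verdict =
        if (score(p) >= threshold) Verdict.ACCEPT else Verdict.REVIEW
}
```

A pipeline hook would then call `verdict` between revise and persist. Whether `REVIEW` means "drop" or "persist with a review status" is left open, as in option 3.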

## Where trust scoring would matter

| Signal | Source | Cost |
|--------|--------|------|
| Extraction confidence | `proposition.confidence` | Free (already extracted) |
| Corroboration | `grounding.size`, `sourceIds.size` | Free (already tracked) |
| Source authority | `metadata` or provenance (#17) | Free if provenance exists |
| Semantic consistency | LLM call against existing propositions | Expensive |
| External verification | Domain-specific API/check | Variable |

The cheap signals (confidence, corroboration, source authority) could run first and filter out low-quality extractions, so that the expensive checks only run on the propositions that remain.
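
The cheap-before-expensive ordering can be sketched as a two-stage filter. All names here (`Check`, `gate`) are hypothetical, and `Proposition` is a simplified stand-in for the real class. The expensive stage would wrap something like an LLM consistency call or a domain-specific verifier.

```kotlin
// Simplified stand-in for DICE's Proposition (assumption, not the real class).
data class Proposition(
    val text: String,
    val confidence: Double,
    val grounding: List<String> = emptyList(),
)

// A check returns true when the proposition passes it.
fun interface Check {
    fun passes(p: Proposition): Boolean
}

// Run every cheap check first; expensive checks are invoked only for
// propositions that survived the cheap stage.
fun gate(
    propositions: List<Proposition>,
    cheap: List<Check>,
    expensive: List<Check>,
): List<Proposition> =
    propositions
        .filter { p -> cheap.all { it.passes(p) } }
        .filter { p -> expensive.all { it.passes(p) } }
```

Because the first `filter` completes before the second starts, the number of expensive calls is bounded by the number of propositions that pass the free signals.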
