Merged
70 commits
05bd89e
chore(schema): install zod
YASSERRMD Jun 7, 2026
202bc05
feat(schema): add invoice v1 zod schema with field confidence wrapper
YASSERRMD Jun 7, 2026
c879a02
feat(schema): add raw json contract and invoice mapper
YASSERRMD Jun 7, 2026
09a999f
test(schema): cover schema parse, mapper null handling, and uncertain…
YASSERRMD Jun 7, 2026
c75c134
feat(pass2): add versioned extraction prompt with embedded schema
YASSERRMD Jun 7, 2026
cd255ac
feat(pass2): add json repair ladder with fence strip, brace extractio…
YASSERRMD Jun 7, 2026
dfde7ff
test(pass2): cover repair ladder on malformed fixtures
YASSERRMD Jun 7, 2026
b46853f
feat(pass2): add pass2 runner with single corrective re-prompt
YASSERRMD Jun 7, 2026
4458ac0
test(pass2): cover corrective re-prompt and terminal failure
YASSERRMD Jun 7, 2026
988ae3d
docs(schema): document field confidence semantics and repair ladder
YASSERRMD Jun 7, 2026
8bfff4a
feat(rules): add rule engine with finding aggregation and field status
YASSERRMD Jun 7, 2026
72c2e94
feat(rules): add minor-unit money helpers with tolerance
YASSERRMD Jun 7, 2026
33ab4c0
feat(rules): add line, subtotal, total, and tax-consistency arithmeti…
YASSERRMD Jun 7, 2026
4211f8d
feat(rules): add uae trn, vat default, and currency whitelist rules
YASSERRMD Jun 7, 2026
b63d191
feat(rules): add date, non-negative, empty-items, and uncertain-path …
YASSERRMD Jun 7, 2026
466ba20
feat(rules): add review state folding with verdict
YASSERRMD Jun 7, 2026
5daa98d
feat(rules): add rule registry and validateInvoice entry point
YASSERRMD Jun 7, 2026
a6b6946
test(rules): cover all rules, engine aggregation, and review verdict …
YASSERRMD Jun 7, 2026
6351941
docs(rules): document rule catalog with ids and severities
YASSERRMD Jun 7, 2026
e07734b
refactor(ui): extract shared design tokens for review surfaces
YASSERRMD Jun 7, 2026
2254d9e
feat(review): add pure review state with live revalidation, audit tra…
YASSERRMD Jun 7, 2026
cb516e3
feat(review): add useReview reducer hook
YASSERRMD Jun 7, 2026
664187f
feat(review): add field component with status dot, findings, and inli…
YASSERRMD Jun 7, 2026
a8bf188
feat(review): add review form with grouped fields, editable line item…
YASSERRMD Jun 7, 2026
7fad626
test(review): cover field status, live revalidation, approval gating,…
YASSERRMD Jun 7, 2026
d6cd9d4
feat(review): add image viewer with zoom and pan controls
YASSERRMD Jun 7, 2026
cc7a43b
feat(review): add split review screen with collapsible transcript drawer
YASSERRMD Jun 7, 2026
020cc5a
chore(export): add fflate for zip bundling
YASSERRMD Jun 7, 2026
76853a5
feat(export): add canonical json export with provenance envelope and …
YASSERRMD Jun 7, 2026
62b9a02
test(export): cover serializer stability and schema round-trip
YASSERRMD Jun 7, 2026
69a4c05
feat(export): add header and line-item csv writers with rfc4180 quoti…
YASSERRMD Jun 7, 2026
4125985
test(export): cover rfc4180 quoting, bom, and arabic text
YASSERRMD Jun 7, 2026
331cb7d
feat(export): add filename convention with arabic-safe slugs and coll…
YASSERRMD Jun 7, 2026
9a71c5a
test(export): cover slug fallback and collision suffixes
YASSERRMD Jun 7, 2026
b70f066
feat(export): add clipboard json and tsv copy
YASSERRMD Jun 7, 2026
e222492
feat(export): add batch zip export gated on approval
YASSERRMD Jun 7, 2026
3e8a2d0
test(export): cover approval gating and zip bundling
YASSERRMD Jun 7, 2026
3cf366f
chore(store): add dexie, dexie-react-hooks, and fake-indexeddb
YASSERRMD Jun 7, 2026
f73c424
feat(store): add dexie schema v1 with document, page, and edit reposi…
YASSERRMD Jun 7, 2026
ef01e1f
feat(store): add content-hash dedup on upload
YASSERRMD Jun 7, 2026
3b48596
feat(queue): add sequential stage machine with reload/retry resume
YASSERRMD Jun 7, 2026
d7b616c
test(queue): cover stage transitions, failure, and resume
YASSERRMD Jun 7, 2026
00bcaef
test(store): cover repository crud, dedup, and cascade delete
YASSERRMD Jun 7, 2026
57c17c7
feat(store): add storage budget estimate with warning threshold
YASSERRMD Jun 7, 2026
59aa0cf
feat(store): add livequery hooks for documents and queue
YASSERRMD Jun 7, 2026
63de816
feat(ui): add document list with status chips, retry, and confirmed d…
YASSERRMD Jun 7, 2026
08e654b
docs(store): document schema versioning and migration policy
YASSERRMD Jun 7, 2026
9a54983
feat(perf): add startup benchmark and per-page time estimates
YASSERRMD Jun 7, 2026
b995d0e
feat(perf): add webgpu device-lost recovery ladder with wasm demotion
YASSERRMD Jun 7, 2026
8f18324
feat(worker): apply device-lost recovery on inference failure
YASSERRMD Jun 7, 2026
485d02f
feat(perf): add user-confirmable token budget auto-tune
YASSERRMD Jun 7, 2026
b5b74ce
feat(ui): add stats drawer with stage timings
YASSERRMD Jun 7, 2026
213a77f
fix(queue): mark in-flight item resumable on tab close
YASSERRMD Jun 7, 2026
69f0664
docs(perf): document recovery ladder and tuning knobs
YASSERRMD Jun 7, 2026
eec3331
feat(pwa): add manifest and app shell service worker, excluding model…
YASSERRMD Jun 7, 2026
8338dfc
feat(ui): surface version, model, and prompt versions in footer
YASSERRMD Jun 7, 2026
b2750ba
feat(ui): add global error boundary with copyable diagnostics
YASSERRMD Jun 7, 2026
ffc749c
feat(ui): add first-run welcome with demo loader and failure cards
YASSERRMD Jun 7, 2026
b037dc1
feat(demo): add clean, bilingual, multi-page, and broken-math demo in…
YASSERRMD Jun 7, 2026
a1e4b7f
test(a11y): add axe checks and version/error-boundary tests
YASSERRMD Jun 7, 2026
e54fc41
chore(ci): add pr pipeline for typecheck lint test build
YASSERRMD Jun 7, 2026
70acd5a
chore(ci): add release workflow with dist artifact
YASSERRMD Jun 7, 2026
7b14351
docs: finalize readme with architecture svg, hardware matrix, and pri…
YASSERRMD Jun 7, 2026
79fa46f
Merge pull request #7 from YASSERRMD/phase-07-pass2-extraction
YASSERRMD Jun 7, 2026
ea18d91
Merge pull request #8 from YASSERRMD/phase-08-validation
YASSERRMD Jun 7, 2026
d7a5234
Merge pull request #9 from YASSERRMD/phase-09-review-ui
YASSERRMD Jun 7, 2026
8c9388e
Merge pull request #10 from YASSERRMD/phase-10-export
YASSERRMD Jun 7, 2026
e9f40d4
Merge pull request #11 from YASSERRMD/phase-11-batch-persistence
YASSERRMD Jun 7, 2026
8d86dd9
Merge pull request #12 from YASSERRMD/phase-12-perf-hardening
YASSERRMD Jun 7, 2026
9531cd9
Merge pull request #13 from YASSERRMD/phase-13-release
YASSERRMD Jun 7, 2026
21 changes: 21 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,21 @@
name: CI

on:
  pull_request:
  push:
    branches: [main]

jobs:
  verify:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 22
          cache: npm
      - run: npm ci
      - run: npm run typecheck
      - run: npm run lint
      - run: npm run test
      - run: npm run build
31 changes: 31 additions & 0 deletions .github/workflows/release.yml
@@ -0,0 +1,31 @@
name: Release

on:
  push:
    tags: ['v*']

permissions:
  contents: write

jobs:
  release:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 22
          cache: npm
      - run: npm ci
      - run: npm run build
      - name: Bundle dist
        run: tar -czf faturlens-dist-${{ github.ref_name }}.tar.gz -C dist .
      - uses: actions/upload-artifact@v4
        with:
          name: faturlens-dist
          path: faturlens-dist-${{ github.ref_name }}.tar.gz
      - name: Create GitHub release
        uses: softprops/action-gh-release@v2
        with:
          files: faturlens-dist-${{ github.ref_name }}.tar.gz
          generate_release_notes: true
71 changes: 45 additions & 26 deletions README.md
@@ -1,40 +1,47 @@
# Faturlens

[![License: Apache 2.0](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](./LICENSE)
[![CI](https://github.com/YASSERRMD/Faturlens/actions/workflows/ci.yml/badge.svg)](https://github.com/YASSERRMD/Faturlens/actions/workflows/ci.yml)

Browser-native invoice OCR and structured extraction. A transformer
vision-language model (`LiquidAI/LFM2.5-VL-1.6B-ONNX`) runs **fully client-side**
via WebGPU, with a WASM CPU fallback. No server, no API, no data leaves the
machine — the app is offline-first after the one-time model download.
machine — the app is offline-first (installable PWA) after the one-time model
download.

## Architecture

```
┌──────────────────────────────────────────────────────────────────┐
│ Main thread (UI)                                                 │
│ capability ─► model gate ─► ingest ─► review ─► export           │
└───────────────┬───────────────────────────────────┬──────────────┘
                │ ImageBitmaps / prompts            │ tokens / stats
                ▼                                   ▲
┌──────────────────────────────────────────────────────────────────┐
│ Inference Web Worker                                             │
│ Cache API bytes ─► ORT sessions (WebGPU | WASM)                  │
│ Pass 1: full markdown transcription                              │
│ Pass 2: schema-constrained JSON extraction                       │
└──────────────────────────────────────────────────────────────────┘
                │ extraction
Deterministic validation layer (pure TS, zero ML) ─► human review
```
![Architecture](docs/architecture.svg)

A two-pass pipeline runs entirely in a dedicated Web Worker: **Pass 1**
transcribes the whole invoice to Markdown; **Pass 2** extracts schema-constrained
JSON. A deterministic, zero-ML validation layer then gates every result and
routes anything suspect to human review.
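
The hand-off from Pass 2 to validation can be pictured as a small repair ladder that tries progressively more aggressive clean-ups until the model output parses as JSON. The sketch below follows the fence-strip and brace-extraction rungs named in this PR's commit titles; the function names, the rung order, and the result type are assumptions for illustration, not the project's actual code.

```typescript
// Hypothetical sketch: names, rung order, and result shape are assumptions.
type RepairResult = { ok: true; value: unknown } | { ok: false };
type Rung = (raw: string) => string;

// Built at runtime so the snippet does not contain a literal Markdown fence.
const FENCE = "`".repeat(3);
const FENCED_BLOCK = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE, "i");

// Rung 0: try the raw text unchanged.
const identity: Rung = (raw) => raw;

// Rung 1: keep only the body of a Markdown code fence, if one is present.
const stripFence: Rung = (raw) => {
  const match = FENCED_BLOCK.exec(raw);
  return match?.[1]?.trim() ?? raw.trim();
};

// Rung 2: keep only the text between the first "{" and the last "}".
const extractBraces: Rung = (raw) => {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  return start >= 0 && end > start ? raw.slice(start, end + 1) : raw;
};

// Apply the rungs cumulatively and return the first candidate that parses.
function repairJson(raw: string): RepairResult {
  let candidate = raw;
  for (const rung of [identity, stripFence, extractBraces]) {
    candidate = rung(candidate);
    try {
      return { ok: true, value: JSON.parse(candidate) };
    } catch {
      // Not valid JSON yet; fall through to the next rung.
    }
  }
  return { ok: false };
}
```

A caller would treat `{ ok: false }` as the point where a corrective re-prompt or a terminal failure takes over; a successful parse still has to pass schema validation afterwards.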

> Diagram is a placeholder; a committed SVG lands in Phase 13.
## Feature overview

- Drag-in PNG / JPEG / WebP / PDF (≤25MB, ≤20 PDF pages), EXIF-corrected, tiled.
- Client-side VLM inference (WebGPU, WASM fallback) in an isolated worker.
- Confidence-annotated extraction with a deterministic validation layer (TRN/VAT,
arithmetic, dates, currency).
- Split review UI: zoomable page image, live re-validation on edit, approval
gating, edit audit trail.
- Export to canonical JSON (with provenance), CSV (header + line items), clipboard
TSV, or a batch zip — gated on approval.
- Batch queue + IndexedDB persistence; sessions restore with no network and no
reprocessing.
- Installable, offline-first PWA.
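
To make the arithmetic part of the validation layer concrete, here is a minimal sketch of a line-total and subtotal check done in integer minor units, so that floating-point drift cannot produce false findings. The `LineItem` shape, the two-digit minor unit, and the tolerance of one minor unit are all assumptions chosen for illustration; the project's own rule ids, field names, and tolerance are not shown in this diff.

```typescript
// Hypothetical sketch: field names, digit count, and tolerance are assumptions.
interface LineItem {
  quantity: number;
  unitPrice: string; // decimal string, e.g. "19.99"
  lineTotal: string; // decimal string, e.g. "59.97"
}

// Convert a decimal string to an integer count of minor units (e.g. fils, cents).
function toMinorUnits(amount: string, digits = 2): number {
  const match = /^(-?)(\d+)(?:\.(\d+))?$/.exec(amount.trim());
  if (!match) throw new Error(`not a decimal amount: ${amount}`);
  const sign = match[1] === "-" ? -1 : 1;
  const whole = match[2] ?? "0";
  const fraction = (match[3] ?? "").padEnd(digits, "0");
  if (fraction.length > digits) throw new Error(`too many decimals: ${amount}`);
  return sign * Number(whole + fraction);
}

// quantity x unit price should equal the stated line total, within tolerance.
function lineTotalMatches(item: LineItem, toleranceMinor = 1): boolean {
  const expected = Math.round(item.quantity * toMinorUnits(item.unitPrice));
  return Math.abs(expected - toMinorUnits(item.lineTotal)) <= toleranceMinor;
}

// The stated subtotal should equal the sum of line totals, within tolerance.
function subtotalMatches(items: LineItem[], subtotal: string, toleranceMinor = 1): boolean {
  const sum = items.reduce((acc, item) => acc + toMinorUnits(item.lineTotal), 0);
  return Math.abs(sum - toMinorUnits(subtotal)) <= toleranceMinor;
}
```

Working in integers means `0.10 + 0.20` compares equal to `0.30`, which is not guaranteed with plain floating-point addition.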

## Hardware targets

| Path | Hardware | Notes |
| -------- | --------------------------------------------------- | ----------------------------- |
| Primary | 16GB RAM laptops, integrated GPU (Iris Xe / Radeon) | WebGPU execution provider |
| Fallback | CPU-only browsers | WASM EP, reduced token budget |
| Path | Hardware | Throughput (measure on your machine) |
| -------- | --------------------------------------------------- | ------------------------------------ |
| Primary | 16GB RAM laptops, integrated GPU (Iris Xe / Radeon) | WebGPU EP — seconds per page |
| Fallback | CPU-only browsers | WASM EP — minutes per page |

> Throughput is reported live in the app's stats drawer; the warmup benchmark
> shows an upfront per-page estimate before processing. (Populate this table with
> your reference numbers from a real run.)

The browser tab memory ceiling is treated as a hard 4GB budget; the app aborts
gracefully beyond it.
@@ -43,21 +50,33 @@ gracefully beyond it.

Zero network calls after the model download, except explicit Hugging Face CDN
fetches during caching. No telemetry, no analytics, no external fonts, no
runtime CDN scripts.
runtime CDN scripts. The service worker never intercepts HF CDN requests.
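
One way a service worker can stay out of the way of model downloads is to decide, per request, whether to handle it at all. The sketch below is an assumption-laden illustration: the host list, the helper names, and the cache-first wiring in the trailing comment are hypothetical and are not taken from the project's service worker.

```typescript
// Hypothetical sketch: the host suffixes below are an assumed list.
const MODEL_HOST_SUFFIXES = ["huggingface.co", "hf.co"];

// True when the hostname is one of the model hosts or a subdomain of one.
function isModelHost(hostname: string): boolean {
  return MODEL_HOST_SUFFIXES.some(
    (suffix) => hostname === suffix || hostname.endsWith("." + suffix),
  );
}

// Only same-origin app-shell requests are handled; model hosts are never touched.
function shouldIntercept(requestUrl: string, appOrigin: string): boolean {
  const url = new URL(requestUrl);
  if (isModelHost(url.hostname)) return false;
  return url.origin === appOrigin;
}

// Wiring inside the service worker (shown as a comment, not executed here):
//
// self.addEventListener("fetch", (event) => {
//   // Returning without calling respondWith() lets the browser handle the request.
//   if (!shouldIntercept(event.request.url, self.location.origin)) return;
//   event.respondWith(
//     caches.match(event.request).then((hit) => hit ?? fetch(event.request)),
//   );
// });
```

Keeping the decision in a pure function makes the exclusion easy to unit-test without a browser.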

## Development

```bash
npm ci # install
npm run dev # start the dev server
npm ci
npm run dev # dev server
npm run typecheck
npm run lint
npm run test
npm run build
npm run ci # all of the above
```

Requires Node 22 (see `.nvmrc`).

### Dev/diagnostic flags (query params)

- `?diag` — device capability report card.
- `?harness` — inference dev harness (prompt + image, streaming output, stats).
- `?model` — force the model download gate.
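
A minimal sketch of how such presence-only query flags could be read, using the standard `URLSearchParams` API. The flag names come from the list above; the `DevFlag` type and `readDevFlags` helper are hypothetical, and how the app resolves several flags at once is not specified here.

```typescript
// Hypothetical helper: the flag names are from the README, the code is illustrative.
type DevFlag = "diag" | "harness" | "model";
const DEV_FLAGS: readonly DevFlag[] = ["diag", "harness", "model"];

// Return the set of dev flags present in a query string such as "?diag" or "?harness&model".
function readDevFlags(search: string): Set<DevFlag> {
  const params = new URLSearchParams(search);
  return new Set(DEV_FLAGS.filter((flag) => params.has(flag)));
}

// In the browser this would typically be called as:
//   const flags = readDevFlags(window.location.search);
```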

## Releasing

CI runs typecheck/lint/test/build on every PR. Pushing a `v*` tag builds and
attaches the `dist` bundle to a GitHub release (`.github/workflows/release.yml`).

## License

[Apache-2.0](./LICENSE)
58 changes: 58 additions & 0 deletions docs/architecture.svg