A web interface for the Subscript handwritten text recognition (HTR) pipeline. Upload images of handwritten manuscripts, run AI-powered transcription, correct results in a built-in page editor, and export searchable PDFs — all from the browser.
Subscript App wraps the Subscript Python module in a full-stack web application. The core pipeline uses Kraken for layout segmentation, LLMs (Google Gemini, OpenAI, Anthropic) for handwriting transcription, and ReportLab for searchable PDF generation. This application adds a web-based document management layer, an integrated PAGE XML editor for correcting transcription output, and asynchronous job processing so long-running transcriptions don't block the interface.
The application runs as a set of Docker containers:
| Service | Description | Port |
|---|---|---|
| frontend | Web UI | 8080 (configurable) |
| backend | FastAPI REST API | 8001 |
| worker | Celery task worker for async transcription jobs | — |
| page-editor | PHP-based PAGE XML editor (nw-page-editor) | 8002 |
| redis | Message broker for Celery | 6379 |
Data is persisted in a SQLite database (Docker volume) and a local documents/ directory for manuscript images and output files.
- Backend: Python, FastAPI, Celery, SQLite
- Frontend: JavaScript, CSS, HTML
- Page Editor: PHP (nw-page-editor)
- HTR Engine: Subscript (Kraken + LLM transcription)
- Infrastructure: Docker Compose, Redis
- Authentication: LDAP/LDAPS support, local admin accounts
Login screen with LDAP and guest access tabs, set against a manuscript background.
The main dashboard listing transcribed documents with thumbnail previews, status indicators, and actions for editing, sharing, and downloading files.
Drag-and-drop upload interface for adding new manuscript images (JPG/PNG). Multiple images can optionally be merged into a single PDF.
Configuration panel for tuning the transcription pipeline — select the LLM model and temperature, customize the system prompt, adjust image preprocessing (resize, contrast, binarize/invert), and choose the layout segmentation model.
Side-by-side editor showing the original manuscript image with segmented line regions (left) and the corresponding transcription text (right). Lines can be selected and corrected individually, then saved back to update the PDF.
A searchable PDF generated by the pipeline. The original manuscript image is preserved while an invisible text layer enables full-text search — here, a search for "panama" highlights the match directly on the handwritten page.
- Docker and Docker Compose
- API key(s) for at least one supported LLM provider: Google Gemini, OpenAI, or Anthropic

1. Clone the repository (including the `subscript` submodule):

       git clone --recurse-submodules https://github.com/eluhrs/subscript-app.git
       cd subscript-app

   If you already cloned without `--recurse-submodules`:

       git submodule update --init --recursive

2. Create your environment file:

       cp example.env .env

   Then edit `.env` and fill in your values:

       # REQUIRED. Generate with: python3 -c "import secrets; print(secrets.token_hex(32))"
       SECRET_KEY=your_generated_secret_key

       # At least one API key is required
       GEMINI_API_KEY=your_key_here
       OPENAI_API_KEY=your_key_here
       ANTHROPIC_API_KEY=your_key_here

       # Frontend port (default: 8080)
       APP_PORT=8080

       # LDAP authentication (optional)
       LDAP_ENABLED=true
       LDAP_SERVER_URL=ldaps://your.ldap.server:636
       LDAP_USER_DN_TEMPLATE=uid={username},dc=example,dc=com

3. Initialize the database file (Docker requires it to exist before mounting):

       touch subscript.db

4. Create the documents directory:

       mkdir documents

5. Build and start the application:

       docker compose up -d --build

6. Create a local admin account (if not using LDAP):

       ./create_local_admin.sh

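
Before starting the containers, it can help to sanity-check the `.env` file. The following is a minimal sketch, not part of the repository: the function names `parse_env` and `check_env` are hypothetical, the parser only handles plain `KEY=value` lines, and the rule that the LDAP server URL and DN template must be set when `LDAP_ENABLED=true` is an assumption inferred from the example file above.

```python
# Hypothetical pre-flight check for .env; names and rules are illustrative only.
API_KEY_NAMES = ("GEMINI_API_KEY", "OPENAI_API_KEY", "ANTHROPIC_API_KEY")
# Values copied from example.env that indicate a setting was never filled in.
PLACEHOLDERS = {"", "your_key_here", "your_generated_secret_key"}


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_env(values: dict[str, str]) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems: list[str] = []
    if values.get("SECRET_KEY", "") in PLACEHOLDERS:
        problems.append("SECRET_KEY is missing or still a placeholder")
    if all(values.get(name, "") in PLACEHOLDERS for name in API_KEY_NAMES):
        problems.append("no LLM API key is set")
    # Assumption: LDAP needs both settings below when it is enabled.
    if values.get("LDAP_ENABLED", "").lower() == "true":
        for name in ("LDAP_SERVER_URL", "LDAP_USER_DN_TEMPLATE"):
            if not values.get(name):
                problems.append(f"{name} is required when LDAP_ENABLED=true")
    return problems
```

To use it, read the file and print whatever comes back, for example `print(check_env(parse_env(open(".env").read())))`. An empty list suggests the required values are filled in; it does not verify that the keys are actually valid with the provider.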
Once running, the application is available at:
| URL | Description |
|---|---|
| http://localhost:8080 | Web interface |
| http://localhost:8001/docs | API documentation (Swagger UI) |
| http://localhost:8002 | PAGE XML editor (also accessible from within the web UI) |
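
A quick way to confirm the containers came up is to probe each of the URLs listed above. This is a minimal sketch using only the Python standard library; the helper names `is_up` and `report` are hypothetical, the URLs assume the default ports (adjust the first one if you changed `APP_PORT`), and treating any response below HTTP 500 as "up" is an assumption rather than a documented health check.

```python
import urllib.error
import urllib.request

# Default URLs from the table above; change the ports if yours differ.
ENDPOINTS = {
    "Web interface": "http://localhost:8080",
    "API documentation": "http://localhost:8001/docs",
    "PAGE XML editor": "http://localhost:8002",
}


def is_up(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with any status below 500."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 500
    except urllib.error.HTTPError as err:
        # The server answered (e.g. 401 or 404), so the service is running.
        return err.code < 500
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout.
        return False


def report(endpoints: dict[str, str], probe=is_up) -> dict[str, bool]:
    """Probe every endpoint and map its description to True/False."""
    return {name: probe(url) for name, url in endpoints.items()}
```

Calling `report(ENDPOINTS)` returns a dictionary keyed by the descriptions in the table. The `probe` parameter exists only so the check can be swapped out, for instance when the services are not running locally.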
- Upload manuscript images through the web interface.
- Configure segmentation and transcription model settings.
- Run the transcription pipeline — jobs are processed asynchronously by the Celery worker.
- Review & correct results using the integrated PAGE XML editor.
- Export searchable PDFs with the original images and a hidden text layer.
Model configuration is managed through YAML config files in the config/ directory. The Subscript module supports defining multiple segmentation and transcription models with per-model settings for prompts, temperature, token costs, and provider-specific parameters. See the Subscript documentation for the full configuration reference.
- Subscript — The command-line HTR pipeline that powers this application. Can also be used standalone for batch processing.
- nw-page-editor — The PAGE XML editor integrated into this app.
- Kraken — OCR/HTR engine used for layout segmentation.
GNU General Public License v3.0
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. See LICENSE for details.





