Subscript App

A web interface for the Subscript handwritten text recognition (HTR) pipeline. Upload images of handwritten manuscripts, run AI-powered transcription, correct results in a built-in page editor, and export searchable PDFs — all from the browser.

Overview

Subscript App wraps the Subscript Python module in a full-stack web application. The core pipeline uses Kraken for layout segmentation, LLMs (Google Gemini, OpenAI, Anthropic) for handwriting transcription, and ReportLab for searchable PDF generation. This application adds a web-based document management layer, an integrated PAGE XML editor for correcting transcription output, and asynchronous job processing so long-running transcriptions don't block the interface.

Architecture

The application runs as a set of Docker containers:

| Service | Description | Port |
| --- | --- | --- |
| frontend | Web UI | `8080` (configurable) |
| backend | FastAPI REST API | `8001` |
| worker | Celery task worker for async transcription jobs | n/a |
| page-editor | PHP-based PAGE XML editor (nw-page-editor) | `8002` |
| redis | Message broker for Celery | `6379` |

Data is persisted in a SQLite database (Docker volume) and a local documents/ directory for manuscript images and output files.

Tech Stack

Backend: Python, FastAPI, Celery, SQLite
Frontend: JavaScript, CSS, HTML
Page Editor: PHP (nw-page-editor)
HTR Engine: Subscript (Kraken + LLM transcription)
Infrastructure: Docker Compose, Redis
Authentication: LDAP/LDAPS support, local admin accounts

Screenshots

Login Page

Login screen with LDAP and guest access tabs, set against a manuscript background.

Document Dashboard

The main dashboard listing transcribed documents with thumbnail previews, status indicators, and actions for editing, sharing, and downloading files.

Document Upload

Drag-and-drop upload interface for adding new manuscript images (JPG/PNG). Multiple images can optionally be merged into a single PDF.

Transcription Settings

Configuration panel for tuning the transcription pipeline — select the LLM model and temperature, customize the system prompt, adjust image preprocessing (resize, contrast, binarize/invert), and choose the layout segmentation model.

PAGE XML Editor

Side-by-side editor showing the original manuscript image with segmented line regions (left) and the corresponding transcription text (right). Lines can be selected and corrected individually, then saved back to update the PDF.

Searchable PDF Output

A searchable PDF generated by the pipeline. The original manuscript image is preserved while an invisible text layer enables full-text search — here, a search for "panama" highlights the match directly on the handwritten page.

Prerequisites

Docker and Docker Compose
API key(s) for at least one supported LLM provider (Google Gemini, OpenAI, or Anthropic)

Installation & Setup

Clone the repository (including the subscript submodule):

git clone --recurse-submodules https://github.com/eluhrs/subscript-app.git
cd subscript-app

If you already cloned without --recurse-submodules:

git submodule update --init --recursive

Create your environment file:

cp example.env .env

Then edit .env and fill in your values:

# REQUIRED — generate with: python3 -c "import secrets; print(secrets.token_hex(32))"
SECRET_KEY=your_generated_secret_key

# At least one API key is required
GEMINI_API_KEY=your_key_here
OPENAI_API_KEY=your_key_here
ANTHROPIC_API_KEY=your_key_here

# Frontend port (default: 8080)
APP_PORT=8080

# LDAP authentication (optional)
LDAP_ENABLED=true
LDAP_SERVER_URL=ldaps://your.ldap.server:636
LDAP_USER_DN_TEMPLATE=uid={username},dc=example,dc=com
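
Before starting the containers, it can help to sanity-check the values in .env. The following is a minimal sketch, not part of Subscript App: `parse_env`, `check_env`, and `expand_dn` are hypothetical helper names, and the sketch assumes the `{username}` placeholder in `LDAP_USER_DN_TEMPLATE` is substituted literally with the login name (how the backend actually expands it is not documented here).

```python
# Sketch only: these helpers are illustrative and not shipped with Subscript App.
from pathlib import Path

PROVIDER_KEYS = ("GEMINI_API_KEY", "OPENAI_API_KEY", "ANTHROPIC_API_KEY")
# Values copied unchanged from example.env count as "not set".
PLACEHOLDERS = {"", "your_key_here", "your_generated_secret_key"}


def parse_env(text):
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def check_env(values):
    """Return a list of problems; an empty list means the basics are set."""
    problems = []
    if values.get("SECRET_KEY", "") in PLACEHOLDERS:
        problems.append("SECRET_KEY is missing or still a placeholder")
    if not any(values.get(k, "") not in PLACEHOLDERS for k in PROVIDER_KEYS):
        problems.append("no LLM provider API key is set")
    if values.get("LDAP_ENABLED", "").lower() == "true":
        if "{username}" not in values.get("LDAP_USER_DN_TEMPLATE", ""):
            problems.append("LDAP_USER_DN_TEMPLATE lacks a {username} placeholder")
    return problems


def expand_dn(template, username):
    """Assumed behaviour: substitute the login name into the DN template."""
    return template.replace("{username}", username)


if __name__ == "__main__":
    env_file = Path(".env")
    if env_file.exists():
        for problem in check_env(parse_env(env_file.read_text())):
            print("warning:", problem)
```

Run from the repository root, the script prints one warning per problem and nothing if the required values look filled in.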

Initialize the database file (Docker requires it to exist before mounting):
```
touch subscript.db
```
Create the documents directory (the `-p` flag makes this a no-op if it already exists):
```
mkdir -p documents
```
Build and start the application:
```
docker compose up -d --build
```
Create a local admin account (if not using LDAP):
```
./create_local_admin.sh
```

Usage

Once running, the application is available at:

| URL | Description |
| --- | --- |
| `http://localhost:8080` | Web interface |
| `http://localhost:8001/docs` | API documentation (Swagger UI) |
| `http://localhost:8002` | PAGE XML editor (also accessible from within the web UI) |
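
To confirm the containers came up, a quick TCP probe of the published ports is often enough. This is a minimal sketch using only the standard library. The port numbers are the defaults listed in this README (change 8080 if `APP_PORT` was edited), and `port_open` and `check_services` are hypothetical names, not part of the application.

```python
import socket

# Default ports from this README; adjust "frontend" if APP_PORT was changed.
SERVICES = {"frontend": 8080, "backend": 8001, "page-editor": 8002}


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host="localhost", services=SERVICES):
    """Map each service name to whether its port accepts connections."""
    return {name: port_open(host, port) for name, port in services.items()}


if __name__ == "__main__":
    for name, ok in check_services().items():
        print(f"{name:12} {'up' if ok else 'DOWN'}")
```

An open port only shows that something is listening. It does not prove that the service behind it is healthy, so treat this as a first check before opening the web interface.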

Typical Workflow

1. Upload manuscript images through the web interface.
2. Configure segmentation and transcription model settings.
3. Run the transcription pipeline. Jobs are processed asynchronously by the Celery worker.
4. Review & correct results using the integrated PAGE XML editor.
5. Export searchable PDFs with the original images and a hidden text layer.
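
The asynchronous step in the workflow above follows a submit-then-poll pattern: the interface enqueues a job, gets an id back immediately, and checks the status later. The sketch below illustrates that pattern with an in-process queue and a single thread. It is a deliberate stand-in for the Celery worker and Redis broker, not the application's code, and `fake_transcribe`, `submit`, and the status names are all hypothetical.

```python
import queue
import threading
import uuid

# In-process stand-in for Celery + Redis: one queue, one worker thread.
jobs = {}                # job_id -> {"status": ..., "result": ...}
pending = queue.Queue()


def fake_transcribe(image_name):
    """Placeholder for the real segmentation and LLM transcription step."""
    return f"transcript of {image_name}"


def submit(image_name):
    """Enqueue a job and return its id at once, without waiting for the result."""
    job_id = uuid.uuid4().hex
    jobs[job_id] = {"status": "queued", "result": None}
    pending.put((job_id, image_name))
    return job_id


def worker():
    """Process queued jobs one at a time until a None sentinel arrives."""
    while True:
        item = pending.get()
        if item is None:
            pending.task_done()
            break
        job_id, image_name = item
        jobs[job_id]["status"] = "running"
        try:
            jobs[job_id]["result"] = fake_transcribe(image_name)
            jobs[job_id]["status"] = "done"
        except Exception as exc:  # record the failure instead of killing the worker
            jobs[job_id]["result"] = str(exc)
            jobs[job_id]["status"] = "failed"
        finally:
            pending.task_done()


def run_demo():
    """Submit one job, wait for the queue to drain, then stop the worker."""
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    job_id = submit("page_001.jpg")
    pending.join()        # a real UI would poll the job status instead of blocking
    pending.put(None)
    thread.join()
    return jobs[job_id]


if __name__ == "__main__":
    print(run_demo())
```

The point of the design is that `submit` returns before any transcription work happens, which is what keeps a long-running job from blocking the interface.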

Configuration

Model configuration is managed through YAML config files in the config/ directory. The Subscript module supports defining multiple segmentation and transcription models with per-model settings for prompts, temperature, token costs, and provider-specific parameters. See the Subscript documentation for the full configuration reference.

Related Projects

Subscript — The command-line HTR pipeline that powers this application. Can also be used standalone for batch processing.
nw-page-editor — The PAGE XML editor integrated into this app.
Kraken — OCR/HTR engine used for layout segmentation.

License

GNU General Public License v3.0

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. See LICENSE for details.
