📄 Rapid-OCR 🔍

A lightning-fast, privacy-first web app for offline text extraction. Paste (Ctrl+V) or drop any image to instantly generate plain text and a searchable PDF entirely within your browser using Tesseract.js. No server uploads required. Fast, secure, and ready to use! 🚀

👉 Click here to launch the app! 👈

🚀 Features

Instant Ctrl+V Support: No need to click through clunky upload menus. Just copy an image to your clipboard and paste it directly into the app to start the extraction.
Searchable PDF Generation: Doesn't just give you raw text! It overlays an invisible text layer on the original image, generating a fully searchable and selectable PDF on the fly.
Multi-Language Support: Recognizes multiple European languages in the same image (English, Italian, French, Spanish, German) without manual language selection, including accented and special characters.
100% Client-Side & Private: Built entirely with HTML, CSS, and Vanilla JavaScript. Your images and documents are processed locally in your browser. Nothing is ever uploaded to a cloud server.
Offline Capable: Run the app entirely without an internet connection using local assets and language packs.
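
As a rough illustration of the paste workflow described above, the sketch below picks the first image out of a paste event's clipboard items. The helper name `firstImageFromClipboard` is hypothetical and not taken from the app's source; it relies only on the standard `DataTransferItem` fields `kind`, `type`, and `getAsFile()`.

```javascript
// Hypothetical helper (not from the app's source): return the first image
// file found among the items of a paste event, or null if there is none.
function firstImageFromClipboard(items) {
  for (const item of Array.from(items || [])) {
    // Clipboard entries are either strings or files; only image files matter here.
    if (item.kind === 'file' && item.type.startsWith('image/')) {
      return item.getAsFile();
    }
  }
  return null;
}

// Browser-only wiring, skipped when the script runs outside a page.
if (typeof document !== 'undefined') {
  document.addEventListener('paste', (event) => {
    const items = event.clipboardData ? event.clipboardData.items : [];
    const image = firstImageFromClipboard(items);
    if (image) {
      console.log('Pasted image ready for OCR:', image.type);
    }
  });
}
```

Keeping the selection logic in a plain function makes it easy to reuse for drag-and-drop, where `event.dataTransfer.items` has the same shape.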

🛠️ How it works

Initialization: Upon loading or pasting an image, the app initializes a local Tesseract.js Web Worker, loading the necessary language data packs into the browser's memory.
Processing: The image data is passed to the WebAssembly (WASM) core of Tesseract, which scans the pixels, recognizes characters, and calculates the layout.
Concurrent Output: Tesseract simultaneously returns both the extracted raw text and the ArrayBuffer data for the PDF document.
Blob Conversion: The app instantly converts the PDF data into a Blob, creating a downloadable file directly from your browser memory, while displaying the text on-screen concurrently.
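
The steps above can be sketched roughly as follows. This is an illustration, not the app's actual code: `runOcr` and `toPdfBlob` are hypothetical names, the worker is passed in so the snippet stays self-contained, and the `recognize(image, options, output)` call with `{ text: true, pdf: true }` is an assumption about the Tesseract.js version in use, so check it against that version's API docs. The language codes are joined with `+`, the form Tesseract uses for multi-language recognition.

```javascript
// Tesseract language codes for English, Italian, French, Spanish and German.
const LANGS = ['eng', 'ita', 'fra', 'spa', 'deu'];
const LANG_STRING = LANGS.join('+'); // "eng+ita+fra+spa+deu"

// Hypothetical helper: wrap raw PDF bytes (ArrayBuffer, typed array,
// or plain array of numbers) in a Blob the browser can download.
function toPdfBlob(pdfData) {
  const bytes =
    pdfData instanceof ArrayBuffer || ArrayBuffer.isView(pdfData)
      ? pdfData
      : Uint8Array.from(pdfData);
  return new Blob([bytes], { type: 'application/pdf' });
}

// Hypothetical pipeline: `worker` is assumed to be an already initialized
// Tesseract.js worker whose recognize() can return text and PDF bytes together.
async function runOcr(worker, image) {
  const { data } = await worker.recognize(image, {}, { text: true, pdf: true });
  return { text: data.text, pdfBlob: toPdfBlob(data.pdf) };
}
```

In a browser, the worker would typically come from `Tesseract.createWorker(...)` configured with `LANG_STRING` (the exact arguments depend on the Tesseract.js version), and `URL.createObjectURL(pdfBlob)` can back a download link for the result.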

🏆 What makes it special?

Zero-Setup Friction: Designed to be an immediate utility tool. You don't need to log in, select languages from dropdowns, or wait in server queues. Paste and go.
Modern UI/UX: Clean, responsive design powered by Tailwind CSS that keeps you informed of the OCR progress every step of the way without visual clutter.

💡 Why use this project?

For Sensitive Documents: The perfect solution when you need to extract text from private documents, bank statements, or confidential notes, and you don't trust free online OCR services that might store and harvest your files.

⚡ Getting Started

Online

Simply open the Live Demo link in any modern browser, paste an image, and grab your text!

Testing Locally & Offline (Python Web Server)

If you have downloaded the repository to use it completely offline, you cannot simply double-click the index.html file. The app uses Web Workers to process images, and modern browsers generally refuse to start them (or to fetch the WASM and language files) on pages opened via the file:/// protocol because of same-origin (CORS) security restrictions.
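
A page can detect this situation itself and show a hint instead of silently hanging. The check below is a minimal sketch; `needsLocalServer` is a hypothetical name, not something the app is known to contain.

```javascript
// Hypothetical guard: true when the page was opened straight from disk
// (a file:// URL), where worker scripts are typically blocked.
function needsLocalServer(pageUrl) {
  return new URL(pageUrl).protocol === 'file:';
}

// Browser-only usage, skipped when the script runs outside a page.
if (typeof window !== 'undefined' && needsLocalServer(window.location.href)) {
  console.warn('Opened via file://. Serve this folder over http://localhost instead.');
}
```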

To run it locally, you need to spin up a quick local web server. The easiest way is using Python.

1. Install Python (if you don't have it)

Windows: Open PowerShell as Administrator and run:
choco install python
(Requires Chocolatey)
macOS: Open Terminal and run:
brew install python
(Requires Homebrew)
Linux (Debian/Ubuntu): Open Terminal and run:
sudo apt update && sudo apt install python3

2. Launch the Local Server

Open your terminal or command prompt.
Navigate to the folder where you extracted this project (e.g., cd path/to/RapidOCR).
Run the following command:
- On Windows: python -m http.server 8000
- On Mac/Linux: python3 -m http.server 8000
Open your browser and type http://localhost:8000 in the address bar. The app is now running securely and fully offline!

⚠️ Troubleshooting: Language Loading

If the OCR seems stuck at "Loading..." during your very first run on the live version, make sure your internet connection is active so the browser can download and cache the Tesseract language data files. Once they are cached, subsequent runs start much faster. If you are running the offline local server version, make sure your tessdata folder contains the .traineddata.gz file for every language the app loads.
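
To check an offline setup, the files in tessdata can be compared against the languages the app needs. The sketch below assumes the `<code>.traineddata.gz` naming mentioned above and the standard Tesseract codes for the five supported languages; `missingLanguageFiles` is a hypothetical helper, and the `found` list is made-up sample data.

```javascript
// Hypothetical helper: list the language files that are expected
// in the tessdata folder but not present.
function missingLanguageFiles(langCodes, filesInTessdata) {
  const present = new Set(filesInTessdata);
  return langCodes
    .map((code) => `${code}.traineddata.gz`)
    .filter((name) => !present.has(name));
}

// Sample data for illustration only.
const requiredLangs = ['eng', 'ita', 'fra', 'spa', 'deu'];
const found = ['eng.traineddata.gz', 'ita.traineddata.gz', 'fra.traineddata.gz'];

// Reports spa.traineddata.gz and deu.traineddata.gz as missing.
console.log(missingLanguageFiles(requiredLangs, found));
```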
