A lightning-fast, privacy-first web app for offline text extraction. Paste (Ctrl+V) or drop any image to instantly generate plain text and a searchable PDF entirely within your browser using Tesseract.js. No server uploads required. Fast, secure, and ready to use!
- Instant Ctrl+V Support: No need to click through clunky upload menus. Just copy an image to your clipboard and paste it directly into the app to start the extraction.
- Searchable PDF Generation: Doesn't just give you raw text! It overlays an invisible text layer on the original image, generating a fully searchable and selectable PDF on the fly.
- Multi-Language Support: Automatically recognizes and processes multiple European languages in a single pass (English, Italian, French, Spanish, German), including their accented and special characters.
- 100% Client-Side & Private: Built entirely with HTML, CSS, and Vanilla JavaScript. Your images and documents are processed locally in your browser. Nothing is ever uploaded to a cloud server.
- Offline Capable: Run the app entirely without an internet connection using local assets and language packs.
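The paste and drop entry points described above can be sketched as follows. This is a minimal illustration, not the app's actual code: `firstImageFile` and `runOcr` are hypothetical names, and the handlers rely only on the standard `paste`, `dragover`, and `drop` events and the `DataTransfer` API.

```javascript
// Return the first image File found in a DataTransferItemList-like
// collection (from a paste or drop event), or null if there is none.
function firstImageFile(items) {
  for (const item of Array.from(items || [])) {
    if (item.kind === "file" && item.type.startsWith("image/")) {
      return item.getAsFile();
    }
  }
  return null;
}

// Hypothetical hook: the real app would start the OCR run here.
function runOcr(file) {
  console.log("Starting OCR for", file.name || "pasted image");
}

// Only wire up the listeners when running in a browser.
if (typeof document !== "undefined") {
  // Ctrl+V anywhere on the page.
  document.addEventListener("paste", (event) => {
    const file = firstImageFile(event.clipboardData && event.clipboardData.items);
    if (file) {
      event.preventDefault();
      runOcr(file);
    }
  });

  // Drag and drop: dragover must be cancelled or the drop event never fires.
  document.addEventListener("dragover", (event) => event.preventDefault());
  document.addEventListener("drop", (event) => {
    event.preventDefault();
    const file = firstImageFile(event.dataTransfer && event.dataTransfer.items);
    if (file) runOcr(file);
  });
}
```

Non-image clipboard content (plain text, for example) is ignored, so normal pasting elsewhere on the page keeps working.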
- Initialization: Upon loading or pasting an image, the app initializes a local Tesseract.js Web Worker, loading the necessary language data packs into the browser's memory.
- Processing: The image data is passed to the WebAssembly (WASM) core of Tesseract, which scans the pixels, recognizes characters, and calculates the layout.
- Concurrent Output: Tesseract simultaneously returns both the extracted raw text and the ArrayBuffer data for the PDF document.
- Blob Conversion: The app instantly converts the PDF data into a Blob, creating a downloadable file directly from your browser memory, while displaying the text on-screen concurrently.
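The four steps above can be sketched roughly as below. This is an illustration under stated assumptions, not the project's source: it assumes the Tesseract.js v5-style API exposed as a global `Tesseract`, and `LANGS`, `pdfToBlob`, and `ocrToTextAndPdf` are hypothetical names. The helper accepts an ArrayBuffer, a typed array, or a plain array of bytes, since the exact type of the PDF data may depend on the Tesseract.js version.

```javascript
// Languages named in the feature list, as Tesseract language codes.
const LANGS = ["eng", "ita", "fra", "spa", "deu"];

// Wrap the PDF bytes in a Blob. Accepts an ArrayBuffer, a typed array,
// or a plain array of byte values.
function pdfToBlob(pdfData) {
  const bytes =
    pdfData instanceof ArrayBuffer || ArrayBuffer.isView(pdfData)
      ? pdfData
      : new Uint8Array(pdfData);
  return new Blob([bytes], { type: "application/pdf" });
}

// Run OCR on one image and return the text plus an object URL for the PDF.
// Assumption: Tesseract.js v5-style API available as the global `Tesseract`.
async function ocrToTextAndPdf(image, onProgress) {
  // Step 1: start the worker and load the language data (1 = LSTM engine).
  const worker = await Tesseract.createWorker(LANGS, 1, {
    logger: (m) => onProgress && onProgress(m.status, m.progress),
  });
  try {
    // Steps 2 and 3: recognize the image, requesting text and PDF output.
    const { data } = await worker.recognize(
      image,
      { pdfTitle: "OCR result" },
      { text: true, pdf: true }
    );
    // Step 4: convert the PDF bytes to a Blob and expose it as a URL.
    const pdfUrl = URL.createObjectURL(pdfToBlob(data.pdf));
    return { text: data.text, pdfUrl };
  } finally {
    await worker.terminate();
  }
}
```

The returned `pdfUrl` can be assigned to the `href` of a link with a `download` attribute, while `text` is written to the page.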
- Zero-Setup Friction: Designed to be an immediate utility tool. You don't need to log in, select languages from dropdowns, or wait in server queues. Paste and go.
- Modern UI/UX: Clean, responsive design powered by Tailwind CSS that keeps you informed of the OCR progress every step of the way without visual clutter.
- For Sensitive Documents: The perfect solution when you need to extract text from private documents, bank statements, or confidential notes, and you don't trust free online OCR services that might store and harvest your files.
Simply open the Live Demo link in any modern browser, paste an image, and grab your text!
If you have downloaded the repository to use it completely offline, you cannot simply double-click the index.html file. The app uses Web Workers to process images, and most modern browsers refuse to start them from a page opened via the file:// protocol because of same-origin security restrictions (often reported in the console as CORS errors).
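A page can detect this situation at startup and warn the user instead of appearing to hang. The following is a minimal sketch; `isServedOverHttp` is a hypothetical helper and is not claimed to be part of the project.

```javascript
// True when the page is served over http(s), where Web Workers can load.
function isServedOverHttp(protocol) {
  return protocol === "http:" || protocol === "https:";
}

// In a browser, warn early instead of leaving the UI stuck on "Loading...".
if (typeof window !== "undefined" && !isServedOverHttp(window.location.protocol)) {
  console.warn(
    "This page was opened via " + window.location.protocol +
    " and the OCR worker may be blocked. Serve the folder from a local web server instead."
  );
}
```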
To run it locally, you need to spin up a quick local web server. The easiest way is using Python.
- Windows: Open PowerShell as Administrator and run (requires Chocolatey):
  choco install python
- macOS: Open Terminal and run (requires Homebrew):
  brew install python
- Linux (Debian/Ubuntu): Open Terminal and run:
  sudo apt update && sudo apt install python3
- Open your terminal or command prompt.
- Navigate to the folder where you extracted this project (e.g., cd path/to/RapidOCR).
- Run the following command:
  - On Windows: python -m http.server 8000
  - On Mac/Linux: python3 -m http.server 8000
- Open your browser and type http://localhost:8000 in the address bar. The app is now running securely and fully offline!
If the OCR seems stuck at "Loading..." during your very first run on the live version, ensure your internet connection is active so the browser can download and cache the Tesseract language data files. Once cached, subsequent runs skip the download and start much faster. If you are running the offline local server version, ensure your tessdata folder contains all the downloaded .traineddata.gz files.
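For the offline version, a quick way to confirm that every language file is reachable is to request each one from the local server. The sketch below is an assumption-laden diagnostic, not part of the app: `traineddataUrls` and `findMissingLanguageFiles` are hypothetical helpers, and the `./tessdata` path assumes the folder sits next to index.html.

```javascript
// Build the expected URL of each gzipped language file.
function traineddataUrls(langs, langPath) {
  const base = langPath.replace(/\/+$/, ""); // drop trailing slashes
  return langs.map((lang) => `${base}/${lang}.traineddata.gz`);
}

// Resolve to the list of URLs that could not be fetched.
async function findMissingLanguageFiles(langs, langPath) {
  const missing = [];
  for (const url of traineddataUrls(langs, langPath)) {
    try {
      const response = await fetch(url, { method: "HEAD" });
      if (!response.ok) missing.push(url);
    } catch (error) {
      missing.push(url); // network error or server not running
    }
  }
  return missing;
}

// Example, run from the browser console while the local server is up:
// findMissingLanguageFiles(["eng", "ita", "fra", "spa", "deu"], "./tessdata")
//   .then((missing) => console.log("Missing files:", missing));
```

An empty result means all five files were found; any listed URL points to a file that still needs to be downloaded into the tessdata folder.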


