mac-ocr

A macOS command-line tool that reads text from images and PDFs, and creates searchable PDFs.
Runs entirely on your Mac with Apple's Vision framework; nothing is uploaded.

Tip

Useful for AI agents too: instead of spending vision tokens reading documents, an agent can run mac-ocr locally for free. A skill is bundled so agents know how to use it.

Features

Read text from an image: mac-ocr photo.png
Read text from many images: mac-ocr *.png
Stream text from a PDF, page by page: mac-ocr scan.pdf --format jsonl
Turn an image into a searchable PDF: mac-ocr searchable-pdf photo.png → photo.ocr.pdf
Add a selectable text layer to a scanned PDF: mac-ocr searchable-pdf scan.pdf → scan.ocr.pdf

Install

npm install -g mac-ocr

Or run it without installing:

npx mac-ocr receipt.jpg

Requirements: macOS 10.15+. The npm package ships a prebuilt universal binary, so no Xcode or Swift toolchain is needed.

Recognize text

OCR is the default action — you don't need a subcommand:

mac-ocr receipt.jpg                 # text → stdout
mac-ocr page1.png page2.png         # multiple images
mac-ocr scan.pdf                    # multi-page PDF
cat screenshot.png | mac-ocr        # stdin
mac-ocr https://example.com/a.png   # URL (simple GET)

Default output is plain text. Use JSON when you need bounding boxes, confidence, or page metadata:

mac-ocr receipt.jpg --format json
mac-ocr document.pdf --format jsonl   # one JSON object per page, streamed

PDF pages stream as they're recognized, so with a large document you see the first page's text right away.
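
As a rough sketch of consuming the `--format jsonl` stream from a script, the helper below buffers incoming chunks and parses each complete line as one page object. The `page`, `pageCount`, and `text` field names are borrowed from the Node.js API example in this README and are only assumed to match the CLI's JSONL output; docs/CLI.md has the actual schema. `JsonlSplitter` and `PageLine` are hypothetical names, not part of mac-ocr.

```typescript
// Hypothetical page shape; field names are assumed from the Node.js API example.
interface PageLine {
    page: number
    pageCount: number
    text: string
}

// Buffers arbitrary chunks and returns one parsed object per complete line,
// so each page can be handled as soon as it is printed.
class JsonlSplitter {
    private buffer = ''

    push(chunk: string): PageLine[] {
        this.buffer += chunk
        const lines = this.buffer.split('\n')
        // The last element is an incomplete line (or '' after a trailing newline).
        this.buffer = lines.pop() ?? ''
        return lines
            .filter((line) => line.trim() !== '')
            .map((line) => JSON.parse(line) as PageLine)
    }
}
```

In practice you would feed `push` with stdout chunks from a child process running `mac-ocr scan.pdf --format jsonl`.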

Save text to files

mac-ocr ~/Screenshots/*.png -o '[dir]/[name].txt'   # a .txt next to each image
mac-ocr scan.pdf -o notes.md                        # recognized text to a chosen .txt/.md file
mac-ocr receipts/*.pdf -o out/                      # one file per input in out/
grep -rli "invoice" ~/Screenshots                    # then search with normal tools

-o takes a file, a directory (out/), or a filename template (the placeholders are listed under Options). Quote templates, since […] is a glob pattern in zsh. Whatever the extension, the content is the plain recognized text.
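
To make the template idea concrete, here is a minimal sketch of how placeholders such as `[dir]` and `[name]` might expand for one input path. It is an illustration, not mac-ocr's implementation: `expandTemplate` is a hypothetical helper, and the assumptions that `[ext]` excludes the leading dot and that `[page]` is a plain number are mine. docs/CLI.md describes the real rules.

```typescript
// Illustration only. Assumes [ext] has no leading dot and [dir] is the
// input's directory ('.' when the path has no directory part).
function expandTemplate(template: string, inputPath: string, page?: number): string {
    const slash = inputPath.lastIndexOf('/')
    const dir = slash >= 0 ? (inputPath.slice(0, slash) || '/') : '.'
    const base = inputPath.slice(slash + 1)
    const dot = base.lastIndexOf('.')
    const name = dot > 0 ? base.slice(0, dot) : base
    const ext = dot > 0 ? base.slice(dot + 1) : ''
    const values: Record<string, string> = {
        dir,
        name,
        ext,
        page: page === undefined ? '' : String(page),
    }
    // Replace known placeholders; anything else in brackets is left untouched.
    return template.replace(/\[(dir|name|ext|page)\]/g, (_match, key: string) => values[key] ?? '')
}
```

Under these assumptions, `expandTemplate('[dir]/[name].txt', '/Users/me/Screenshots/shot.png')` yields a `.txt` path next to the image.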

Create a searchable PDF

searchable-pdf takes a PDF or an image and writes a PDF that looks identical to the source but whose text is selectable and searchable. By default it writes [name].ocr.pdf next to each input — one searchable PDF per input (inputs are never merged):

mac-ocr searchable-pdf scan.pdf            # writes scan.ocr.pdf
mac-ocr searchable-pdf photo.jpg            # image → one-page photo.ocr.pdf
mac-ocr searchable-pdf *.pdf                # writes <name>.ocr.pdf for each

Use -o to control the destination — a directory, a [name] template, a fixed file, or - for stdout:

mac-ocr searchable-pdf scan.pdf -o out/              # out/scan.ocr.pdf
mac-ocr searchable-pdf scan.pdf -o '[name]-ocr.pdf'  # scan-ocr.pdf
mac-ocr searchable-pdf scan.pdf -o searchable.pdf    # fixed path
mac-ocr searchable-pdf scan.pdf -o - > out.pdf       # stdout (redirect to a file other than the input, which the shell would truncate first)

A fixed path or - (stdout) takes a single input; for multiple inputs use a directory or a [name] template.
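
The destination rules above can be summarized in a short sketch. This is not mac-ocr's code: `resolveDestinations` is a hypothetical function, and it assumes that a trailing `/` marks a directory, that `[name]` marks a template, and that templates resolve relative to the current directory.

```typescript
// Sketch of the destination rules, under the assumptions stated in the lead-in.
function resolveDestinations(inputs: string[], dest?: string): string[] {
    const stem = (path: string): string => {
        const base = path.slice(path.lastIndexOf('/') + 1)
        const dot = base.lastIndexOf('.')
        return dot > 0 ? base.slice(0, dot) : base
    }
    const dirOf = (path: string): string => {
        const slash = path.lastIndexOf('/')
        return slash >= 0 ? path.slice(0, slash + 1) : ''
    }

    if (dest === undefined) {
        // Default: [name].ocr.pdf next to each input.
        return inputs.map((p) => `${dirOf(p)}${stem(p)}.ocr.pdf`)
    }
    if (dest.endsWith('/')) {
        return inputs.map((p) => `${dest}${stem(p)}.ocr.pdf`)
    }
    if (dest.includes('[name]')) {
        return inputs.map((p) => dest.split('[name]').join(stem(p)))
    }
    // A fixed path or "-" (stdout) can only receive one PDF.
    if (inputs.length !== 1) {
        throw new Error('a fixed path or "-" takes a single input')
    }
    return [dest]
}
```

The single-input check is the part worth noticing: since inputs are never merged, a fixed path or stdout has nowhere to put a second PDF.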

Pages that already have selectable text are skipped — only scanned pages get OCR. A PDF that needs no OCR at all passes through unchanged. To OCR every page regardless, pass --ocr-all-pages. The finer points (what survives a rewrite, how "already has text" is decided) are in docs/CLI.md.

In an interactive terminal you get a live [page/total] progress counter. Piped or redirected runs are silent on success, so scripts stay clean.

Options

Both OCR and searchable-pdf accept the recognition options:

| Flag | Effect |
| --- | --- |
| `--fast` | Faster, lower-accuracy recognition (details in docs/CLI.md) |
| `--password <password>` | Password for an encrypted PDF (or set `MAC_OCR_PDF_PASSWORD`) |
| `-l, --language <code>` | Recognition language (BCP-47, repeatable), e.g. `-l en-US -l ja-JP` |
| `-c, --confidence <0–1>` | Drop observations below this confidence |
| `-w, --custom-words <word>` | Add custom vocabulary (repeatable) |
| `--custom-words-file <path>` | Custom vocabulary file, one word per line |
| `--no-language-correction` | Disable language correction |
| `--min-text-height <0–1>` | Ignore text shorter than this fraction of image height |
| `--pdf-dpi <auto\|72–600>` | PDF rasterization DPI (default `auto`) |
| `--roi <x,y,w,h>` | Region of interest: restrict recognition to a normalized region (top-left origin) |
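
Since `--roi` takes normalized values with a top-left origin, a pixel rectangle measured in an image editor has to be divided by the image size first. The sketch below shows that arithmetic. `roiFromPixels` and `PixelRect` are hypothetical names, and rounding to four decimals is an arbitrary choice.

```typescript
interface PixelRect {
    x: number
    y: number
    width: number
    height: number
}

// Converts a pixel rectangle (top-left origin) into a normalized "x,y,w,h"
// string. Values are clamped to the 0–1 range and rounded to 4 decimals.
function roiFromPixels(rect: PixelRect, imageWidth: number, imageHeight: number): string {
    const clamp = (v: number): number => Math.min(1, Math.max(0, v))
    const fmt = (v: number): string => String(Number(clamp(v).toFixed(4)))
    return [
        rect.x / imageWidth,
        rect.y / imageHeight,
        rect.width / imageWidth,
        rect.height / imageHeight,
    ].map(fmt).join(',')
}
```

For a 1000×800 image, a rectangle at (100, 200) sized 500×200 becomes `0.1,0.25,0.5,0.25`, which could then be passed as `mac-ocr receipt.jpg --roi 0.1,0.25,0.5,0.25`.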

`mac-ocr <file>`

| Flag | Effect |
| --- | --- |
| `-f, --format <text\|json\|jsonl>` | Output format (default `text`) |
| `-o, --output <path>` | Output path, directory, or template (`[name]`, `[ext]`, `[dir]`, `[page]`). Default: stdout. Any extension is accepted, e.g. `.txt` or `.md`. |
| `--max-candidates <1–10>` | Alternative text candidates per observation |

`mac-ocr searchable-pdf <file>`

| Flag | Effect |
| --- | --- |
| `-o, --output <dest>` | Output path, `[name]` template, directory, or `-` for stdout. Default: `[name].ocr.pdf` next to each input. |
| `--ocr-all-pages` | OCR every page, including pages that already have selectable text (skipped by default) |

List the recognition languages available on your macOS version with mac-ocr languages (add --fast for the fast recognizer's set).

See docs/CLI.md for the full reference — every command and flag, plus the JSON output schema.

Node.js API

The same package exposes a typed, promise-based API that wraps the binary. Inputs are image or PDF bytes — read files or fetch URLs in your own code and pass the bytes:

npm install mac-ocr

import fs from 'node:fs/promises'
import { ocr, createSearchablePdf, supportedLanguages } from 'mac-ocr'

// Recognize text in an image or single-page PDF
const result = await ocr(await fs.readFile('receipt.jpg'))
console.log(result.text)
for (const { text, confidence, boundingBox } of result.observations) { /* … */ }

// Multi-page PDF: stream pages as they finish…
for await (const page of ocr.pages(await fs.readFile('book.pdf'))) {
    console.log(page.page, '/', page.pageCount, page.text)
}
// …or collect the whole thing into an array (Array.fromAsync needs Node.js 22+)
const pages = await Array.fromAsync(ocr.pages(await fs.readFile('book.pdf')))

// Build a searchable PDF (returns the PDF bytes)
const pdf = await createSearchablePdf(await fs.readFile('scan.pdf'), { fast: true })
await fs.writeFile('scan.ocr.pdf', pdf)

// Recognition languages supported on this macOS version (for ocr and createSearchablePdf)
const languages = await supportedLanguages()

Options mirror the CLI flags (like { fast: true } above), plus an AbortSignal for cancellation. Failures throw a MacOcrError with a kind you can branch on. See docs/NODE.md for every option, the result types, and error handling.
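
As a small illustration of working with the result, the sketch below splits observations by confidence so uncertain lines can be reviewed instead of dropped (which is what `-c` does). The `Observation` shape is assumed from the destructuring in the example above, with `boundingBox` left untyped; `partitionByConfidence` is a hypothetical helper, and docs/NODE.md has the real result types.

```typescript
// Shape assumed from the README example; see docs/NODE.md for the real types.
interface Observation {
    text: string
    confidence: number
    boundingBox: unknown
}

// Splits observations into those at or above a threshold and those below it.
function partitionByConfidence(observations: Observation[], threshold: number) {
    const trusted: Observation[] = []
    const uncertain: Observation[] = []
    for (const obs of observations) {
        if (obs.confidence >= threshold) {
            trusted.push(obs)
        } else {
            uncertain.push(obs)
        }
    }
    return { trusted, uncertain }
}
```

With a result from `ocr(...)`, this would be called as `partitionByConfidence(result.observations, 0.5)`; the threshold is yours to choose.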

How it works

mac-ocr is a native Swift binary built on Apple's Vision framework (VNRecognizeTextRequest). Recognition happens entirely on-device — nothing is uploaded. The searchable-PDF layer is invisible text drawn with Core Graphics + Core Text, placed word by word where Vision found each word.
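
The geometry behind "placed where Vision found each word" is simple enough to sketch. Vision reports bounding boxes normalized to 0–1 with a bottom-left origin, and PDF page space also has a bottom-left origin, so mapping a box onto a page is a plain scale. The code below only illustrates that arithmetic in TypeScript. The real tool does this in Swift with Core Graphics, and `toPagePoints` is a hypothetical name that ignores page rotation and crop boxes.

```typescript
interface Rect {
    x: number
    y: number
    width: number
    height: number
}

// Scales a normalized (0–1, bottom-left origin) box to page units such as
// PDF points. No axis flip is needed because both spaces share an origin.
function toPagePoints(box: Rect, pageWidth: number, pageHeight: number): Rect {
    return {
        x: box.x * pageWidth,
        y: box.y * pageHeight,
        width: box.width * pageWidth,
        height: box.height * pageHeight,
    }
}
```

On a US Letter page (612×792 points), a box covering the normalized region `{ x: 0.5, y: 0.25, width: 0.25, height: 0.5 }` lands at (306, 198) with size 153×396.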

Agent Skills

The package bundles an agent skill covering the CLI and Node API — set up skills-npm in your project and coding agents discover it automatically.
