PDFusion: Local PDF Analyzer & Markdown Converter 🚀

PDFusion is a comprehensive PDF analysis application that extracts text, images, and tables from uploaded PDFs and outputs everything in clean, structured Markdown format. It uses a local Python backend with FastAPI and utilizes local AI models via LiteLLM for advanced OCR.

🏔️ Technical Architecture

Core Stack

Frontend: React 18+ with TypeScript (Vite)
Backend: FastAPI (Python 3.10+)
Database: Postgres (local or homelab) with SQLite auto-fallback
AI/LLM: LiteLLM Proxy + Ollama (Local Vision Models)
PDF Extraction: PyMuPDF (fitz) + Pandas (Tables)

System Architecture

┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   React App     │───▶│   FastAPI        │───▶│   LiteLLM Proxy │
│   (Frontend)    │    │   (Backend)      │    │   (Ollama)      │
└─────────────────┘    └──────────────────┘    └─────────────────┘
                                │                          │
                                ▼                          ▼
                         ┌──────────────┐         ┌──────────────┐
                         │  PostgreSQL/ │         │  Local       │
                         │  SQLite      │         │  AI Models   │
                         └──────────────┘         └──────────────┘

🚀 Getting Started

Prerequisites

Node.js (v18 or later)
Python (v3.10 or later)
Ollama (Running locally with gemma4 models)
Postgres (Optional, will fallback to local SQLite)

1. Backend Setup

Navigate to the root directory and set up the Python environment:

# Create a virtual environment
python -m venv venv

# Activate the virtual environment
# On Windows:
.\venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate

# Install dependencies
pip install -r server/requirements.txt

2. Frontend Setup

Install the Node.js dependencies:

npm install

3. AI Model Setup (Ollama / LiteLLM)

Ensure Ollama is installed and your vision model is pulled. If using a LiteLLM Proxy (recommended for homelab setups), ensure the endpoint is accessible.

ollama run gemma4

4. Configuration

Create a .env file in the root directory (based on .env.example).

Tip

PDFusion is configured to bypass self-signed SSL certificates for litellm.local homelab proxies automatically.

# API URL for the local backend
VITE_API_BASE_URL=http://localhost:8000

# PostgreSQL Database URL
DATABASE_URL=postgresql://user:password@localhost:5432/pdfusion

# LiteLLM Proxy / Ollama Configuration
VISION_MODEL_ID=ollama/gemma4
LITELLM_API_BASE=https://litellm.local
LITELLM_API_KEY=sk-local-proxy

🏃 Running the Application

Start the Backend Server

npm run server

This starts the FastAPI server at http://localhost:8000. It will auto-mount uploads/ and outputs/ for image serving.

Start the Frontend

npm run dev

The application will be accessible at http://localhost:5173.

🎨 'Pro' Viewer Features

The results dashboard now includes a professional-grade Markdown viewer inspired by advanced editors:

GFM Support: Full rendering of complex data tables and strikethroughs.
Math/KaTeX: Support for mathematical formulas extracted from documents.
Image Intelligence: Automatic URL rewriting ensures embedded images are served correctly from the local backend.
Local OCR: Full text extraction displayed clearly beneath every identified image.

📁 Project Structure

├── server/                # Python FastAPI Backend
│   ├── src/               # Processing logic (PDF, Vision, Table)
│   ├── uploads/           # Local storage for raw PDFs
│   ├── outputs/           # Local storage for extracted images/assets
│   ├── main.py            # FastAPI entry point
│   ├── models.py          # Database models (SQLAlchemy)
│   └── requirements.txt   # Python dependencies (Pandas, Pillow, etc.)
├── src/                   # React Frontend (Vite)
│   ├── components/        # UI Components & Icons
│   ├── services/          # API Client Layer
│   └── pages/             # Dashboard & Results Views
└── package.json           # Frontend dependencies (ReactMarkdown, RemarkGfm)

🎯 Key Features

Zero Cloud Leakage: No data leaves your machine or your local network.
Intelligent Table Reconstruction: Uses Pandas for high-fidelity table parsing.
Private Vision OCR: High-performance image-to-text via local LiteLLM proxy.
Export Ready: Clean, standardized Markdown output.

⚖️ License

This project is licensed under the MIT License.

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
server		server
src		src
.env.example		.env.example
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
index.html		index.html
metadata.json		metadata.json
package-lock.json		package-lock.json
package.json		package.json
postcss.config.js		postcss.config.js
tailwind.config.js		tailwind.config.js
tsconfig.json		tsconfig.json
vite.config.ts		vite.config.ts

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

PDFusion: Local PDF Analyzer & Markdown Converter 🚀

🏔️ Technical Architecture

Core Stack

System Architecture

🚀 Getting Started

Prerequisites

1. Backend Setup

2. Frontend Setup

3. AI Model Setup (Ollama / LiteLLM)

4. Configuration

🏃 Running the Application

Start the Backend Server

Start the Frontend

🎨 'Pro' Viewer Features

📁 Project Structure

🎯 Key Features

⚖️ License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

PDFusion: Local PDF Analyzer & Markdown Converter 🚀

🏔️ Technical Architecture

Core Stack

System Architecture

🚀 Getting Started

Prerequisites

1. Backend Setup

2. Frontend Setup

3. AI Model Setup (Ollama / LiteLLM)

4. Configuration

🏃 Running the Application

Start the Backend Server

Start the Frontend

🎨 'Pro' Viewer Features

📁 Project Structure

🎯 Key Features

⚖️ License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages