14 changes: 13 additions & 1 deletion README.md
@@ -83,7 +83,7 @@ sudo apt install tesseract-ocr

| Category | Formats |
|---|---|
| PDF & derivatives | PDF, XPS, EPUB, CBZ, MOBI, FB2, SVG, TXT |
| PDF & derivatives | PDF, XPS, EPUB, CBZ, MOBI, FB2, SVG, TXT, MD |
| Images | PNG, JPEG, BMP, TIFF, GIF, and more |
| Microsoft Office *(Pro)* | DOC, DOCX, XLS, XLSX, PPT, PPTX |
| Korean Office *(Pro)* | HWP, HWPX |
@@ -171,6 +171,17 @@ text = page.get_textpage_ocr(language="eng").extractText()
print(text)
```

### Convert Markdown to PDF

```python
import pymupdf

md_doc = pymupdf.open("example.md")
pdfdata = md_doc.convert_to_pdf()
pdf_doc = pymupdf.open(stream=pdfdata)
pdf_doc.save("example.pdf")
```
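
The same calls can be wrapped to convert a whole folder. This is a minimal sketch, assuming the installed PyMuPDF build can open `.md` files as in the snippet above; `pdf_path_for` and `convert_markdown_dir` are hypothetical helper names for illustration, not part of PyMuPDF:

```python
from pathlib import Path


def pdf_path_for(md_path: Path) -> Path:
    """Return the output path: same folder and stem, with a .pdf suffix."""
    return md_path.with_suffix(".pdf")


def convert_markdown_dir(folder: str) -> list[Path]:
    """Convert every .md file in `folder` to a PDF beside it (hypothetical helper)."""
    import pymupdf  # imported here so the path helper works without PyMuPDF

    written = []
    for md_path in sorted(Path(folder).glob("*.md")):
        md_doc = pymupdf.open(str(md_path))
        pdfdata = md_doc.convert_to_pdf()  # PDF content as bytes
        md_doc.close()

        # Re-open the bytes explicitly as a PDF and write them to disk
        pdf_doc = pymupdf.open(stream=pdfdata, filetype="pdf")
        out_path = pdf_path_for(md_path)
        pdf_doc.save(str(out_path))
        pdf_doc.close()
        written.append(out_path)
    return written
```

Calling `convert_markdown_dir("docs")` would write one PDF next to each Markdown file in `docs` and return the list of output paths.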

### Convert to Markdown for LLMs

```python
@@ -254,6 +265,7 @@ print(md)
| **Annotations** | Read and write highlights, underlines, squiggly lines, sticky notes, free text, ink, stamps |
| **Redaction** | Add and permanently apply redaction annotations |
| **Forms** | Read and fill PDF AcroForm fields |
| **PDF creation** | Create PDFs directly with the API or quickly convert from Markdown files |
| **PDF editing** | Insert, delete, and reorder pages; set metadata; merge and split documents |
| **Drawing** | Draw lines, curves, rectangles, and circles; insert HTML boxes |
| **Encryption** | Open password-protected PDFs; save with RC4 or AES encryption |