Product
LICENSE
AGPLFree to use under the GNU Affero General Public License. Requires open-sourcing your application if distributed.CommercialPaid license for proprietary or commercial use without AGPL obligations.
AGPLFree to use under the GNU Affero General Public License. Requires open-sourcing your application if distributed.CommercialPaid license for proprietary or commercial use without AGPL obligations.
AGPLFree to use under the GNU Affero General Public License. Requires open-sourcing your application if distributed.CommercialPaid license for proprietary or commercial use without AGPL obligations.
CommercialPyMuPDF Pro is available under a Commercial license only. No AGPL option is available — designed for proprietary and enterprise use.
SOURCE CODE
Open SourceSource code is publicly available and can be inspected, modified, and contributed to.
Open SourceSource code is publicly available and can be inspected, modified, and contributed to.
Open SourceSource code is publicly available and can be inspected, modified, and contributed to.
-Source code is not publicly available.
INPUT FILES
PDF, XPS, EPUB, CBZ, MOBI, FB2, SVG, TXT, ImagePDF, XPS, EPUB, CBZ, MOBI, FB2, SVG, TXT, and Image formats are natively supported.
DOC/DOCX, XLS/XLSX, PPT/PPTX, HWP/HWPXDOC/DOCX, XLS/XLSX, PPT/PPTX, HWP/HWPX are not supported in this product.
PDF, XPS, EPUB, CBZ, MOBI, FB2, SVG, TXT, ImagePDF, XPS, EPUB, CBZ, MOBI, FB2, SVG, TXT, and Image formats are natively supported.
DOC/DOCX, XLS/XLSX, PPT/PPTX, HWP/HWPXDOC/DOCX, XLS/XLSX, PPT/PPTX, HWP/HWPX are not supported in this product.
PDF, XPS, EPUB, CBZ, MOBI, FB2, SVG, TXT, ImagePDF, XPS, EPUB, CBZ, MOBI, FB2, SVG, TXT, and Image formats are natively supported.
DOC/DOCX, XLS/XLSX, PPT/PPTX, HWP/HWPXDOC/DOCX, XLS/XLSX, PPT/PPTX, HWP/HWPX are not supported in this product.
PDF, XPS, EPUB, CBZ, MOBI, FB2, SVG, TXT, ImagePDF, XPS, EPUB, CBZ, MOBI, FB2, SVG, TXT, and Image formats are natively supported.
DOC/DOCX, XLS/XLSX, PPT/PPTX, HWP/HWPXDOC/DOCX, XLS/XLSX, PPT/PPTX, HWP/HWPX supported via conversion layer.
OUTPUT FILES
PDF, SVG, ImageGenerates PDF, SVG, and Image formats directly.
Markdown, JSON, TXTMarkdown, JSON, and TXT are supported — ideal for structured or AI-ready output.
PDF, SVG, ImageGenerates PDF, SVG, and Image formats directly.
Markdown, JSON, TXTMarkdown, JSON, and TXT are not available in this product.
PDF, SVG, ImageGenerates PDF, SVG, and Image formats directly.
Markdown, JSON, TXTMarkdown, JSON, and TXT are supported — ideal for structured or AI-ready output.
PDF, SVG, ImageGenerates PDF, SVG, and Image formats directly.
Markdown, JSON, TXTMarkdown, JSON, and TXT available for structured or AI-ready output.
PAGE ANALYSIS
Advanced Page AnalysisUses trained data for enhanced structural recognition and superior layout results.
Basic Page AnalysisReturns document structure including layout and element positions.
Advanced Page AnalysisUses trained data for enhanced structural recognition and superior layout results.
All IncludedIncludes both basic document structure detection and advanced trained-data analysis.
TEXT EXTRACTION
Advanced Text ExtractionExtracts text with structure tags (headings, lists, tables), page layout analysis, and semantic understanding. Includes superior table extraction with full cell structure and data type recognition.
Basic Text ExtractionExtracts text with structured layout information and bounding box data. Includes basic table extraction.
Advanced Text ExtractionExtracts text with structure tags (headings, lists, tables), page layout analysis, and semantic understanding. Includes superior table extraction with full cell structure and data type recognition.
All IncludedIncludes both basic structured extraction and advanced semantic text extraction with superior table extraction.
IMAGE EXTRACTION
Advanced Image ExtractionAdvanced detection and rendering of image areas on the page — saves to disk or embeds in Markdown output.
Basic Image ExtractionExtracts embedded images from PDF pages.
Advanced Image ExtractionAdvanced detection and rendering of image areas on the page — saves to disk or embeds in Markdown output.
All IncludedIncludes both basic image extraction and advanced image area detection and rendering.
VECTOR EXTRACTION
Advanced Vector ExtractionSuperior detection of picture areas with precise vector element identification.
Basic Vector ExtractionExtracts and clusters vector graphics from PDF pages.
Advanced Vector ExtractionSuperior detection of picture areas with precise vector element identification.
All IncludedIncludes both basic vector extraction/clustering and superior picture area detection.
OCR
Automatic OCRAutomatically applies OCR based on page content analysis — no manual trigger needed.
On-demandOn-demand invocation of built-in Tesseract for text detection on pages or images.
Automatic OCRAutomatically applies OCR based on page content analysis — no manual trigger needed.
All IncludedIncludes both on-demand Tesseract invocation and automatic OCR based on page content analysis.