Document to Markdown

Convert documents and images to clean markdown

Send any of 40+ file formats (PDF, Office, EPUB, LaTeX, email, images, and more) and get clean markdown back. No schema required. It is the same OCR pipeline that powers Document Extraction, exposed as a standalone API.

Get Your Free API Key

No credit card required — start with free trial credits

Zero data retention · GDPR · Made & hosted in the EU · $60 free trial credits · No credit card required · 14-day money-back guarantee

One output feeds the next

Document to Markdown is part of a complete content pipeline. One key, one credit pool, and structured JSON responses designed to chain together.

Document Extraction

Document to Markdown

Audio Extraction (Coming Soon)

Website Extraction (Coming Soon)
Fits into your existing stack

Native SDKs for Node, Python, and Go. OpenAPI spec for everything else. MCP server for AI agents and Claude Code skills. n8n community node for visual workflows.

Mix and match freely

Extract data from a document, generate visuals from the results, then compile everything into a finished report. Mix, match, and build your own pipeline.

Three steps to your first conversion

Send your document

Upload any document via URL or base64: PDF, Office, EPUB, LaTeX, email, images, and more. Every supported format goes through the same endpoint.

40+ formats: PDF, Office, EPUB, LaTeX, EML, Jupyter, images
Base64 or URL file input
Single endpoint for all formats
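
The two input modes can be sketched with nothing but the Python standard library. The endpoint and the URL file object are taken from the curl example on this page; the base64 file object is an assumption (the `"base64"` type and key name are hypothetical), so check the API reference before relying on it. The request is only prepared here, not sent.

```python
import base64
import json
import urllib.request

API_URL = "https://api.iterationlayer.com/document-to-markdown/v1/convert"


def url_file(name: str, url: str) -> dict:
    """File object for a document fetched by URL (shape from the curl example)."""
    return {"type": "url", "name": name, "url": url}


def base64_file(name: str, content: bytes) -> dict:
    """File object for inline content.

    ASSUMPTION: the "base64" type and key are hypothetical field names.
    """
    encoded = base64.b64encode(content).decode("ascii")
    return {"type": "base64", "name": name, "base64": encoded}


def build_request(file: dict, api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST request carrying one file object."""
    body = json.dumps({"file": file}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_request(
    url_file("invoice.pdf", "https://example.com/invoice.pdf"),
    "YOUR_API_KEY",
)
print(request.get_method(), request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would send it; the official SDKs wrap the same call.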

We parse, OCR, and describe

The document is parsed, scanned pages are run through OCR, and tables are extracted. Image files also receive a natural language description of their visual content.

Automatic OCR for scanned pages and photos
Tables converted to markdown table syntax
Image files get a visual content description
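
Because tables come back in markdown pipe syntax, they are easy to read back into rows. The sketch below is a minimal parser for simple pipe tables, not part of the API; the helper names are hypothetical, and it does not handle escaped pipes inside cells. The sample input is the invoice markdown from the example response on this page.

```python
def split_row(line: str) -> list[str]:
    """Split one pipe-delimited row into trimmed cell strings."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def is_separator(line: str) -> bool:
    """True for the header separator row, e.g. |---|---| or |:--|--:|."""
    cells = split_row(line)
    return all(cell and set(cell) <= set("-:") for cell in cells)


def parse_markdown_tables(markdown: str) -> list[list[dict[str, str]]]:
    """Return every pipe table as a list of row dicts keyed by column header."""
    tables = []
    lines = markdown.splitlines()
    i = 0
    while i < len(lines):
        line = lines[i].strip()
        # A table is a header row immediately followed by a separator row.
        if line.startswith("|") and i + 1 < len(lines) and is_separator(lines[i + 1]):
            headers = split_row(line)
            rows = []
            i += 2
            while i < len(lines) and lines[i].strip().startswith("|"):
                rows.append(dict(zip(headers, split_row(lines[i]))))
                i += 1
            tables.append(rows)
        else:
            i += 1
    return tables


sample = (
    "# Invoice\n\n"
    "| Description | Qty | Unit Price | Total |\n"
    "|---|---|---|---|\n"
    "| Consulting | 10h | $100.00 | $1,000.00 |\n"
    "| Support | 5h | $80.00 | $400.00 |\n\n"
    "**Total: $1,400.00**"
)
tables = parse_markdown_tables(sample)
print(tables[0][0]["Total"])  # prints $1,000.00
```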

Get clean markdown

Receive a JSON result with the file name, MIME type, and extracted markdown. Image files also include a plain-language description field.

Clean markdown preserving document structure
Suitable for LLM pipelines and RAG ingestion
Image description for visual content
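
Reading the result might look like the sketch below. The `success`, `data`, `name`, `mime_type`, and `markdown` fields match the example response on this page; the `description` key for image files is an assumed name, and the shape of a failed response is not documented here, so the error branch is only a placeholder.

```python
import json
from typing import Optional


def extract_markdown(response_text: str) -> tuple[str, Optional[str]]:
    """Return (markdown, description) from a conversion response body.

    ASSUMPTION: "description" is a hypothetical key for the image
    description; it is None when the key is absent.
    """
    payload = json.loads(response_text)
    if not payload.get("success"):
        # The real error format is not shown on this page.
        raise ValueError("conversion did not succeed")
    data = payload["data"]
    return data["markdown"], data.get("description")


# Sample body shaped like the example response on this page.
sample_response = json.dumps(
    {
        "success": True,
        "data": {
            "name": "invoice.pdf",
            "mime_type": "application/pdf",
            "markdown": "# Invoice\n\n**Total: $1,400.00**",
        },
    }
)

markdown, description = extract_markdown(sample_response)
print(markdown.splitlines()[0])  # prints # Invoice
```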

Intelligent Parsing

The API automatically selects the best parsing approach for your document. Dense tables, multi-column layouts, and mixed content are handled without any configuration.

Clean Markdown Output

Headings, paragraphs, tables, and lists are preserved as clean markdown syntax. The output is ready to display, embed in a knowledge base, or pass to an LLM — no post-processing needed.

Deep Content Understanding

Images and scanned documents aren't treated only as pixel grids to OCR. The API also interprets what they depict (product photos, charts, diagrams) and returns a natural language description alongside the extracted text.

Built-In OCR

Scanned PDFs and image files are automatically run through OCR. You get readable markdown regardless of whether the source is text or pixels.

All Document Formats

40+ file formats — PDF, DOCX, PPTX, ODT, ODS, XLSX, EPUB, LaTeX, EML, Jupyter, images, and more — all handled by the same endpoint. No format-specific setup or pre-processing required.

No Model Training

Your documents are never used to train or improve AI models. This is guaranteed for all plans — not gated behind an enterprise contract.

Real-world pipelines, ready to ship

Each recipe chains multiple APIs into a complete workflow. Pick one, tweak it, and deploy — or use it as a starting point for your own pipeline.

Convert Contract to Markdown

Convert a contract PDF to clean markdown for clause extraction or LLM analysis.

Convert Document for Knowledge Base

Convert external documents — specs, contracts, reports — to markdown for knowledge base ingestion.

Convert Document for RAG Ingestion

Convert a document to clean markdown suitable for chunking and embedding in a RAG pipeline.

Convert Invoice to Markdown

Convert a PDF invoice to clean markdown for LLM processing or document pipelines.

Convert Resume to Markdown

Convert a resume PDF to clean markdown for LLM parsing or candidate pipelines.

Convert Document to Markdown

Convert PDF, DOCX, HTML, or image documents to clean, structured Markdown.

Preprocess Document for LLM Classification

Convert a document to markdown and classify it with an LLM in a single pipeline.

Browse all recipes
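
For the RAG ingestion recipe, one common way to chunk the returned markdown is to split it at headings so each chunk keeps its section title. This is a minimal sketch of that idea, not the recipe's actual implementation; the function name is hypothetical, and lines starting with `#` inside fenced code would be mistaken for headings.

```python
import re

# ATX headings: one to six '#' characters followed by whitespace and a title.
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_by_heading(markdown: str) -> list[dict]:
    """Split markdown into chunks, starting a new chunk at every heading."""
    chunks = []
    title = ""
    buffer = []

    def flush() -> None:
        # Emit the buffered section, skipping sections that are empty.
        text = "\n".join(buffer).strip()
        if text:
            chunks.append({"title": title, "text": text})
        buffer.clear()

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            title = match.group(2).strip()
        buffer.append(line)
    flush()
    return chunks


document = "# Terms\n\nPayment is due in 30 days.\n\n## Liability\n\nLimited to fees paid."
for chunk in chunk_by_heading(document):
    print(chunk["title"], len(chunk["text"]))
```

Each chunk can then be embedded on its own, with the title stored as metadata for retrieval.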

One n8n node for your entire pipeline

Most n8n document workflows chain three or four separate services. The Iteration Layer community node covers extraction, transformation, and generation in a single install — wire up multi-step pipelines visually instead of writing glue code.

n8n Community Node · Read the Guide

Start building right now

One API call, one credit deducted. Chains naturally with our other APIs — pipe the output of one into the next without glue code. You'll be up and running in minutes.

Full OpenAPI 3.1 specification available for code generation and IDE integration.
MCP server support for seamless integration with AI agents and tools.
Comprehensive documentation with examples for every field type and edge case.

Read the docs

Request (curl)

curl -X POST \
  https://api.iterationlayer.com/document-to-markdown/v1/convert \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "file": {
    "type": "url",
    "name": "invoice.pdf",
    "url": "https://example.com/invoice.pdf"
  }
}'

Response

{
  "success": true,
  "data": {
    "name": "invoice.pdf",
    "mime_type": "application/pdf",
    "markdown": "# Invoice\n\n**Invoice Number:** INV-2024-0042\n\n**Date:** 2024-03-15\n\n| Description | Qty | Unit Price | Total |\n|---|---|---|---|\n| Consulting | 10h | $100.00 | $1,000.00 |\n| Support | 5h | $80.00 | $400.00 |\n\n**Total: $1,400.00**"
  }
}

Request (TypeScript)

import { IterationLayer } from "iterationlayer";

const client = new IterationLayer({
  apiKey: "YOUR_API_KEY",
});

const result = await client.convertToMarkdown({
  file: {
    type: "url",
    name: "invoice.pdf",
    url: "https://example.com/invoice.pdf",
  },
});

Request (Python)

from iterationlayer import IterationLayer

client = IterationLayer(
    api_key="YOUR_API_KEY"
)

result = client.convert_to_markdown(
    file={
        "type": "url",
        "name": "invoice.pdf",
        "url": "https://example.com/invoice.pdf",
    }
)

Request (Go)

import il "github.com/iterationlayer/sdk-go"

client := il.NewClient("YOUR_API_KEY")

result, err := client.ConvertToMarkdown(il.ConvertRequest{
  File: il.NewFileFromURL(
    "invoice.pdf",
    "https://example.com/invoice.pdf",
  ),
})

Official SDKs for TypeScript, Python, and Go

Install the SDK, set your API key, and start chaining requests. Full type safety, automatic retries, and idiomatic error handling included.

Your data stays in the EU

Your data is processed on EU servers and never stored beyond temporary logs. Zero retention, GDPR-compliant by design, with a Data Processing Agreement available for every customer. Learn more about our security practices.

No data storage, no model training

We don't store your files or processing results, and your data is never used to train or improve AI models. Logs are automatically deleted after 90 days.

EU-hosted infrastructure

All processing runs on servers located in the European Union. Your data never leaves the EU.

GDPR-compliant by design

Full compliance with EU data protection regulations. Data Processing Agreement available for all customers.

Pricing

Start with free trial credits. No credit card required.

Developer

For individuals & small projects

$29.99 /month

1,000 credits included

Startup

Save 40%

For growing teams

$119.99 /month

5,000 credits included

Business

Save 47%

For high-volume workloads

$319.99 /month

15,000 credits included

Or pay as you go from $0.022/credit with automatic volume discounts.
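
To compare the plans with the pay-as-you-go rate, it can help to work out the effective price per included credit. The sketch below uses only the prices and credit counts listed on this page; the helper name is hypothetical, and overage or volume discounts are not modeled.

```python
# Monthly price in USD and included credits, as listed on this page.
PLANS = {
    "Developer": (29.99, 1_000),
    "Startup": (119.99, 5_000),
    "Business": (319.99, 15_000),
}


def price_per_credit(monthly_price: float, credits: int) -> float:
    """Effective USD price of one included credit."""
    return monthly_price / credits


for name, (price, credits) in PLANS.items():
    print(f"{name}: ${price_per_credit(price, credits):.4f} per credit")
# Developer: $0.0300 per credit
# Startup: $0.0240 per credit
# Business: $0.0213 per credit
```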

All APIs included · Free trial credits per API · Project-based budget caps · Auto overage billing

See full pricing

Frequently asked questions

What file formats are supported?

The API accepts 40+ file formats including PDF, DOCX, PPTX, ODT, ODS, XLSX, EPUB, CSV, TSV, HTML, LaTeX, EML, Jupyter notebooks, and all common image formats. Scanned documents are processed with built-in OCR.

What is the difference between this and Document Extraction?

Document to Markdown runs only the ingestion step — it converts files to clean markdown. Document Extraction builds on this by also applying a schema to extract specific fields as structured JSON. Use Document to Markdown when you want the content itself; use Document Extraction when you want specific named values.

Why does the markdown include an image description?

For image files, the API runs both OCR (to extract any text) and a vision model (to describe the visual content). The description is returned as a separate field so you can use it in your own downstream processing.

How many files can I send per request?

Up to 20 files per request. Each file gets its own result in the response array. The order of results matches the order of the input files.
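
With a cap of 20 files per request and results returned in input order, larger jobs can be split into batches and each result matched back to its file by position. This is a minimal sketch of that bookkeeping; the helper names are hypothetical, and the multi-file request body itself is not shown on this page, so no request is built here.

```python
from typing import Iterator

MAX_FILES_PER_REQUEST = 20  # limit stated in the FAQ answer above


def batches(files: list[dict], size: int = MAX_FILES_PER_REQUEST) -> Iterator[list[dict]]:
    """Yield consecutive slices of at most `size` files, preserving order."""
    for start in range(0, len(files), size):
        yield files[start:start + size]


def pair_results(files: list[dict], results: list[dict]) -> list[tuple[dict, dict]]:
    """Match each input file with its result, relying on identical ordering."""
    if len(files) != len(results):
        raise ValueError("expected exactly one result per input file")
    return list(zip(files, results))


files = [
    {"type": "url", "name": f"doc-{i}.pdf", "url": f"https://example.com/doc-{i}.pdf"}
    for i in range(45)
]
print([len(batch) for batch in batches(files)])  # prints [20, 20, 5]
```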

Is the output suitable for LLMs?

Yes. The markdown format is the same one the Document Extraction API uses internally as input to LLM extraction. Tables, structure, and content are preserved in a way that models read reliably.

Still evaluating?

See how we compare, and where the competition still wins. Choosing the right tool shouldn't require a week of research.

Reducto
LlamaParse
Mistral OCR
Nanonets
DocuPipe
Unstructured
AWS Textract
Azure Document Intelligence
Google Document AI
OlmOCR
PaddleOCR
Tesseract

Built for how you work

Whether you're building pipelines in code, automating workflows, orchestrating AI agents, or shipping client projects, Iteration Layer fits your process.

Developers
Operations Teams
AI Agents
Agencies