Iteration Layer vs Unstructured
Unstructured is a data pipeline platform for RAG ingestion — great at ETL, but a different tool than a document-to-markdown API.
No credit card required — start with free trial credits
Why developers switch from Unstructured
Unstructured is built for ETL pipelines and RAG ingestion — not a simple document-to-markdown API.
Single API call, not a pipeline
Unstructured requires configuring sources, destinations, workflows, and connectors to process documents. We return clean markdown from a single POST request — no pipeline setup, no infrastructure.
Image description field
When you convert an image file, we return both the OCR-extracted markdown and a natural language description of what the image shows. Unstructured returns chunked elements without semantic image understanding.
EU hosting with GDPR compliance
Unstructured is US-based with no EU-only hosting option. We process all documents on EU servers with zero data retention and a Data Processing Agreement available for every customer.
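
To make the "single POST request" point concrete, here is a minimal sketch of what such a call could look like using only the Python standard library. The endpoint URL, the bearer-token header, and the idea of sending the raw file bytes as the request body are all assumptions made for illustration; this page does not specify them, so check the API reference for the real values.

```python
import urllib.request

# Hypothetical placeholders: the real endpoint, auth scheme, and payload
# format are not given on this page.
API_URL = "https://api.example.com/v1/document-to-markdown"
API_KEY = "YOUR_API_KEY"


def build_convert_request(file_bytes: bytes, content_type: str) -> urllib.request.Request:
    """Build (but do not send) one POST request carrying the raw file bytes."""
    return urllib.request.Request(
        API_URL,
        data=file_bytes,
        method="POST",
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": content_type,
        },
    )


def convert_to_markdown(file_bytes: bytes, content_type: str) -> str:
    """Send the request and return the response body decoded as UTF-8 text."""
    request = build_convert_request(file_bytes, content_type)
    with urllib.request.urlopen(request, timeout=60) as response:
        return response.read().decode("utf-8")
```

Under these assumptions, converting a file would come down to `convert_to_markdown(open("report.pdf", "rb").read(), "application/pdf")`, with no sources, destinations, or workflows to configure first.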
Feature-by-feature comparison
We went through the docs so you don't have to. Here's how every feature compares — including the ones where we're not the better choice.
| Feature | Iteration Layer | Unstructured |
|---|---|---|
| Markdown output | Clean markdown: returns well-structured markdown with preserved headings, tables, and lists from any document | Element-based: returns structured JSON elements that can be reassembled into markdown; designed for chunking, not direct markdown output |
| Image description | Yes: returns a natural language description of image content alongside OCR markdown for image files | No: text extraction only, with no semantic description of visual image content |
| API simplicity | Single endpoint: one POST request with a file returns markdown, with no configuration, workflows, or connectors needed | Pipeline setup: requires configuring sources, destinations, and workflows before processing documents |
| RAG chunking | Plain markdown: returns clean markdown; chunk it yourself or pass it directly to your pipeline | Built-in: multiple chunking strategies optimized for vector database ingestion with configurable chunk sizes |
| Source/destination connectors | API only: standalone REST API; bring your own storage and pipeline | 30+ connectors: direct integrations with S3, Google Drive, SharePoint, Pinecone, Weaviate, and more |
| Table preservation | Markdown tables: tables are extracted and rendered as clean markdown table syntax | Element-based: tables extracted as structured elements with HTML table support for complex layouts |
| Supported input formats | 40+ formats: PDF, Office, EPUB, RTF, LaTeX, email, Jupyter, images, and more, all through a single API endpoint | 64+ formats: supports 64+ file types including email, code files, and specialized formats |
| MCP server | Yes: MCP server available for integration with AI agents and assistants | Yes: MCP server available for workflow and partition operations |
| EU hosting | EU only: all processing happens exclusively on EU-hosted servers | US-based: processing infrastructure is US-based with no EU-only hosting option |
| Pricing model | Per page: simple, predictable per-page pricing with credit-based plans | Per page: free tier with 15,000 pages, then $0.03 per page pay-as-you-go |
| Infrastructure required | None: fully managed API with no deployment or infrastructure to manage | None: fully managed cloud platform with optional dedicated instances |
| GDPR / Data privacy | Zero retention: no files or results stored beyond temporary 90-day logs | SOC 2 / HIPAA: SOC 2 Type 2 and HIPAA certified, GDPR support available, but US-based processing |
| Data used for training | Never: your data is never used to train or improve AI models, guaranteed for all plans | Not documented: no public policy on whether hosted API customer data is used for model training |
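
The "RAG chunking" row says the markdown can be chunked on your side. As a rough sketch of one way to do that, the function below starts a new chunk at every heading up to a chosen level. The function name and the heading-based strategy are illustrative assumptions, not part of either product, and the sketch deliberately ignores edge cases such as `#` lines inside fenced code blocks.

```python
import re


def chunk_markdown_by_heading(markdown: str, max_level: int = 2) -> list[str]:
    """Split markdown into chunks, starting a new chunk at each heading
    of level 1..max_level. Deeper headings stay inside their parent chunk."""
    heading = re.compile(r"^#{1,%d}\s" % max_level)
    chunks: list[str] = []
    current: list[str] = []
    for line in markdown.splitlines():
        # A qualifying heading closes the chunk collected so far.
        if heading.match(line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    # Drop chunks that ended up empty (for example, leading blank lines).
    return [chunk for chunk in chunks if chunk]
```

Splitting on headings keeps each chunk aligned with a section of the document, which only works because the headings survive conversion. If you need token-bounded chunks for a specific embedding model, you would add a size limit on top of this.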
Pricing
Start with free trial credits. No credit card required.
Developer
For individuals & small projects
Startup
Save 40%
For growing teams
Business
Save 47%
For high-volume workloads
Or pay as you go from $0.022/credit with automatic volume discounts.
Still evaluating?
See how we compare — and where the competition still wins. Choosing the right tool shouldn't require a week of research.
Reducto
Reducto outputs markdown from US servers and charges per page — without an image description field.
LlamaParse
LlamaParse is US-based and per-page — and doesn't describe image content.
Mistral OCR
Mistral has best-in-class OCR and returns markdown, but doesn't describe image content and processes files from US servers.
Nanonets
Nanonets DocStrange outputs markdown, but has no image descriptions and no EU hosting option.
DocuPipe
DocuPipe extracts structured fields from documents — it doesn't produce clean, readable markdown.
AWS Textract
Textract returns raw strings and bounding boxes — not a markdown document ready to read or embed.
Azure Document Intelligence
Azure outputs model-specific field values, not clean markdown — and requires model selection or training first.
Google Document AI
Document AI requires a GCP project, processor selection, and S3-equivalent storage before you get any text out.
OlmOCR
OlmOCR requires a GPU, only supports English, and intentionally strips headers and footers.
PaddleOCR
PaddleOCR outputs markdown, but requires the PaddlePaddle framework and self-hosted infrastructure.
Tesseract
Tesseract outputs raw text — no headings, no tables, no document structure preserved.
Start building in minutes
Free trial credits included. No credit card required.