A high-performance invoice processing and renaming system that uses GPU-accelerated OCR with Ollama for intelligent document analysis.
- GPU-Accelerated Processing: Utilizes NVIDIA GPUs for fast invoice processing
- Intelligent OCR: Combines PyPDF2, Tesseract, and Ollama LLM for accurate text extraction
- Automated Renaming: Generates structured filenames based on accounting codes
- Batch Processing: Process multiple PDF files in folders automatically
- REST API Endpoints: HTTP API for integration with web applications and external systems
- Multi-Language Support: Handles German and English invoices
- No Cloud Dependencies: Runs entirely on local infrastructure
main.py βββΊ GPUAwareInvoiceOCRProcessor βββΊ Ollama (GPU) βββΊ Structured Data
β β
ββββΊ AccountingManager βββΊ Filename Generation βββΊ File Copy βββ
sudo apt-get install tesseract-ocr tesseract-ocr-eng tesseract-ocr-deu poppler-utilscurl -fsSL https://ollama.ai/install.sh | shpip install -r requirements.txt-
Clone and Setup Environment:
git clone <repository> cd invoice_renaming python -m venv venv source venv/bin/activate pip install -r requirements.txt
-
Install System Dependencies:
sudo apt-get install tesseract-ocr tesseract-ocr-eng tesseract-ocr-deu poppler-utils
-
Setup Ollama with GPU:
# Install Ollama curl -fsSL https://ollama.ai/install.sh | sh # Stop system service and start with GPU sudo systemctl stop ollama CUDA_VISIBLE_DEVICES=0 ollama serve & # Pull required model ollama pull llama3.1:8b
-
Prepare Input Files:
- Place PDF invoices in
input/directory - Ensure
input/accounting_codes.csvcontains your accounting codes
- Place PDF invoices in
python main.py --prompt "Process invoice" --pdf "input/your_invoice.pdf"# Make script executable
chmod +x process_folder.sh
# Process all PDFs in input/digital/ folder
./process_folder.sh
# Process specific folder
./process_folder.sh input/mobile_photos# Start the Flask server
python simple_flask_server.py
# Test the API (in another terminal)
curl -X POST http://localhost:5001/invoice \
-H "Content-Type: application/json" \
-d '{"prompt": "Process invoice input/pdffile11101.pdf"}'python main.py --prompt "Process the invoice input/pdffile11101.pdf and rename it according to our accounting system"python main.py --prompt "Process invoice" --pdf "input/digital/invoice.pdf"python main.py --prompt "Process invoice" --pdf "input/invoice.pdf" --codes "custom_codes.csv"# Process all PDFs in input/digital/ folder (default)
./process_folder.sh
# Process PDFs in specific folder
./process_folder.sh input/mobile_photos
# Process PDFs in any custom folder
./process_folder.sh /path/to/your/invoices# Run in background with logging
./process_folder.sh input/digital > logs/batch_$(date +%Y%m%d_%H%M).log 2>&1 &
# Monitor progress
tail -f logs/batch_*.log
# Check completion status
ps aux | grep process_folder- Small Batch (10 files): 2-5 minutes
- Medium Batch (50 files): 10-25 minutes
- Large Batch (100+ files): 20-50 minutes
- Success Rate: 95%+ for digital PDFs
- GPU Memory Usage: 5-9GB during processing
π Batch Invoice Processing
==========================
π Found 37 PDF files in input/digital
π Starting batch processing...
π Processing: invoice1.pdf
β
Success: invoice1.pdf
π Processing: invoice2.pdf
β
Success: invoice2.pdf
---
π Batch Processing Complete!
β
Processed: 35 files
β Failed: 2 files
π Check output/ folder for resultsThe system provides REST API endpoints for integration with web applications and external systems.
# Start the simplified server (bypasses AutoGen complexity)
python simple_flask_server.py# Start the full server with AutoGen agents
python src/invoice_team/flask_server.py# Check if server is running
curl -X GET http://localhost:5001/healthResponse:
{
"status": "healthy",
"service": "invoice-processing"
}# Basic invoice processing
curl -X POST http://localhost:5001/invoice \
-H "Content-Type: application/json" \
-d '{
"prompt": "Process invoice input/pdffile11101.pdf"
}'Response:
{
"success": true,
"original_file": "pdffile11101.pdf",
"new_filename": "20250309 OND295 Microsoft Osterreich GmbH - Billing Invoice Nr. G081952054.pdf",
"output_path": "output/20250309 OND295 Microsoft Osterreich GmbH - Billing Invoice Nr. G081952054.pdf",
"invoice_data": {
"date": "2025-03-09",
"number": "G081952054",
"description": "Billing for the period 01022025 - 280220",
"amount": "β¬1,234.56",
"currency": "EUR",
"issuer": "Microsoft Osterreich GmbH",
"recipient": "Your Company"
},
"accounting_code": "OND295",
"gpu_accelerated": true,
"processing_time": 8.5
}curl -X POST http://localhost:5001/invoice \
-H "Content-Type: application/json" \
-d '{
"prompt": "Process invoice input/digital/company_invoice.pdf"
}'curl -X POST http://localhost:5001/invoice \
-H "Content-Type: application/json" \
-d '{
"prompt": "Process invoice input/invoice.pdf using custom_codes.csv"
}'# Test with the default sample file
curl -X POST http://localhost:5001/invoice \
-H "Content-Type: application/json" \
-d '{
"prompt": "Read and analyze the PDF input/pdffile11101.pdf and rename it according to our accounting system"
}' | jq '.'# Create a simple test script
cat > test_api.sh << 'EOF'
#!/bin/bash
echo "π§ͺ Testing Invoice Processing API"
echo "================================"
# Test health endpoint
echo "π Health Check:"
curl -s http://localhost:5001/health | jq '.'
echo -e "\nπ Processing Sample Invoice:"
curl -s -X POST http://localhost:5001/invoice \
-H "Content-Type: application/json" \
-d '{"prompt": "Process invoice input/pdffile11101.pdf"}' | jq '.'
EOF
chmod +x test_api.sh
./test_api.shcurl -X POST http://localhost:5001/invoice \
-H "Content-Type: application/json" \
-d '{}'Response:
{
"error": "Missing prompt in request body"
}curl -X POST http://localhost:5001/invoice \
-H "Content-Type: application/json" \
-d '{
"prompt": "Process invoice input/nonexistent.pdf"
}'Response:
{
"error": "PDF file not found: input/nonexistent.pdf"
}# Watch server output in real-time
tail -f logs/flask_server.log
# Monitor GPU usage during API calls
watch -n 1 nvidia-smi# Simple load test with multiple concurrent requests
for i in {1..5}; do
curl -X POST http://localhost:5001/invoice \
-H "Content-Type: application/json" \
-d '{"prompt": "Process invoice input/pdffile11101.pdf"}' &
done
waitimport requests
import json
def process_invoice_api(pdf_path):
url = "http://localhost:5001/invoice"
payload = {"prompt": f"Process invoice {pdf_path}"}
response = requests.post(url, json=payload)
if response.status_code == 200:
return response.json()
else:
return {"error": response.text}
# Usage
result = process_invoice_api("input/my_invoice.pdf")
print(json.dumps(result, indent=2))const axios = require('axios');
async function processInvoice(pdfPath) {
try {
const response = await axios.post('http://localhost:5001/invoice', {
prompt: `Process invoice ${pdfPath}`
});
return response.data;
} catch (error) {
return { error: error.message };
}
}
// Usage
processInvoice('input/my_invoice.pdf')
.then(result => console.log(JSON.stringify(result, null, 2)));- GPU Acceleration: ~5-7GB GPU memory usage
- Processing Time: 6-28 seconds per invoice (depending on complexity)
- Accuracy: High accuracy with structured data extraction
- Supported Formats: PDF (digital and scanned)
invoice_renaming/
βββ main.py # Main entry point
βββ process_folder.sh # Batch processing script
βββ gpu_aware_ocr_solution.py # GPU-aware OCR processor
βββ mcp_filesystem_server.py # Accounting code management
βββ mcp_invoice_extraction_server.py # Invoice extraction server
βββ input/ # Input invoices
β βββ accounting_codes.csv # Accounting codes
β βββ digital/ # Digital PDF invoices
β βββ mobile_photos/ # Scanned/photographed invoices
βββ output/ # Processed and renamed invoices
βββ logs/ # Application logs
βββ requirements.txt # Python dependencies
Invoices are renamed using the format:
YYYYMMDD ACCOUNTING_CODE COMPANY - Description Invoice Nr. [Number].pdf
Example:
20250309 OND295 Microsoft Osterreich GmbH - Billing for the period 01022025 - 280220 Invoice Nr. G081952054.pdf
The system extracts:
- Date: Invoice date
- Number: Invoice number
- Description: Invoice description/purpose
- Amount: Total amount
- Currency: Currency code
- Issuer: Company issuing the invoice
- Recipient: Company receiving the invoice
- VAT Numbers: Tax identification numbers
- Recommended: NVIDIA GPU with 8GB+ VRAM
- Tested: NVIDIA A100-SXM4-40GB
- Fallback: CPU processing (slower but functional)
Comprehensive logging includes:
- GPU detection and utilization
- Processing times and performance metrics
- Error handling and debugging information
- File operations and transformations
# Check GPU availability
nvidia-smi
# Restart Ollama with GPU
sudo systemctl stop ollama
CUDA_VISIBLE_DEVICES=0 ollama serve &# Pull required model
ollama pull llama3.1:8b
# List available models
ollama list# Fix file permissions
chmod +x main.py
sudo chown -R $USER:$USER invoice_renaming/- REST API interface
- Docker containerization
- Batch processing capabilities
- Web UI for invoice management
- Additional language support
- Cloud deployment options
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
- Ollama for local LLM inference
- Tesseract OCR for text extraction
- PyPDF2 for PDF processing
- NVIDIA for GPU acceleration capabilities