A powerful, privacy-focused OCR (Optical Character Recognition) web application that runs entirely in your browser. No server uploads, no data tracking - your documents stay on your device.
- 100% Client-Side Processing: All OCR processing happens in your browser
- Multi-Language Support: 100+ languages, including English, Chinese, Japanese, Korean, Tamil, Hindi, Arabic, and more
- High Performance: Uses ONNX Runtime with WebAssembly for fast processing
- Multiple Input Methods:
  - Drag & drop files
  - File selection
  - Camera capture (mobile)
  - URL input
  - Clipboard paste
- File Format Support:
  - Images: JPG, PNG, WebP, GIF, BMP, TIFF
  - Documents: PDF (with text-layer detection)
- Progressive Web App (PWA): Install and use offline
- Image Preprocessing:
  - Auto-enhancement detection
  - Grayscale conversion
  - Noise reduction
  - Contrast adjustment
  - Image sharpening
  - Auto-deskew
  - Background removal
- Performance Monitoring:
  - Real-time processing metrics
  - Stage-by-stage progress tracking
  - Resource usage monitoring
- Export Options:
  - Copy to clipboard
  - Export as TXT
  - Export as JSON (with coordinates)
- Dark Mode Support
- Responsive Design: Works on desktop, tablet, and mobile
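The TXT and JSON export options can be reproduced from a raw results array. Below is a minimal sketch; the `{ text, confidence, box }` result shape follows the usage example later in this README, and the helper names (`resultsToText`, `resultsToJSON`) are illustrative, not part of the library's API:

```javascript
// Convert an array of OCR results into the two export formats.
// Assumes each result has { text, confidence, box }; these helpers
// are illustrative only, not exports of client-side-ocr.
function resultsToText(results) {
  // Plain-text export: one recognized line per row
  return results.map(r => r.text).join('\n');
}

function resultsToJSON(results) {
  // JSON export keeps confidence scores and bounding-box coordinates
  return JSON.stringify(
    results.map(r => ({ text: r.text, confidence: r.confidence, box: r.box })),
    null,
    2
  );
}

const sample = [
  { text: 'Hello', confidence: 0.98, box: [[0, 0], [50, 0], [50, 20], [0, 20]] },
  { text: 'World', confidence: 0.95, box: [[0, 25], [52, 25], [52, 45], [0, 45]] }
];
console.log(resultsToText(sample)); // prints two lines: Hello, World
```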
Install from npm with your package manager of choice:

```bash
npm install client-side-ocr
# or
yarn add client-side-ocr
# or
pnpm add client-side-ocr
```

Or load the library directly from a CDN:

```html
<!-- Add to your HTML -->
<script type="module">
  import { RapidOCREngine } from 'https://unpkg.com/client-side-ocr@latest/dist/index.js';
</script>
```

Basic usage:

```js
import { RapidOCREngine } from 'client-side-ocr';

// Initialize the OCR engine
const ocr = new RapidOCREngine({
  lang: 'en',          // Language code
  version: 'PP-OCRv4', // Model version
  modelType: 'mobile'  // 'mobile' for speed, 'server' for accuracy
});

// Initialize models (one-time setup)
await ocr.initialize();

// Draw the image onto a canvas to get raw pixel data
const imageElement = document.getElementById('myImage');
const canvas = document.createElement('canvas');
const ctx = canvas.getContext('2d');
canvas.width = imageElement.width;
canvas.height = imageElement.height;
ctx.drawImage(imageElement, 0, 0);
const imageData = ctx.getImageData(0, 0, canvas.width, canvas.height);

// Process the image
const results = await ocr.process(
  imageData.data,
  imageData.width,
  imageData.height
);

// Results contain text, confidence scores, and bounding boxes
results.forEach(result => {
  console.log('Text:', result.text);
  console.log('Confidence:', result.confidence);
  console.log('Bounding box:', result.box);
});
```

Image preprocessing:

```js
import { ImagePreprocessor } from 'client-side-ocr';

// Auto-preprocess for best results
const { processed, appliedOptions } = await ImagePreprocessor.autoPreprocess(imageData);

// Or manually configure preprocessing
const preprocessed = await ImagePreprocessor.preprocess(imageData, {
  grayscale: true,
  denoise: true,
  contrast: true,
  contrastAlpha: 1.5,
  sharpen: true,
  deskew: true,
  threshold: true,
  thresholdValue: 127
});
```

Working with languages:

```js
// Get available languages
const languages = RapidOCREngine.getAvailableLanguages();

// Check if a language is supported
if (RapidOCREngine.isLanguageSupported('ta', 'PP-OCRv4')) {
  const tamilOCR = new RapidOCREngine({ lang: 'ta' });
}

// Multi-language text (Chinese example)
const chineseOCR = new RapidOCREngine({
  lang: 'ch',
  modelType: 'server' // Better for complex scripts
});
```

PDF processing:

```js
import { pdfProcessor } from 'client-side-ocr';

// Load and process a PDF
const pdfFile = document.getElementById('pdfInput').files[0];
const { pages, pdf } = await pdfProcessor.extractTextFromPDF(pdfFile, {
  scale: 2.0,   // Higher scale for better OCR
  maxPages: 10, // Limit pages
  onProgress: (progress) => {
    console.log(`Processing page ${progress.currentPage}/${progress.totalPages}`);
  }
});

// Check if the PDF has selectable text
const hasText = await pdfProcessor.hasSelectableText(pdf);
if (hasText) {
  const nativeText = await pdfProcessor.extractNativeText(pdf);
  console.log('Extracted text:', nativeText);
} else {
  // No text layer: run OCR on each rendered page
  for (const page of pages) {
    const results = await ocr.process(
      page.imageData.data,
      page.width,
      page.height
    );
  }
}

// Clean up
pdfProcessor.destroy(pdf);
```

Progress tracking and model management:

```js
// Set progress callback
ocr.setProgressCallback((progress) => {
  console.log(`${progress.stage}: ${Math.round(progress.progress * 100)}%`);
});

// Monitor model download progress
ocr.setDownloadProgressCallback((progress) => {
  console.log(`Downloading ${progress.file}: ${progress.progress}%`);
});

// Check if models are cached, and download them if not
const modelsAvailable = await ocr.areModelsAvailable();
if (!modelsAvailable) {
  await ocr.downloadModels();
}
```

Engine options:

```ts
interface OCREngineOptions {
  lang?: LangType;             // Language code (default: 'en')
  version?: OCRVersion;        // Model version (default: 'PP-OCRv4')
  modelType?: ModelType;       // 'mobile' | 'server' (default: 'mobile')
  config?: Partial<OCRConfig>; // Advanced configuration
  modelBasePath?: string;      // Custom model path
  enableWordBoxes?: boolean;   // Enable word-level boxes
}
```

RapidOCREngine methods:

- `initialize(): Promise<void>` - Initialize the OCR engine
- `process(imageData: Uint8ClampedArray, width: number, height: number): Promise<OCRResult[]>` - Process an image
- `setProgressCallback(callback: (progress: OCRProgress) => void): void` - Set progress callback
- `setDownloadProgressCallback(callback: (progress: DownloadProgress) => void): void` - Set download progress callback
- `areModelsAvailable(): Promise<boolean>` - Check if models are cached
- `downloadModels(): Promise<void>` - Download models if not cached
- `dispose(): void` - Clean up resources
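Since `dispose()` should run even when processing throws, a small wrapper can guarantee cleanup. This is an illustrative sketch (`withEngine` is not part of the client-side-ocr API); it relies only on the `initialize()` and `dispose()` methods:

```javascript
// Run work against an engine and guarantee dispose() is called,
// even if the work throws. `withEngine` is a hypothetical helper;
// it assumes only that the engine exposes initialize() and dispose().
async function withEngine(engine, work) {
  await engine.initialize();
  try {
    return await work(engine);
  } finally {
    engine.dispose(); // always release model/session resources
  }
}
```

Usage might look like `await withEngine(new RapidOCREngine({ lang: 'en' }), ocr => ocr.process(data, w, h))`.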
ImagePreprocessor methods:

- `preprocess(imageData: ImageData, options: PreprocessingOptions): Promise<ImageData>` - Apply preprocessing
- `autoPreprocess(imageData: ImageData): Promise<{ processed: ImageData, appliedOptions: PreprocessingOptions }>` - Auto-detect the best preprocessing
```ts
interface PreprocessingOptions {
  grayscale?: boolean;        // Convert to grayscale
  threshold?: boolean;        // Apply binary threshold
  thresholdValue?: number;    // Threshold value (0-255)
  denoise?: boolean;          // Remove noise
  denoiseStrength?: number;   // Noise reduction strength
  contrast?: boolean;         // Enhance contrast
  contrastAlpha?: number;     // Contrast factor
  contrastBeta?: number;      // Brightness adjustment
  sharpen?: boolean;          // Sharpen image
  deskew?: boolean;           // Auto-straighten text
  removeBackground?: boolean; // Simple background removal
  scale?: number;             // Image scale factor
}
```

Create a `.env` file in your project root:

```bash
# Base URL for deployment
VITE_BASE_URL=/client-ocr/

# Model CDN (optional, defaults to jsDelivr)
VITE_MODEL_CDN=https://cdn.jsdelivr.net/npm/

# Enable debug mode
VITE_DEBUG=false
```

Languages are configured in `src/core/language-models.ts`. Each language entry includes:

```ts
{
  name: string;             // Display name
  nativeName: string;       // Native script name
  direction: 'ltr' | 'rtl'; // Text direction
  models: {
    det: {...}, // Detection models
    rec: {...}, // Recognition models
    cls: {...}  // Classification models
  }
}
```

Prerequisites:

- Node.js 18+
- npm/yarn/pnpm
```bash
# Clone the repository
git clone https://github.com/yourusername/client-side-ocr.git
cd client-side-ocr

# Install dependencies
npm install

# Start development server
npm run dev

# Build for production
npm run build

# Preview production build
npm run preview
```

Project structure:

```
client-side-ocr/
├── src/
│   ├── core/        # Core OCR functionality
│   │   ├── rapid-ocr-engine.ts
│   │   ├── language-models.ts
│   │   ├── preprocessing/
│   │   └── postprocessing/
│   ├── workers/     # Web Workers for processing
│   ├── ui/          # React components
│   └── types/       # TypeScript types
├── public/          # Static assets
└── dist/            # Build output
```

Testing:

```bash
# Run tests
npm test

# Run tests with coverage
npm run test:coverage

# Run e2e tests
npm run test:e2e
```

Build and deploy to GitHub Pages:

```bash
npm run build
npm run deploy
```

Static hosting:

- Build the project: `npm run build`
- Serve the `dist` folder with any static file server: `npx serve dist -p 3000`

Docker:

```dockerfile
FROM nginx:alpine
COPY dist /usr/share/nginx/html
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
```

Build and run:

```bash
docker build -t client-ocr .
docker run -p 8080:80 client-ocr
```
- Use the appropriate model type:
  - `mobile`: faster; good for real-time processing
  - `server`: more accurate; better for complex documents
- Preprocess images:
  - Use auto-preprocessing for best results
  - Enable specific options based on your input
- Optimize for your use case:
  - For mobile: use lower-resolution images
  - For accuracy: use higher resolution and server models
- Cache models:
  - Models are cached in IndexedDB after the first download
  - Check `areModelsAvailable()` before processing
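The mobile-vs-server trade-off above can be automated with a simple device heuristic. A sketch under stated assumptions: the 4-core threshold is arbitrary and should be tuned for your workload, and `chooseModelType` is an illustrative helper, not part of the library:

```javascript
// Pick a model type from the device's reported logical core count.
// The 4-core threshold is an arbitrary assumption for this sketch.
// Falls back to 'mobile' when navigator.hardwareConcurrency is
// unavailable (e.g. older browsers or non-browser environments).
function chooseModelType(cores = globalThis.navigator?.hardwareConcurrency) {
  if (!Number.isFinite(cores) || cores <= 4) return 'mobile'; // speed-first default
  return 'server'; // enough headroom for the heavier, more accurate models
}
```

The result could then be passed straight into the constructor, e.g. `new RapidOCREngine({ lang: 'en', modelType: chooseModelType() })`.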
- Chrome 90+
- Firefox 88+
- Safari 15.4+
- Edge 90+
WebAssembly and Web Workers are required.
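These requirements can be feature-detected before initializing the engine, so unsupported browsers get a graceful message instead of a runtime error. A minimal sketch (`supportsClientSideOCR` is an illustrative helper, not a library export):

```javascript
// Detect the capabilities this app needs before loading any models.
// Returns false in environments missing either WebAssembly or Web Workers.
function supportsClientSideOCR() {
  const hasWasm = typeof WebAssembly === 'object'
    && typeof WebAssembly.instantiate === 'function';
  const hasWorkers = typeof Worker === 'function';
  return hasWasm && hasWorkers;
}
```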
We welcome contributions! Please see our Contributing Guide for details.
- Fork the repository
- Create your feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- RapidOCR for the OCR models
- ONNX Runtime Web for inference
- OpenCV.js for image processing
- PDF.js for PDF handling
- 📧 Email: [email protected]
- 🐛 Issues: GitHub Issues
- 💬 Discussions: GitHub Discussions
- 📖 Docs: Documentation