Client-Side OCR v2.0.0
This major release brings PPU PaddleOCR model support, extends language support to 100+ languages, and includes critical fixes for model recognition and performance.
New Features
- PPU PaddleOCR Model Support: Full support for PPU PaddleOCR models with specialized preprocessing
- Extended Language Support: Expanded from 14 to 100+ languages with comprehensive model coverage
- Stack Overflow Prevention: Large documents are processed without triggering "Maximum call stack size exceeded" errors
- Enhanced Documentation:
  - Usage Guide with examples
  - API Reference
  - Troubleshooting Guide
  - Model Documentation
- Model Width Limiting: Input width for PPU models is automatically capped at 800px to prevent memory issues
- Improved Error Handling: Better error messages and recovery strategies
Bug Fixes
- PPU Model Recognition: Fixed a critical issue where PPU models returned gibberish text instead of correct predictions
  - Implemented proper grayscale conversion (red channel only)
  - Fixed dictionary indexing (0-based instead of 1-based)
- Stack Overflow Errors: Fixed "Maximum call stack size exceeded" errors when processing large documents
  - Replaced spread operators with loops for large arrays
  - Made debug output safer by skipping operations on large tensors
- Memory Management: Improved memory handling for large image processing
- TypeScript Compatibility: Fixed Float32Array type issues
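
The spread-operator fix listed above can be illustrated with a minimal sketch. Spreading a very large array into a function call, as in `Math.max(...values)` or `target.push(...source)`, passes every element as a separate argument, which can exceed the engine's call stack limit. The helpers below (`maxOf` and `appendAll`, hypothetical names that are not part of the library) show the loop-based alternative; the library's actual changes may differ.

```typescript
// Hypothetical helpers for illustration; not part of client-side-ocr.

/** Finds the maximum with a plain loop, so no argument list is built. */
function maxOf(values: Float32Array): number {
  let max = -Infinity;
  for (let i = 0; i < values.length; i++) {
    if (values[i] > max) max = values[i];
  }
  return max;
}

/** Appends items one at a time instead of target.push(...source). */
function appendAll(target: number[], source: ArrayLike<number>): void {
  for (let i = 0; i < source.length; i++) {
    target.push(source[i]);
  }
}

// A tensor-sized buffer: spreading this into Math.max(...) can throw
// a RangeError in many engines, while the loops below do not.
const big = new Float32Array(1000000);
big[123456] = 0.75;

const peak = maxOf(big);
const collected: number[] = [];
appendAll(collected, big);
```

The loop versions use constant stack space regardless of input size, which is why they are the safer choice for model outputs and image buffers.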
Documentation
- Added comprehensive usage documentation with real-world examples
- Created detailed API reference for all classes and methods
- Documented common problems and their solutions
- Added model architecture and selection guide
Technical Details
- PPU models now use red channel only for grayscale conversion
- PPU models use 0-based dictionary indexing
- Maximum width limited to 800px for PPU models
- Safer array operations throughout the codebase
- Enhanced preprocessing pipeline with model-specific normalization
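
The three PPU-specific rules above can be sketched as small standalone functions. This is an illustration under stated assumptions, not the library's code: the function names are hypothetical, the width limit is assumed to be applied by proportional downscaling (the notes do not say whether the image is scaled or cropped), and blank-token handling during decoding is left out because the notes do not describe it.

```typescript
// Illustrative only: these helpers are assumptions, not the library's API.

const MAX_PPU_WIDTH = 800; // width cap stated in the release notes

/** Grayscale from RGBA pixel data, taking the red channel only. */
function redChannelGray(rgba: Uint8ClampedArray): Uint8ClampedArray {
  const gray = new Uint8ClampedArray(Math.floor(rgba.length / 4));
  for (let i = 0; i < gray.length; i++) {
    gray[i] = rgba[i * 4]; // R of pixel i; G, B and A are ignored
  }
  return gray;
}

/** Scales a size down so that width <= maxWidth, keeping aspect ratio. */
function limitWidth(
  width: number,
  height: number,
  maxWidth: number = MAX_PPU_WIDTH
): { width: number; height: number } {
  if (width <= maxWidth) return { width, height };
  const scale = maxWidth / width;
  return { width: maxWidth, height: Math.max(1, Math.round(height * scale)) };
}

/** Maps predicted class indices to characters using 0-based lookup. */
function decodeIndices(indices: number[], dictionary: string[]): string {
  let text = "";
  for (const idx of indices) {
    const ch = dictionary[idx]; // 0-based: no "idx - 1" offset
    if (ch !== undefined) text += ch;
  }
  return text;
}
```

An off-by-one in the dictionary lookup shifts every predicted character to its neighbor, which is consistent with the gibberish output described under Bug Fixes.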
Installation
```bash
npm install client-side-ocr@2.0.0
```
Quick Start
```typescript
import { createRapidOCREngine } from 'client-side-ocr';
const ocr = createRapidOCREngine({
  language: 'en', // or any of 100+ languages
  modelVersion: 'PP-OCRv4'
});
await ocr.initialize();
const result = await ocr.processImage(imageFile);
console.log(result.text);
```
Acknowledgments
Special thanks to the RapidOCR and PaddleOCR teams for their excellent models, and to the ppu-paddle-ocr project for its TypeScript implementation references.
Full Changelog
See CHANGELOG.md for detailed changes.