v2.0.0 - PPU Models, 100+ Languages & Major Improvements

siva-sub released this 29 Jul 06:29
· 4 commits to master since this release

🚀 Client-Side OCR v2.0.0

This major release brings PPU PaddleOCR model support, extends language support to 100+ languages, and includes critical fixes for model recognition and performance.

✨ New Features

  • PPU PaddleOCR Model Support: Full support for PPU PaddleOCR models with specialized preprocessing
  • Extended Language Support: Expanded from 14 to 100+ languages with comprehensive model coverage
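  • Stack Overflow Prevention: Safe handling of large documents without "Maximum call stack size exceeded" errors
  • Enhanced Documentation: Usage guide with real-world examples, API reference, troubleshooting notes, and a model selection guide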
  • Model Width Limiting: Automatic width limiting for PPU models to prevent memory issues
  • Improved Error Handling: Better error messages and recovery strategies

🐛 Bug Fixes

  • PPU Model Recognition: Fixed critical issue where PPU models were returning gibberish text instead of correct predictions
    • Implemented proper grayscale conversion (red channel only)
    • Fixed dictionary indexing (0-based instead of 1-based)
  • Stack Overflow Errors: Fixed "Maximum call stack size exceeded" errors when processing large documents
    • Replaced spread operators with loops for large arrays
    • Made debug output safer by skipping operations on large tensors
  • Memory Management: Improved memory handling for large image processing
  • TypeScript Compatibility: Fixed Float32Array type issues
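
The spread-operator fix above can be illustrated with a minimal sketch. The helper names `maxOfLargeArray` and `appendAll` are hypothetical and not part of the library's API. They only show the general pattern: `Math.max(...values)` or `push(...source)` passes every element as a separate function argument, which can throw "Maximum call stack size exceeded" on large tensors, whereas a plain loop does not.

```typescript
// Hypothetical helper (not from client-side-ocr): find the maximum of a large
// typed array without spreading it into function arguments.
function maxOfLargeArray(values: Float32Array): number {
  let max = -Infinity;
  for (let i = 0; i < values.length; i++) {
    if (values[i] > max) {
      max = values[i];
    }
  }
  return max; // -Infinity for an empty array
}

// Hypothetical helper: append all elements of `source` to `target`
// one at a time instead of calling target.push(...source).
function appendAll(target: number[], source: ArrayLike<number>): void {
  for (let i = 0; i < source.length; i++) {
    target.push(source[i]);
  }
}
```

The loop versions use constant stack space regardless of array length, so they stay safe for tensors with millions of elements.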

📚 Documentation

  • Added comprehensive usage documentation with real-world examples
  • Created detailed API reference for all classes and methods
  • Documented common problems and their solutions
  • Added model architecture and selection guide

🔧 Technical Details

  • PPU models now use red channel only for grayscale conversion
  • PPU models use 0-based dictionary indexing
  • Maximum width limited to 800px for PPU models
  • Safer array operations throughout the codebase
  • Enhanced preprocessing pipeline with model-specific normalization
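
As a rough illustration of two of the points above, the following sketch shows red-channel grayscale conversion and the 800px width cap. The names `MAX_PPU_WIDTH`, `redChannelGrayscale`, and `limitWidth` are hypothetical, and the sketch assumes RGBA input (as produced by canvas `ImageData`) and aspect-ratio-preserving downscaling. The library's actual implementation may differ.

```typescript
// Hypothetical constant: the release notes state an 800px cap for PPU models.
const MAX_PPU_WIDTH = 800;

// Use only the red channel of RGBA pixel data as the grayscale value.
// Assumes 4 bytes per pixel in R, G, B, A order.
function redChannelGrayscale(rgba: Uint8ClampedArray): Uint8ClampedArray {
  const pixelCount = Math.floor(rgba.length / 4);
  const gray = new Uint8ClampedArray(pixelCount);
  for (let i = 0; i < pixelCount; i++) {
    gray[i] = rgba[i * 4]; // red byte of pixel i
  }
  return gray;
}

// Compute a target size whose width does not exceed maxWidth.
// Assumption: the aspect ratio is preserved when downscaling.
function limitWidth(
  width: number,
  height: number,
  maxWidth: number = MAX_PPU_WIDTH
): { width: number; height: number } {
  if (width <= maxWidth) {
    return { width, height };
  }
  const scale = maxWidth / width;
  return { width: maxWidth, height: Math.max(1, Math.round(height * scale)) };
}
```

For example, under these assumptions a 1600×48 text line would be resized to 800×24 before recognition, while a 400×48 line would pass through unchanged.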

📦 Installation

```bash
npm install client-side-ocr@2.0.0
```

🚀 Quick Start

```typescript
import { createRapidOCREngine } from 'client-side-ocr';

const ocr = createRapidOCREngine({
  language: 'en', // or any of 100+ languages
  modelVersion: 'PP-OCRv4'
});

await ocr.initialize();
const result = await ocr.processImage(imageFile);
console.log(result.text);
```

🙏 Acknowledgments

Special thanks to the RapidOCR and PaddleOCR teams for their excellent models and to the ppu-paddle-ocr project for TypeScript implementation references.

📝 Full Changelog

See CHANGELOG.md for detailed changes.