AI-Powered Handwriting App with Real-Time Gesture Recognition
Demo • Features • Structure • Architecture • Get Started
Handwritten note-taking apps force constant interruptions: switching tools, tapping buttons, navigating menus. Each micro-interaction breaks focus and slows you down.
OnePen eliminates toolbar dependency through real-time gesture recognition. Draw naturally, and ML interprets your intent: grouping content, applying styles, or triggering app features instantly.
6 gesture modifiers recognized in real-time → instant formatting & actions
One pen. Zero interruptions. ~30% faster note-taking.
https://github.com/user-attachments/assets/a4335d94-51ff-4345-89c7-b58fddd72268
Try Live App → · Works offline after first load
```
OnePen/
├── app/                       # Progressive Web App (Frontend)
│   ├── index.html             # Main interface
│   ├── main.js                # Core app logic & gesture handling
│   ├── draw.js                # Canvas rendering engine
│   ├── predict.js             # TensorFlow.js inference
│   ├── saveNote.js            # IndexedDB persistence
│   ├── signin.js              # Google Drive authentication
│   ├── feedbackCollector.js   # Implicit feedback for data flywheel
│   ├── config.js              # App configuration
│   ├── sw.js                  # Service worker (offline support)
│   ├── tfjs/                  # Deployed TF.js model
│   └── icons/                 # PWA icons
│
├── src/modifiers/             # ML Training Pipeline
│   ├── models/
│   │   ├── architecture.py    # Hybrid CNN + geometric model
│   │   └── trainer.py         # Training loop with callbacks
│   ├── features/
│   │   └── geometric.py       # 12D feature extraction
│   ├── data/
│   │   ├── loader.py          # Dataset loading
│   │   ├── augmenter.py       # Data augmentation
│   │   └── preprocessor.py    # Image preprocessing
│   └── utils/                 # Logging, config utilities
│
├── scripts/                   # CLI Tools
│   ├── train.py               # Training with W&B tracking
│   ├── export.py              # TF.js model conversion
│   └── dataset.py             # Data preprocessing pipeline
│
├── data/
│   ├── raw/                   # Raw stroke data by contributor
│   └── processed/             # Preprocessed training data
│
├── models/tfjs/               # Exported browser-ready model
├── config/                    # Training hyperparameters (YAML)
├── tests/                     # pytest test suite
└── assets/                    # Documentation images
```
### Gesture Recognition
- **7 gesture types** recognized in real-time
- **Draw + Hold** opens a radial tool menu
- 1 delete gesture + 6 auto-style gestures + (6 gestures + hold) × 8 tools = **55 quick actions**
- Fully customizable gesture-to-action mapping

### Study Tools
- **Tape Flashcards** – cover keywords with decorative tape for active recall; tap to reveal. Each tape becomes a reviewable flashcard.
- **Auto-Summaries** – generate study sheets from highlights
- **Table of Contents** – auto-built from headings
- **Reminders** – time-based notifications

### Smart Notes
- **Sticky Notes** – floating annotations with a mini canvas
- **Embed Links** – clickable web previews
- **Math Solver** – handwritten → LaTeX → solved

### Export & Sync
- **Google Drive** auto-backup
- **PDF Export** with full fidelity
- **Offline-first** PWA architecture
- **Cross-device** via portable JSON
Hybrid CNN + Geometric Features – combining visual and numerical inputs for robust gesture classification.
```
Image Input (96×96)          Geometric Features (12D)
        │                            │
        ▼                            ▼
┌───────────────────┐       ┌─────────────────┐
│    MobileNetV3    │       │  Dense 128→64   │
│  + SE Attention   │       │  + LayerNorm    │
└─────────┬─────────┘       └────────┬────────┘
          │                          │
          └──────────────┬───────────┘
                         ▼
                ┌─────────────────┐
                │  Fusion Layer   │
                │    384 → 192    │
                └────────┬────────┘
                         ▼
                ┌─────────────────┐
                │    8 Classes    │
                │    (softmax)    │
                └─────────────────┘
```
Why hybrid? Image-only models confused similar gestures (box vs bracket). Adding geometric features improved accuracy by 5-8%.
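The fusion stage can be sketched numerically. Assuming the image branch emits a 320-d embedding and the geometric branch a 64-d one (inferred from the 384-unit fusion input minus the Dense 128→64 output; the real split may differ), a minimal NumPy illustration of the shapes flowing through the head:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu_dense(x, w, b):
    # A dense layer with ReLU, standing in for the trained Keras layers.
    return np.maximum(x @ w + b, 0.0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Stand-in embeddings (random weights; real ones come from training).
img_emb = rng.standard_normal(320)   # MobileNetV3 + SE branch output (assumed 320-d)
geo_emb = rng.standard_normal(64)    # Dense 128→64 geometric branch output

fused = np.concatenate([img_emb, geo_emb])                        # (384,)
hidden = relu_dense(fused, 0.05 * rng.standard_normal((384, 192)),
                    np.zeros(192))                                # (192,)
logits = hidden @ (0.05 * rng.standard_normal((192, 8)))          # (8,)
probs = softmax(logits)  # one probability per gesture class
```

The point of the sketch is the concatenation: both modalities contribute to every fusion unit, so a gesture ambiguous in pixels (box vs. bracket) can still be separated by its geometry.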
| Feature | Purpose |
|---|---|
| Closure ratio | Detects closed loops (boxes) |
| Aspect ratio | Distinguishes wide vs tall strokes |
| Path length | Identifies wavy/complex strokes |
| Verticality | Separates diagonal deletes from horizontal underlines |
| Point density | Distinguishes quick strokes from deliberate ones |
| + 7 more | Fine-grained disambiguation |
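Several of these features fall out of simple arithmetic on the raw stroke coordinates. An illustrative reimplementation of five of them (the actual 12-D extractor lives in `src/modifiers/features/geometric.py` and may define them differently):

```python
import numpy as np

def stroke_features(points: np.ndarray) -> dict:
    """Compute a few geometric features for an (N, 2) array of stroke points."""
    deltas = np.diff(points, axis=0)
    seg_len = np.linalg.norm(deltas, axis=1)
    path_length = float(seg_len.sum())
    width, height = np.ptp(points, axis=0)          # bounding-box extents
    end_gap = float(np.linalg.norm(points[-1] - points[0]))
    safe = max(path_length, 1e-6)
    return {
        # ~0 for closed loops (boxes), ~1 for straight strokes
        "closure_ratio": end_gap / safe,
        # wide vs. tall strokes
        "aspect_ratio": float(width) / max(float(height), 1e-6),
        "path_length": path_length,
        # fraction of travel that is vertical: separates diagonal
        # deletes from horizontal underlines
        "verticality": float(np.abs(deltas[:, 1]).sum()) / safe,
        # points per unit length: quick vs. deliberate strokes
        "point_density": len(points) / safe,
    }
```

For a closed unit square the closure ratio is ~0 and the aspect ratio is 1, while an open bracket of the same size leaves a large end gap, which is exactly the box-vs-bracket ambiguity the hybrid model resolves.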
| Metric | Value |
|---|---|
| Accuracy | 99.89% |
| Inference | ~20ms |
| Model Size | 2.5 MB |
| Classes | 8 gesture types |
```bash
# Clone repository
git clone https://github.com/AndyHuynh24/OnePen.git
cd OnePen

# Set up Python environment
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -r requirements.txt

# Run the app
cd app && python -m http.server 8000
# Open http://localhost:8000
```
```bash
# Preprocess data
python scripts/dataset.py

# Train (with W&B tracking)
python scripts/train.py --epochs 200 --backbone mobilenetv3_small

# Export to TensorFlow.js
python scripts/export.py --model outputs/run_*/stroke_classifier.keras --quantize uint16
```
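The dataset step turns raw stroke coordinates into the model's 96×96 image input. A simplified, hypothetical sketch of that rasterization idea (not the actual `scripts/dataset.py` logic, which also handles augmentation and normalization):

```python
import numpy as np

def rasterize_stroke(points, size=96):
    """Rasterize a stroke onto a size×size grid.

    Scales the stroke's bounding box into the image and marks every pixel
    along linearly interpolated segments between consecutive points.
    """
    pts = np.asarray(points, dtype=float)
    mins = pts.min(axis=0)
    span = np.maximum(pts.max(axis=0) - mins, 1e-6)
    scaled = (pts - mins) / span * (size - 1)    # map into [0, size-1]
    img = np.zeros((size, size), dtype=np.float32)
    for a, b in zip(scaled[:-1], scaled[1:]):
        # Enough interpolation steps to cover every pixel on the segment.
        n = int(max(np.abs(b - a).max(), 1)) + 1
        for t in np.linspace(0.0, 1.0, n):
            x, y = a + t * (b - a)
            img[int(round(y)), int(round(x))] = 1.0
    return img
```

Feeding `rasterize_stroke` the square from the gesture examples yields a binary 96×96 image with all four edges drawn, ready for the CNN branch.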
| Layer | Technologies |
|---|---|
| Frontend | JavaScript, HTML5 Canvas API, TensorFlow.js, IndexedDB, Web Workers |
| ML/AI | TensorFlow, Keras, MobileNetV3, Squeeze-and-Excitation Networks, Weights & Biases |
| Infrastructure | Progressive Web App (PWA), Service Workers, Firebase Hosting, Google Drive API |
| Quality | pytest, ruff, mypy, GitHub Actions CI/CD |
MIT
Built by Andy Huynh