# OnePen

**AI-Powered Handwriting App with Real-Time Gesture Recognition**

[Demo](#demo) • [Features](#features) • [Structure](#project-structure) • [Architecture](#architecture) • [Get Started](#getting-started)


## The Problem

Handwritten note-taking apps force constant interruptions: switching tools, tapping buttons, navigating menus. Each micro-interaction breaks focus and slows you down.

## The Solution

OnePen eliminates toolbar dependency through real-time gesture recognition. Draw naturally, and ML interprets your intent: grouping content, applying styles, or triggering app features instantly.

*Gesture modifiers: Box · Curly · Delete · Underline · Box Shortcut · Curly Shortcut · Circle Shortcut*

6 gesture modifiers recognized in real time → instant formatting & actions

One pen. Zero interruptions. ~30% faster note-taking.


## Demo

https://github.com/user-attachments/assets/a4335d94-51ff-4345-89c7-b58fddd72268

Try Live App → · Works offline after first load


## Project Structure

```text
OnePen/
├── app/                        # Progressive Web App (frontend)
│   ├── index.html              # Main interface
│   ├── main.js                 # Core app logic & gesture handling
│   ├── draw.js                 # Canvas rendering engine
│   ├── predict.js              # TensorFlow.js inference
│   ├── saveNote.js             # IndexedDB persistence
│   ├── signin.js               # Google Drive authentication
│   ├── feedbackCollector.js    # Implicit feedback for the data flywheel
│   ├── config.js               # App configuration
│   ├── sw.js                   # Service worker (offline support)
│   ├── tfjs/                   # Deployed TF.js model
│   └── icons/                  # PWA icons
│
├── src/modifiers/              # ML training pipeline
│   ├── models/
│   │   ├── architecture.py     # Hybrid CNN + geometric model
│   │   └── trainer.py          # Training loop with callbacks
│   ├── features/
│   │   └── geometric.py        # 12D feature extraction
│   ├── data/
│   │   ├── loader.py           # Dataset loading
│   │   ├── augmenter.py        # Data augmentation
│   │   └── preprocessor.py     # Image preprocessing
│   └── utils/                  # Logging, config utilities
│
├── scripts/                    # CLI tools
│   ├── train.py                # Training with W&B tracking
│   ├── export.py               # TF.js model conversion
│   └── dataset.py              # Data preprocessing pipeline
│
├── data/
│   ├── raw/                    # Raw stroke data by contributor
│   └── processed/              # Preprocessed training data
│
├── models/tfjs/                # Exported browser-ready model
├── config/                     # Training hyperparameters (YAML)
├── tests/                      # pytest test suite
└── assets/                     # Documentation images
```

## Features

### Gesture Recognition

- **7 gesture types** recognized in real time
- **Draw + Hold** opens a radial tool menu
- 1 delete gesture + 6 auto-style gestures + (6 gestures + hold) × 8 tools = **55 quick actions**
- Fully customizable gesture-to-action mapping

### Study Tools

- **Tape Flashcards**: Cover keywords with decorative tape for active recall; tap to reveal. Each tape becomes a reviewable flashcard.
- **Auto-Summaries**: Generate study sheets from highlights
- **Table of Contents**: Auto-built from headings
- **Reminders**: Time-based notifications

### Smart Notes

- **Sticky Notes**: Floating annotations with a mini canvas
- **Embed Links**: Clickable web previews
- **Math Solver**: Handwritten → LaTeX → solved

### Export & Sync

- **Google Drive** auto-backup
- **PDF Export** with full fidelity
- **Offline-first** PWA architecture
- **Cross-device** via portable JSON

## Architecture

### System Overview

*System architecture diagram*

### Model Architecture

**Hybrid CNN + Geometric Features**: combining visual and numerical inputs for robust gesture classification.

```text
        Image Input (96×96)              Geometric Features (12D)
              │                                   │
              ▼                                   ▼
    ┌───────────────────┐               ┌─────────────────┐
    │   MobileNetV3     │               │   Dense 128→64  │
    │  + SE Attention   │               │  + LayerNorm    │
    └─────────┬─────────┘               └────────┬────────┘
              │                                  │
              └──────────────┬───────────────────┘
                             ▼
                    ┌─────────────────┐
                    │  Fusion Layer   │
                    │   384 → 192     │
                    └────────┬────────┘
                             ▼
                    ┌─────────────────┐
                    │   8 Classes     │
                    │   (softmax)     │
                    └─────────────────┘
```

Sample Strokes by Class

**Why hybrid?** Image-only models confused visually similar gestures (box vs. bracket). Adding geometric features improved accuracy by 5-8%.
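The fusion stage can be sketched numerically in a few lines of NumPy. This is an illustrative toy, not the project's code: the 320-d image embedding size is an assumption chosen so that 320 + 64 = 384 matches the fusion input in the diagram, and the weights are random stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Assumed branch outputs: 320-d image embedding + 64-d geometric embedding = 384-d.
img_embed = rng.normal(size=320)  # stand-in for the MobileNetV3 + SE output
geo_embed = rng.normal(size=64)   # stand-in for the Dense 128→64 branch output

# Fusion: concatenate, project 384 → 192 with ReLU, then classify into 8 classes.
W_fuse = rng.normal(size=(384, 192)) * 0.05
W_out = rng.normal(size=(192, 8)) * 0.05

fused = np.concatenate([img_embed, geo_embed])  # (384,)
hidden = np.maximum(W_fuse.T @ fused, 0.0)      # (192,)
probs = softmax(W_out.T @ hidden)               # (8,) class probabilities

print(probs.shape, round(float(probs.sum()), 6))
```

The point of the exercise is the shape bookkeeping: both branches collapse into one vector before a single softmax head, which is why the browser model stays small.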

### Geometric Features

| Feature | Purpose |
| --- | --- |
| Closure ratio | Detects closed loops (boxes) |
| Aspect ratio | Distinguishes wide vs. tall strokes |
| Path length | Identifies wavy/complex strokes |
| Verticality | Separates diagonal deletes from horizontal underlines |
| Point density | Distinguishes quick strokes from deliberate ones |
| + 7 more | Fine-grained disambiguation |
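The first few features in the table are straightforward to compute from a stroke's (x, y) points. A minimal sketch with plausible definitions; the exact formulas in `geometric.py` may differ, and `stroke_features` is a hypothetical helper name:

```python
import numpy as np

def stroke_features(points):
    """Compute a few illustrative geometric features from an (N, 2) stroke."""
    pts = np.asarray(points, dtype=float)
    seg = np.diff(pts, axis=0)                 # per-segment displacements
    path_length = np.linalg.norm(seg, axis=1).sum()

    width = np.ptp(pts[:, 0]) or 1e-9          # bounding-box extents
    height = np.ptp(pts[:, 1]) or 1e-9
    diag = np.hypot(width, height)

    return {
        # Start-to-end gap relative to path length: near 0 for closed loops (boxes).
        "closure_ratio": np.linalg.norm(pts[-1] - pts[0]) / (path_length or 1e-9),
        # Wide vs. tall strokes (e.g. underline vs. vertical bracket).
        "aspect_ratio": width / height,
        # Path length relative to the bounding-box diagonal: high for wavy strokes.
        "path_length_norm": path_length / diag,
        # Share of vertical movement: separates diagonal deletes from underlines.
        "verticality": np.abs(seg[:, 1]).sum() / (np.abs(seg).sum() or 1e-9),
        # Sample points per unit of path: dense for slow, deliberate strokes.
        "point_density": len(pts) / (path_length or 1e-9),
    }

# A closed unit square: closure ratio ≈ 0, aspect ratio ≈ 1.
f = stroke_features([(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)])
print(round(f["closure_ratio"], 3), round(f["aspect_ratio"], 3))
```

Features like these are cheap scalar summaries, which is why they can ride alongside the CNN branch without affecting inference latency.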

### Performance

| Metric | Value |
| --- | --- |
| Accuracy | 99.89% |
| Inference | ~20 ms |
| Model Size | 2.5 MB |
| Classes | 8 gesture types |

## Getting Started

### Prerequisites

### Installation

```bash
# Clone repository
git clone https://github.com/AndyHuynh24/OnePen.git
cd OnePen

# Set up Python environment
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -r requirements.txt

# Run the app
cd app && python -m http.server 8000
# Open http://localhost:8000
```

### Train Your Own Model

```bash
# Preprocess data
python scripts/dataset.py

# Train (with W&B tracking)
python scripts/train.py --epochs 200 --backbone mobilenetv3_small

# Export to TensorFlow.js
python scripts/export.py --model outputs/run_*/stroke_classifier.keras --quantize uint16
```
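Before training, raw strokes have to be rasterized into the 96×96 images the model consumes. A hedged sketch of that step; the actual logic in `preprocessor.py` may differ, and `rasterize` is a hypothetical helper, not the project's API:

```python
import numpy as np

def rasterize(points, size=96, margin=4):
    """Render an (N, 2) stroke into a size×size binary image (illustrative only)."""
    pts = np.asarray(points, dtype=float)
    lo, hi = pts.min(axis=0), pts.max(axis=0)
    span = np.maximum(hi - lo, 1e-9)
    # Scale into the canvas uniformly (preserving aspect ratio), leaving a margin.
    scale = (size - 2 * margin) / span.max()
    pts = (pts - lo) * scale + margin

    img = np.zeros((size, size), dtype=np.uint8)
    # Draw each segment by dense interpolation (a poor man's line rasterizer).
    for a, b in zip(pts[:-1], pts[1:]):
        n = max(2, int(np.linalg.norm(b - a)) * 2)
        for t in np.linspace(0.0, 1.0, n):
            x, y = a + t * (b - a)
            img[int(round(y)), int(round(x))] = 1
    return img

# Rasterize a closed box gesture; the result feeds the CNN branch.
img = rasterize([(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)])
print(img.shape, bool(img.sum() > 0))
```

A real pipeline would typically also anti-alias and thicken the stroke, but the normalization idea, translating and scaling every gesture into a fixed canvas, is the part that matters for training stability.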

## Tech Stack

| Layer | Technologies |
| --- | --- |
| Frontend | JavaScript, HTML5 Canvas API, TensorFlow.js, IndexedDB, Web Workers |
| ML/AI | TensorFlow, Keras, MobileNetV3, Squeeze-and-Excitation Networks, Weights & Biases |
| Infrastructure | Progressive Web App (PWA), Service Workers, Firebase Hosting, Google Drive API |
| Quality | pytest, ruff, mypy, GitHub Actions CI/CD |

## Key Learnings (This Project)


## License

MIT


Built by Andy Huynh