
charanpool/user_identification_based_on_keystrokes


⌨️ Keystroke Dynamics Biometric Authentication

Python 3.10+ Streamlit scikit-learn License: MIT

A modern ML-powered biometric authentication system that identifies users based on their unique typing patterns. Your keystroke dynamics (how long you hold keys, the rhythm between keystrokes) create a behavioral biometric signature as unique as your fingerprint.


✨ Features

  • 🔒 Behavioral Biometrics: Authenticate users based on how they type, not what they type
  • 🤖 ML-Powered: Random Forest classifier with confidence scoring
  • 📊 Real-time Analytics: Visualize typing patterns with interactive Plotly charts
  • 🎨 Modern UI: Sleek dark-themed Streamlit interface
  • ⚡ Fast Training: Quick model training with minimal samples

🧠 How It Works

The system captures and analyzes four key typing characteristics:

flowchart LR
    subgraph capture [Keystroke Capture]
        A[Key Press] --> B[Key Release]
        B --> C[Timing Data]
    end
    
    subgraph features [Feature Extraction]
        D[Dwell Time]
        E[Flight Time]
        F[Digraph Latency]
        G[Trigraph Latency]
    end
    
    subgraph ml [ML Classification]
        H[Random Forest]
        I[User Prediction]
        J[Confidence Score]
    end
    
    C --> D & E & F & G
    D & E & F & G --> H
    H --> I & J

Keystroke Features

| Feature | Description | Example |
| --- | --- | --- |
| Dwell Time | How long a key is held down | Hold time for 'e' = 85 ms |
| Flight Time | Time between releasing one key and pressing the next | Gap between 't' and 'h' = 120 ms |
| Digraph Latency | Time to type common two-letter sequences | 'th' takes 210 ms total |
| Trigraph Latency | Time to type common three-letter sequences | 'the' takes 350 ms total |
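
As a rough illustration of how these four timings could be derived from raw key events, the sketch below assumes each keystroke is stored as a key with press and release timestamps in milliseconds. The `KeyEvent` type and the function names are hypothetical and are not taken from `src/capture.py` or `src/features.py`. N-graph latency is measured here from the first key press to the last key release, which is one common definition and matches the "total" wording in the table; the project may define it differently.

```python
from typing import NamedTuple


class KeyEvent(NamedTuple):
    """One keystroke (hypothetical record layout, times in milliseconds)."""
    key: str
    press_ms: int
    release_ms: int


def dwell_times(events):
    """Hold time per key: release minus press."""
    return [(e.key, e.release_ms - e.press_ms) for e in events]


def flight_times(events):
    """Gap between releasing one key and pressing the next."""
    return [
        (a.key + b.key, b.press_ms - a.release_ms)
        for a, b in zip(events, events[1:])
    ]


def ngraph_latencies(events, n):
    """Time from the first press to the last release of each n-key window."""
    result = []
    for i in range(len(events) - n + 1):
        window = events[i:i + n]
        label = "".join(e.key for e in window)
        result.append((label, window[-1].release_ms - window[0].press_ms))
    return result


# Timings chosen to reproduce the example values in the table above.
events = [
    KeyEvent("t", 0, 40),
    KeyEvent("h", 160, 210),
    KeyEvent("e", 265, 350),
]

dwell = dwell_times(events)              # includes ("e", 85)
flight = flight_times(events)            # includes ("th", 120)
digraphs = ngraph_latencies(events, 2)   # includes ("th", 210)
trigraphs = ngraph_latencies(events, 3)  # [("the", 350)]
```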

🚀 Quick Start

Prerequisites

  • Python 3.10 or higher
  • pip package manager

Installation

# Clone the repository
git clone https://github.com/yourusername/keystroke-biometrics.git
cd keystroke-biometrics

# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Running the Application

streamlit run app.py

The app will open in your browser at http://localhost:8501

📖 Usage

1. Register Users

  1. Navigate to the Register page
  2. Enter a unique username
  3. Type 3 sample paragraphs to train your profile
  4. System learns your unique typing patterns

2. Authenticate

  1. Navigate to the Authenticate page
  2. Type the displayed paragraph naturally
  3. Click "Authenticate" to identify yourself
  4. View confidence score and probability distribution
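
The last two steps can be pictured as turning a per-user probability distribution into a decision. The sketch below is only an assumption about how that might work: the `decide` function and the `threshold` value are hypothetical, and this README does not say whether the app applies a rejection threshold at all.

```python
def decide(probabilities, threshold=0.6):
    """Pick the most likely user and accept only above a confidence threshold.

    `probabilities` maps username -> predicted probability.
    `threshold` is an illustrative value, not one taken from the project.
    """
    user, confidence = max(probabilities.items(), key=lambda item: item[1])
    accepted = confidence >= threshold
    return {
        "user": user if accepted else None,
        "confidence": confidence,
        "accepted": accepted,
    }


clear_match = decide({"alice": 0.7, "bob": 0.3})
ambiguous = decide({"alice": 0.55, "bob": 0.45})
```

A threshold like this trades false accepts against false rejects, so any real value would need tuning on collected samples.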

3. Analytics

  1. Navigate to the Analytics page
  2. Select a user to view their typing profile
  3. Explore dwell times, digraph patterns, and feature importance

πŸ—οΈ Project Structure

keystroke-biometrics/
β”œβ”€β”€ app.py                 # Streamlit main application
β”œβ”€β”€ src/
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ capture.py         # Keystroke timing capture
β”‚   β”œβ”€β”€ features.py        # Feature extraction engine
β”‚   β”œβ”€β”€ model.py           # ML model (Random Forest)
β”‚   └── utils.py           # Helper functions
β”œβ”€β”€ data/
β”‚   └── users.json         # User keystroke profiles
β”œβ”€β”€ models/
β”‚   └── keystroke_model.joblib  # Trained model (generated)
β”œβ”€β”€ requirements.txt
└── README.md

🔬 Technical Details

Feature Vector (28 dimensions)

| Category | Features | Count |
| --- | --- | --- |
| Dwell Times | e, a, r, i, o, t, n, s, h, l, d, g, space | 13 |
| Digraph Latencies | in, th, ti, on, an, he, al, er, es | 9 |
| Trigraph Latencies | the, and, are, ion, ing | 5 |
| Typing Speed | chars/second | 1 |
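
To make the 13 + 9 + 5 + 1 = 28 layout concrete, here is a minimal sketch of assembling such a vector in a fixed order. The function names, the `dwell_`/`digraph_`/`trigraph_` name prefixes, and the choice to fill unseen keys or sequences with `0.0` are all assumptions for illustration; the actual logic in `src/features.py` may differ.

```python
DWELL_KEYS = ["e", "a", "r", "i", "o", "t", "n", "s", "h", "l", "d", "g", "space"]
DIGRAPHS = ["in", "th", "ti", "on", "an", "he", "al", "er", "es"]
TRIGRAPHS = ["the", "and", "are", "ion", "ing"]


def build_feature_names():
    """Names in the same order the vector is built (hypothetical naming)."""
    return (
        [f"dwell_{k}" for k in DWELL_KEYS]
        + [f"digraph_{d}" for d in DIGRAPHS]
        + [f"trigraph_{t}" for t in TRIGRAPHS]
        + ["typing_speed"]
    )


def build_vector(dwell, digraph, trigraph, chars_per_second, missing=0.0):
    """Flatten per-key averages into one fixed-length list.

    Each dict maps a key or sequence to its mean time in milliseconds.
    Entries the user never typed fall back to `missing` (an assumption).
    """
    return (
        [dwell.get(k, missing) for k in DWELL_KEYS]
        + [digraph.get(d, missing) for d in DIGRAPHS]
        + [trigraph.get(t, missing) for t in TRIGRAPHS]
        + [chars_per_second]
    )


names = build_feature_names()
vector = build_vector(
    dwell={"e": 85.0, "space": 70.0},
    digraph={"th": 210.0},
    trigraph={"the": 350.0},
    chars_per_second=4.5,
)
```

Keeping the order fixed matters because the classifier only sees positions, not names.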

Machine Learning Pipeline

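from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler

# X is the feature matrix, shape (n_samples, 28)
# Feature scaling for normalization
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Random Forest with balanced class weights
classifier = RandomForestClassifier(
    n_estimators=100,
    max_depth=10,
    class_weight='balanced',
    random_state=42
)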
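
For context, the following sketch runs the same scaler and classifier settings end to end on synthetic data. The two users, their made-up timing distributions, and the use of the highest class probability as the "confidence score" are assumptions for illustration, not the project's confirmed behavior.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in data: 3 samples per user, 28 features each.
X_alice = rng.normal(loc=90.0, scale=5.0, size=(3, 28))
X_bob = rng.normal(loc=140.0, scale=5.0, size=(3, 28))
X = np.vstack([X_alice, X_bob])
y = np.array(["alice"] * 3 + ["bob"] * 3)

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

classifier = RandomForestClassifier(
    n_estimators=100,
    max_depth=10,
    class_weight="balanced",
    random_state=42,
)
classifier.fit(X_scaled, y)

# A new typing attempt drawn from the "alice" distribution.
attempt = rng.normal(loc=90.0, scale=5.0, size=(1, 28))

# Reuse the fitted scaler; do not refit it on the attempt.
proba = classifier.predict_proba(scaler.transform(attempt))[0]
predicted_user = classifier.classes_[np.argmax(proba)]
confidence = float(proba.max())  # assumed definition of confidence
```

`predict_proba` returns one probability per entry in `classifier.classes_`, which is what a probability distribution chart would plot.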

📊 Performance

| Metric | Value |
| --- | --- |
| Cross-validation Accuracy | ~85-95%* |
| Minimum Users | 2 |
| Samples per User | 3 recommended |
| Feature Dimensions | 28 |

*Accuracy depends on number of users and sample quality
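
One way a cross-validation accuracy of this kind could be estimated is sketched below with stratified folds on synthetic data. The data, the fold count, and the use of a scikit-learn pipeline are assumptions; the sketch does not reproduce the figures in the table, and the README does not describe the project's actual evaluation procedure.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_users, samples_per_user, n_features = 2, 3, 28

# Synthetic, well-separated users; real typing data will overlap far more.
X = np.vstack([
    rng.normal(loc=90.0 + 50.0 * u, scale=5.0, size=(samples_per_user, n_features))
    for u in range(n_users)
])
y = np.repeat([f"user_{u}" for u in range(n_users)], samples_per_user)

# Putting the scaler inside the pipeline refits it on each training fold,
# so held-out samples never influence the scaling statistics.
model = make_pipeline(
    StandardScaler(),
    RandomForestClassifier(
        n_estimators=100,
        max_depth=10,
        class_weight="balanced",
        random_state=42,
    ),
)

# With 3 samples per user, 3 stratified folds hold out one sample per user.
cv = StratifiedKFold(n_splits=samples_per_user, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv)
mean_accuracy = float(scores.mean())
```

With only a handful of samples per user, each fold score is coarse, so the mean should be read as a rough indication rather than a precise estimate.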

πŸ› οΈ Development

Running Tests

python -m pytest tests/

Adding New Features

  1. Add feature extraction in src/features.py
  2. Update get_feature_names() and to_vector() methods
  3. Retrain models with new feature set
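
When adding a feature, the names and the vector have to stay in step. The sketch below shows a small consistency check that could catch a mismatch; `ToyFeatures`, its fields, and `check_consistency` are hypothetical stand-ins, and only the method names `get_feature_names()` and `to_vector()` come from the steps above.

```python
from dataclasses import dataclass


@dataclass
class ToyFeatures:
    """Toy stand-in for a feature container (not the project's real class)."""
    dwell_e_ms: float
    typing_speed_cps: float
    mean_flight_ms: float  # the newly added feature in this example

    @staticmethod
    def get_feature_names():
        # Step 2: the new name is appended here...
        return ["dwell_e", "typing_speed", "mean_flight"]

    def to_vector(self):
        # ...and the new value is appended in the same position here.
        return [self.dwell_e_ms, self.typing_speed_cps, self.mean_flight_ms]


def check_consistency(features):
    """Raise if names and values are out of step; return them paired up."""
    names = features.get_feature_names()
    vector = features.to_vector()
    if len(names) != len(vector):
        raise ValueError(f"{len(names)} names but {len(vector)} values")
    if len(set(names)) != len(names):
        raise ValueError("duplicate feature names")
    return dict(zip(names, vector))


paired = check_consistency(ToyFeatures(85.0, 4.5, 120.0))
```

Because a changed vector length invalidates any previously saved model, retraining (step 3) is required after such a change.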

📸 Screenshots

[Screenshots: Registration, Authentication, Sample Collection, Analytics Dashboard]

📖 Documentation

| Document | Description |
| --- | --- |
| CONTRIBUTING.md | How to contribute |
| ROADMAP.md | Future enhancement plans |
| LICENSE | MIT License |

🤝 Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

πŸ“ License

This project is licensed under the MIT License - see the LICENSE file for details.

👤 Author

Charan


Made with ❤️ and ⌨️

About

A modern ML-powered biometric authentication system that identifies users based on their unique typing patterns. Built with Streamlit, scikit-learn, and Material Design 3.
