AI-powered vision assistant for visually impaired users
Voice-activated object recognition with real-time camera analysis and haptic obstacle feedback
Traditional vision assistance apps often require constant interaction and aren’t optimized for quick, one-sentence responses or physical proximity awareness.
That’s why we built HATSEYE: a voice-activated assistant that combines AI-powered vision with ultrasonic sensing + rumble motor haptics to help visually impaired users understand and navigate their environment.
HATSEYE integrates voice recognition, real-time camera capture, the Google Gemini Vision API, and Arduino-based hardware feedback. Say “hey hatseye” to activate, ask a question naturally, get a clear one-sentence spoken response, and feel haptic feedback that indicates how close obstacles are.
- Voice activation with a natural wake word (“hey hatseye”)
- Real-time camera analysis for object and scene understanding
- AI-powered responses via Google Gemini Vision API
- One-sentence answers designed for speed and clarity
- Text-to-speech output for hands-free use
- Hardware obstacle feedback using ultrasonic sensors + rumble motors (haptics)
- Arduino + serial integration for live sensor/motor data
- Simple web interface for camera preview and system status
Voice + Vision Pipeline
Voice Input → Wake Word Detection → Question Transcription → Camera Frame Capture → Gemini Vision → One-Sentence Answer → Text-to-Speech → Audio Response
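The "One-Sentence Answer" step can be enforced on the backend by trimming whatever the model returns to its first sentence before it reaches text-to-speech. A minimal sketch (the function name and trimming rule are illustrative assumptions, not HATSEYE's actual code):

```python
import re

def first_sentence(text: str) -> str:
    """Trim a model reply to a single sentence, since HATSEYE speaks
    only one-sentence answers for speed and clarity."""
    text = text.strip()
    # Keep everything up to the first sentence-ending punctuation
    # that is followed by whitespace or end-of-string.
    match = re.match(r"(.+?[.!?])(\s|$)", text)
    return match.group(1) if match else text

first_sentence("The door is open. It leads outside.")  # → "The door is open."
```

Replies with no terminal punctuation pass through unchanged, so a terse fragment like "A red chair" is still spoken.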
Haptic Pipeline
Ultrasonic Sensors → Arduino → Serial Data → Backend Processing → Rumble Motor Feedback
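The "Backend Processing" stage above boils down to mapping a distance reading to a motor drive level. A sketch of one such mapping, assuming an 8-bit PWM duty cycle and illustrative near/far thresholds (the actual thresholds and curve in HATSEYE may differ):

```python
def rumble_intensity(distance_cm: float,
                     near_cm: float = 20.0,
                     far_cm: float = 150.0) -> int:
    """Map an ultrasonic distance reading to a rumble-motor PWM duty
    cycle (0-255): full rumble at or inside near_cm, off beyond far_cm,
    linear in between. Thresholds are illustrative assumptions."""
    if distance_cm <= near_cm:
        return 255
    if distance_cm >= far_cm:
        return 0
    scale = (far_cm - distance_cm) / (far_cm - near_cm)
    return int(round(255 * scale))
```

A linear ramp is the simplest choice; a nonlinear curve that rises faster at close range is a common refinement for proximity haptics.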
| Category | Technologies |
|---|---|
| Frontend | HTML, CSS, JavaScript, Web Speech API |
| Backend | Python, Flask, OpenCV |
| AI | Google Gemini Vision API |
| Text-to-Speech | ElevenLabs |
| Hardware | Arduino, Ultrasonic Sensors, Rumble Motors |
| Communication | Serial (PySerial) |
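On the communication side, the Arduino typically streams sensor readings as short text lines over serial, which the Python backend (via PySerial) parses into numbers. A sketch of such a parser, assuming a hypothetical `NAME:VALUE,NAME:VALUE` line format; HATSEYE's actual wire format may differ:

```python
def parse_sensor_line(line: str) -> dict[str, float]:
    """Parse one serial line from the Arduino, e.g. 'L:42,R:130',
    into a dict of sensor name -> distance in cm. The line format
    here is an assumption for illustration."""
    readings: dict[str, float] = {}
    for field in line.strip().split(","):
        name, _, value = field.partition(":")
        readings[name.strip()] = float(value)
    return readings
```

In the running system, each parsed reading would feed the proximity-to-rumble mapping and be written back to the Arduino as a motor command.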
1. User says “hey hatseye” to activate.
2. HATSEYE listens for a question about the current scene.
3. The system captures a camera frame in real time.
4. The image is sent to Gemini Vision for analysis.
5. Gemini returns a single-sentence response.
6. The response is spoken aloud via text-to-speech.
7. Meanwhile, ultrasonic sensors measure distance to obstacles.
8. Rumble motors provide haptic feedback based on proximity.
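The voice-to-speech loop above can be sketched as a single handler. The three callables stand in for the OpenCV frame grab, the Gemini Vision request, and the ElevenLabs TTS call; injecting them is an illustrative design choice (not HATSEYE's actual structure) that keeps the flow testable without a camera or API keys:

```python
from typing import Callable, Optional

WAKE_WORD = "hey hatseye"

def handle_utterance(transcript: str,
                     capture_frame: Callable[[], bytes],
                     ask_vision: Callable[[bytes, str], str],
                     speak: Callable[[str], None]) -> Optional[str]:
    """One pass through the HATSEYE loop: detect the wake word,
    extract the question, capture a frame, ask the vision model,
    and speak the answer. Returns the answer, or None if the
    wake word was absent."""
    lowered = transcript.lower()
    idx = lowered.find(WAKE_WORD)
    if idx == -1:
        return None                          # not addressed to HATSEYE
    question = transcript[idx + len(WAKE_WORD):].lstrip(" ,.!?")
    frame = capture_frame()                  # grab the current camera frame
    answer = ask_vision(frame, question)     # one-sentence scene answer
    speak(answer)                            # read it aloud
    return answer
```

With the hardware- and API-facing pieces injected, the wake-word and sequencing logic can be exercised with plain fakes in a unit test.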
- Mobile app support
- Offline/on-device processing for privacy and lower latency
- Multi-language support
- Improved haptic patterns and wearable form factors
- More customization for response style and feedback strength
| Member |
|---|
| Ryan Gao |
| Ethan Yang |
- Devpost submission: https://devpost.com/software/hatseye