
SightSync 👀🗣🧠

Welcome to the SightSync repository - a platform dedicated to assisting visually impaired individuals by delivering natural, accurate verbal descriptions of their surroundings. This project was developed with lots of ❤️ by Oriol and Ferran for LauzHack2023 🏆

📷 What is SightSync?

SightSync is an application that uses open-source models 💡 to help visually impaired individuals navigate their environment with greater ease and autonomy. It essentially acts as their 'eyes', turning visual data into spoken descriptions delivered through auditory channels.

🛠 How do we do that?

We've built SightSync using the following state-of-the-art open-source models:

  1. Zephyr: For understanding the user's request. This LLM lets us make sense of what the user is asking for, an essential feature for our cause 🔍

  2. Distil-whisper: To convert spoken language into text (speech-to-text, STT) 🎤

  3. FastPitch: To convert text descriptions into voice (text-to-speech, TTS) 🎧

  4. GroundingDino: To provide item location details in a scene, adding another layer of detail to our descriptive capabilities 📍

  5. CogVLM: To generate accurate, context-aware descriptions of the surroundings, allowing us to craft immersive auditory experiences that accurately reflect an individual's environment 🌍
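The flow above can be sketched as a thin orchestration layer. This is an illustrative sketch, not the actual SightSync code: the class name, method signatures, and the way the stages are merged are all assumptions. Each stage is passed in as a plain callable, so any backend (for example, on-prem model servers) can be plugged in without changing the orchestration logic.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class SightSyncPipeline:
    """Wires the five models into one audio-in, audio-out loop (illustrative sketch)."""
    stt: Callable[[bytes], str]          # Distil-whisper: user's speech -> text
    llm: Callable[[str], str]            # Zephyr: interpret what the user is asking for
    detect: Callable[[bytes, str], str]  # GroundingDino: locate requested items in the frame
    caption: Callable[[bytes], str]      # CogVLM: context-aware scene description
    tts: Callable[[str], bytes]          # FastPitch: final answer -> speech

    def describe(self, audio: bytes, frame: bytes) -> bytes:
        request = self.llm(self.stt(audio))      # what does the user want to know?
        scene = self.caption(frame)              # describe the surroundings
        locations = self.detect(frame, request)  # where are the requested items?
        answer = f"{scene} {locations}"          # merge into one verbal answer
        return self.tts(answer)                  # speak it back to the user
```

With stub callables in place of the real models, the same `describe` call exercises the whole loop, which also makes the orchestration easy to test without loading any weights.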

Please note, all models are hosted on-prem 🏭

πŸ’ Contribute

As an open-source project, we welcome anyone who would like to contribute. We believe that every contribution, no matter how small, can make a big difference!

📬 Contact

If you have a feature request, bug report, or just want to chat, don't hesitate to get in touch with us:

Oriol Agost - [email protected]

Ferran Aran - [email protected]

Thank you for your interest in SightSync. We're excited to see where we can go together! 🌟