This project is a Python-based application that seamlessly translates and generates audio content in real time, just like the world-renowned Babel Fish. It listens to incoming audio, detects speech, identifies the language, transcribes the speech into text, translates the text, and generates audio in the target language.
- Two Options: One Microphone or Two Microphones. To use Linux's default microphone, use LiveTranslationOneMic.py; to use two specific microphones, use LiveTranslationTwoMic.py.
- Real-time Audio Processing: The application continuously listens to audio input and processes it in real time.
- Speech Detection: Utilizes the SpeechRecognition library to detect speech and distinguish spoken content from silence.
- Language Detection: Employs OpenAI Whisper to automatically detect the language of the incoming audio, ensuring accurate language identification.
- Language Transcription: Employs OpenAI Whisper to transcribe incoming audio.
- Multilingual Translation: Translates the detected speech to the target language using Lingua, providing clear and effective communication across language barriers.
- Audio Generation: Uses Bark, a text-to-speech synthesis system, to generate audio content based on the translated text.
- Language Mapping: Comprehensive dictionaries and language codes enable language mapping for translation and audio generation.
- Extensible and Customizable: The project's modular design allows for customization and extension to support additional languages and features.
- Python 3.x
- Required Python packages and dependencies (specified in requirements.txt)
- Clone this repository to your local machine.
- Install the necessary dependencies by running the following command:

  pip install -r requirements.txt

- Ensure that your machine meets the hardware requirements for running Bark for audio generation.
-
To start the application, run the following command:
python3 LiveTranslationTwoMic.py --whisper_model [model_size] --energy_threshold [threshold] --record_timeout [timeout] --phrase_timeout [timeout]
-
--whisper_model: Specify the Whisper model size (choices: tiny, base, small, medium, large). The default is medium. -
--energy_threshold: Set the energy threshold for microphone detection. -
--record_timeout: Define the real-time recording duration in seconds. -
--phrase_timeout: Set the time gap between recordings to consider it a new line in the transcription.
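Parsing for the flags above might look like the following sketch. Only the medium default for --whisper_model comes from this README; the numeric defaults for the other flags are illustrative assumptions, not the project's values:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Live speech translation")
    parser.add_argument("--whisper_model", default="medium",
                        choices=["tiny", "base", "small", "medium", "large"],
                        help="Whisper model size")
    # The numeric defaults below are illustrative, not the project's values.
    parser.add_argument("--energy_threshold", type=int, default=1000,
                        help="Energy threshold for microphone detection")
    parser.add_argument("--record_timeout", type=float, default=2.0,
                        help="Real-time recording duration in seconds")
    parser.add_argument("--phrase_timeout", type=float, default=3.0,
                        help="Silence gap that starts a new transcript line")
    return parser

args = build_parser().parse_args(
    ["--whisper_model", "small", "--phrase_timeout", "5"])
```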
- Specify your desired microphones:

  # Create the microphone instances
  source1 = sr.Microphone(sample_rate=16000, device_index=microphone_index1)
  source2 = sr.Microphone(sample_rate=16000, device_index=microphone_index2)
- If you are running the script on a machine whose GPU has less than 12-15 GB of VRAM, enable Bark to run on the CPU with its small models (this is the default in the current script):

  os.environ["SUNO_OFFLOAD_CPU"] = "True"
  os.environ["SUNO_USE_SMALL_MODELS"] = "True"

  Otherwise, enable Bark to run on the GPU with its full models:

  os.environ["SUNO_OFFLOAD_CPU"] = "False"
  os.environ["SUNO_USE_SMALL_MODELS"] = "False"
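Bark reads these environment variables when it is imported, so they must be set before the import. The VRAM-based switch below is an illustrative sketch (the threshold and the helper itself are assumptions; the scripts simply hard-code the two variables):

```python
import os

def configure_bark(vram_gb: float, cpu_threshold_gb: float = 12.0) -> dict:
    """Choose Bark's CPU-offload/small-model settings from available VRAM.

    The 12 GB threshold and the dict return shape are illustrative;
    the project's scripts hard-code the two environment variables instead.
    """
    use_cpu = vram_gb < cpu_threshold_gb
    settings = {
        "SUNO_OFFLOAD_CPU": str(use_cpu),
        "SUNO_USE_SMALL_MODELS": str(use_cpu),
    }
    # Must happen before `from bark import generate_audio, preload_models`.
    os.environ.update(settings)
    return settings
```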
- The application will continuously listen for audio input from the specified microphone sources.
- When speech is detected, the system automatically identifies the language and provides real-time translation.
- The transcribed text is displayed on the console, along with the detected language and its translation.
- If the detected language supports audio generation, the application will generate and play audio based on the translated text.
- To start the application, run the following command:

  python3 LiveTranslationOneMic.py --whisper_model [model_size] --energy_threshold [threshold] --record_timeout [timeout] --phrase_timeout [timeout]

- --whisper_model: Specify the Whisper model size (choices: tiny, base, small, medium, large). The default is medium.
- --energy_threshold: Set the energy threshold for microphone detection.
- --record_timeout: Define the real-time recording duration in seconds.
- --phrase_timeout: Set the time gap between recordings, in seconds, after which the next result is treated as a new line in the transcription.
- If you are running the script on a machine with a small GPU, enable Bark to run on the CPU with its small models (this is the default in the current script):

  os.environ["SUNO_OFFLOAD_CPU"] = "True"
  os.environ["SUNO_USE_SMALL_MODELS"] = "True"
- The application will continuously listen for audio input from the specified microphone source.
- When speech is detected, the system automatically identifies the language and provides real-time translation.
- The transcribed text is displayed on the console, along with the detected language and its translation.
- If the detected language supports audio generation, the application will generate and play audio based on the translated text.
This project utilizes various open-source libraries and models, including Lingua, OpenAI Whisper, Hugging Face Transformers, Bark, SpeechRecognition, and more. I appreciate the contributions of these projects to the development of this application.
This project is open-source and released under the MIT License.
Philip-David Medows
For inquiries or feedback related to this project, please contact [email protected]


