This application transcribes speech from a video or audio file (single file or whole folder), optionally translates it to a target language, and generates SRT subtitle files. It uses OpenAI Whisper for transcription and Hugging Face Transformers for translation.
- Clone the repository:

  ```bash
  git clone https://github.com/yourusername/video_translator.git
  cd video_translator
  ```
- Install Conda (if not already installed): see the Miniconda download page.
- Create and activate the environment:

  ```bash
  conda env create -f environment.yml
  conda activate video_translator_env
  ```

  Note: ensure `ffmpeg` is listed in `environment.yml` so that `ffmpeg` and `ffprobe` are installed into the environment:

  ```yaml
  dependencies:
    - python=3.9
    - ffmpeg
    # ...
  ```
- (Optional) To use CUDA, ensure you have the appropriate NVIDIA drivers installed.
Run the CLI after activating the environment:

```bash
python src/main.py --input-video path/to/video.mp4 --output-srt path/to/output.srt --output-lang en
```

or use `conda run`:

```bash
conda run -n video_translator_env python src/main.py --input-video path/to/video.mp4 -o out.srt -ol es
```

You can also transcribe an entire folder of media files (batch mode) with `--input-folder`. See the examples below.
Use `--help` for the full option list:

```bash
python src/main.py --help
```

- `-i, --input-video PATH`: Path to the input video or audio file (single-file mode).
- `-I, --input-folder PATH`: Path to a folder containing media files. All supported files in the folder will be processed (batch mode).
- `-o, --output-srt PATH`: Path to save the output SRT (single-file mode). If omitted, the SRT is placed next to the input with the same name and a `.srt` extension.
- `-od, --output-dir PATH`: Target directory for SRT files when using `--input-folder`. If omitted, each SRT is placed next to its source file.
- `-il, --input-lang TEXT`: Input language code (e.g. `en`, `ru`). If not provided, Whisper will auto-detect.
- `-ol, --output-lang TEXT`: Output language code for translation (e.g. `es`, `fr`). Required unless `--transcribe-only` is set.
- `-t, --transcribe-only`: Only transcribe the input; do not translate.
- `-wm, --whisper-model TEXT`: Whisper model size (`tiny`, `base`, `small`, `medium`, `large`, ...).
- `-td, --translator-device TEXT`: Device for the translation model (`cuda` or `cpu`).
- `-trd, --transcriber-device TEXT`: Device for the transcription model (`cuda` or `cpu`).
- `--temp-audio-dir PATH`: Directory for temporary audio files (default: `temp_audio`).
- `-ldo, --lang-detect-offset TEXT`: Timestamp offset at which to start a short slice for language detection (e.g. `90`, `01:30`, `00:01:30`).
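The `-ldo` offset formats listed above can all be normalized to seconds with a simple positional rule. A minimal sketch (the function name is illustrative, not the tool's actual implementation):

```python
def parse_offset(value: str) -> float:
    """Parse '90', '01:30', or '00:01:30' into seconds."""
    seconds = 0.0
    for part in value.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds
```

Under this rule, `90`, `01:30`, and `00:01:30` all resolve to 90 seconds.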
- Transcription: Uses OpenAI Whisper for high-quality speech-to-text conversion
- Translation: Leverages Hugging Face Transformers for accurate language translation
- Batch Processing: Process entire folders of media files automatically
- Language Detection: Automatic source language detection with optional manual hints
- Multiple Formats: Supports common video and audio formats
- SRT Output: Generates standard SRT subtitle files with proper timing
- GPU Acceleration: CUDA support for faster processing (optional)
- Repetition Filtering: Automatically detects and cleans up Whisper's repetitive transcription loops
- GUI Integration: KDE context menu integration for easy right-click processing
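The SRT output mentioned above pairs a numbered cue with an `HH:MM:SS,mmm` time range. A minimal sketch of that formatting (hypothetical helpers, assuming segment times in seconds):

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms_total = int(round(seconds * 1000))
    s, ms = divmod(ms_total, 1000)
    return f"{s // 3600:02d}:{s % 3600 // 60:02d}:{s % 60:02d},{ms:03d}"

def srt_cue(index: int, start: float, end: float, text: str) -> str:
    """One SRT cue: index line, time range, text, trailing blank line."""
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n\n"
```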
Supported common video formats: .mp4, .mkv, .mov, .avi, .wmv
Supported audio formats: .mp3, .wav, .flac, .aac, .m4a
The CLI checks file extensions first and falls back to `ffprobe` to detect supported media when the extension is unknown. Ensure `ffmpeg`/`ffprobe` are available in your environment.
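The extension check with `ffprobe` fallback could look roughly like this (a sketch; the function name and exact `ffprobe` invocation are illustrative, assuming `ffprobe` is on PATH):

```python
import subprocess
from pathlib import Path

SUPPORTED_EXTS = {".mp4", ".mkv", ".mov", ".avi", ".wmv",
                  ".mp3", ".wav", ".flac", ".aac", ".m4a"}

def is_supported_media(path: str) -> bool:
    """Accept known extensions; otherwise ask ffprobe to identify the file."""
    if Path(path).suffix.lower() in SUPPORTED_EXTS:
        return True
    # Unknown extension: ffprobe exits non-zero if it cannot read the file
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries", "format=format_name",
         "-of", "default=noprint_wrappers=1", path],
        capture_output=True, text=True,
    )
    return result.returncode == 0 and bool(result.stdout.strip())
```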
Single-file transcribe + translate:

```bash
python src/main.py -i myvideo.mp4 -o subtitles.srt -ol es
```

Single-file transcribe only:

```bash
python src/main.py -i myaudio.mp3 -o transcript.srt --transcribe-only
```

Batch: process all supported files in a folder, writing SRTs next to each source:

```bash
python src/main.py -I /path/to/media_folder -td cpu -trd cpu -ol es
```

Batch: process a folder and place all SRTs in a target directory:

```bash
python src/main.py -I /path/to/media_folder -od /path/to/output_srt_dir -ol es
```

Batch example for MKV files mixed with others (folder mode filters supported extensions):

```bash
python src/main.py -I /home/agentic/videos -od /home/agentic/videos/subtitles -ol en
```

Notes:
- For single-file mode, if `-o/--output-srt` is omitted, the SRT file is created next to the source file with a `.srt` extension.
- For folder mode, each processed file produces an SRT with the original filename stem and a `.srt` extension.
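The naming rule in these notes amounts to swapping the media extension for `.srt`, optionally redirecting into the `-od` directory. A sketch (hypothetical helper):

```python
from pathlib import Path
from typing import Optional

def default_srt_path(media_path: str, output_dir: Optional[str] = None) -> Path:
    """Same filename stem with .srt; placed in output_dir if given,
    otherwise next to the source file."""
    src = Path(media_path)
    target = Path(output_dir) if output_dir else src.parent
    return target / (src.stem + ".srt")
```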
For KDE users, the project includes a context menu integration that lets you right-click on video files and translate them directly from the Dolphin file manager.
Setup (automatic with the provided script):

- A service menu file is installed to `~/.local/share/kio/servicemenus/video-translator.desktop`
- Right-click on any video/audio file → "Video Translator" submenu
- Choose your target language or "Transcribe Only"
- A terminal window shows progress and results
Available Options:
- Translate to English
- Translate to Spanish
- Translate to French
- Transcribe Only (no translation)
The context menu automatically handles file paths with spaces and provides visual feedback during processing.
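For reference, a KDE service menu of this kind typically looks like the following sketch. The paths, action names, and MIME list here are illustrative assumptions, not the file the install script actually writes:

```ini
[Desktop Entry]
Type=Service
X-KDE-ServiceTypes=KonqPopupMenu/Plugin
MimeType=video/mp4;video/x-matroska;audio/mpeg;
Actions=translateEn;transcribeOnly
X-KDE-Submenu=Video Translator

[Desktop Action translateEn]
Name=Translate to English
Exec=konsole -e bash -c 'conda run -n video_translator_env python /path/to/src/main.py -i %f -ol en; read'

[Desktop Action transcribeOnly]
Name=Transcribe Only
Exec=konsole -e bash -c 'conda run -n video_translator_env python /path/to/src/main.py -i %f -t; read'
```

KDE substitutes `%f` with the clicked file's path, which is how paths with spaces are handled.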
- If you get ffmpeg/ffprobe "not found" errors, ensure `ffmpeg` is installed and on PATH in the conda env:

  ```bash
  conda activate video_translator_env
  conda install ffmpeg
  ffmpeg -version
  ffprobe -version
  ```
- If the tool refuses a file, run `ffprobe path/to/file` to confirm ffmpeg recognizes it.
- Repetitive Text in Transcripts: the tool automatically detects and cleans up Whisper's repetitive loops (like "yeah, yeah, yeah..." repeated dozens of times). The cleaning process:
  - Limits consecutive word repetitions to a maximum of 5 instances
  - Detects segments where a single word makes up >70% of the content
  - Replaces obviously looped segments with "[unclear audio]"
  - Preserves natural speech patterns and reasonable repetitions
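A minimal sketch of this kind of cleanup, with thresholds matching the bullets above (illustrative; not the tool's exact code):

```python
from collections import Counter

MAX_REPEATS = 5     # cap on consecutive identical words
LOOP_RATIO = 0.70   # fraction above which a segment counts as a loop

def clean_repetitions(text: str) -> str:
    """Heuristically clean Whisper-style repetition loops from a segment."""
    words = text.split()
    if not words:
        return text
    norm = [w.lower().strip(".,!?") for w in words]
    # A single word dominating a long segment is treated as a loop artifact.
    _, count = Counter(norm).most_common(1)[0]
    if len(words) >= 10 and count / len(words) > LOOP_RATIO:
        return "[unclear audio]"
    # Otherwise cap consecutive repetitions of the same word.
    cleaned, prev, streak = [], None, 0
    for w, key in zip(words, norm):
        streak = streak + 1 if key == prev else 1
        if streak <= MAX_REPEATS:
            cleaned.append(w)
        prev = key
    return " ".join(cleaned)
```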
- KDE Context Menu Issues: if the right-click menu doesn't appear or work:

  ```bash
  # Rebuild the KDE service cache
  kbuildsycoca5
  # Check that the service file exists
  ls ~/.local/share/kio/servicemenus/video-translator.desktop
  ```
License: MIT