TTS Reader

Leggi in italiano

A tool for reading text files aloud in Italian, with both a web interface and a command-line interface. Supports Markdown, TXT, EPUB, DOCX, HTML, and PDF with 5 Italian neural voices. The default voice (Giuseppe) is multilingual and correctly pronounces English terms within Italian text.

Features

Read text files aloud (Markdown, TXT, EPUB, DOCX, HTML, PDF), paragraph by paragraph
8 voices: 7 online (Edge TTS) + 1 offline (Piper TTS)
Italian and English voices with multilingual support
4 reading styles: Neutral, Newscast, Audiobook, Slow (Edge TTS only)
Web interface with audio player (play, pause, stop, previous, next, repeat)
Bilingual UI: Italian and English with language switcher
CLI for terminal usage
Save audio as MP3 (single file + individual paragraphs)
Smart prefetch: synthesizes the next paragraph during playback
Automatic Markdown-to-plain-text conversion (via pandoc or regex fallback)

System Requirements

Python 3.10+
ffmpeg (audio conversion and playback)
pandoc (optional, improves Markdown conversion)

The CLI automatically detects the operating system and uses the appropriate audio player: aplay on Linux, afplay on macOS, ffplay on Windows. If the native player is unavailable, ffplay (included with ffmpeg) is used as a fallback on all systems.

Installation

System Dependencies

Linux (Debian/Ubuntu)

sudo apt install ffmpeg alsa-utils pandoc

Linux (Fedora)

sudo dnf install ffmpeg alsa-utils pandoc

Linux (Arch)

sudo pacman -S ffmpeg alsa-utils pandoc

macOS

brew install ffmpeg pandoc

afplay is already included in macOS.

Windows

# With Chocolatey
choco install ffmpeg pandoc

# Or with Scoop
scoop install ffmpeg pandoc

Automated Setup (Recommended)

git clone https://github.com/AndreaBonn/text-to-speech.git
cd text-to-speech

# Linux/macOS
bash scripts/setup.sh

# Windows (PowerShell)
powershell -ExecutionPolicy Bypass -File scripts\setup.ps1

The script checks for Python 3.10+, verifies system dependencies, creates a virtual environment, and installs Python packages.

Manual Setup

git clone https://github.com/AndreaBonn/text-to-speech.git
cd text-to-speech
python -m venv venv
source venv/bin/activate        # Linux/macOS
# venv\Scripts\activate         # Windows (cmd)
# venv\Scripts\Activate.ps1     # Windows (PowerShell)
pip install -r requirements.txt

Usage

Web Interface

source venv/bin/activate
python app.py

Open http://localhost:5000 in your browser. The interface allows you to:

Upload a file from disk (MD, TXT, EPUB, DOCX, HTML, PDF)
Choose a voice from the dropdown menu
Select a reading style (Neutral, Newscast, Audiobook, Slow)
Switch between Italian and English UI using the language buttons (IT/EN)
Use player controls: play/pause, stop, previous, next, repeat
Click the progress bar to jump to any paragraph
Download the complete audio as MP3

Keyboard shortcuts: Space play/pause, Left/Right arrow prev/next, R repeat.

The reading style is automatically suggested based on file format:

EPUB files → Audiobook style (relaxed pace)
Markdown/HTML → Newscast style (faster pace)
Other formats → Neutral style

Command Line

source venv/bin/activate

# Read with default voice (Giuseppe, multilingual)
python leggi.py file.md

# Choose a voice
python leggi.py file.md --voice isabella

# English voices
python leggi.py document.md --voice andrew
python leggi.py document.md --voice ava

# Offline voice (no internet required)
python leggi.py file.md --voice paola

# Save as MP3
python leggi.py file.md --voice giuseppe --salva

Note: Reading styles are only available in the web interface. The CLI uses the default neutral style for all voices.

With --salva, the following structure is created:

data/output/<filename>/
├── full/<filename>.mp3       # Complete audio
└── paragraphs/
    ├── 001.mp3                # Individual paragraphs
    ├── 002.mp3
    └── ...

Available Voices

Voice	Engine	Gender	Language	Multilingual	Requires Internet
giuseppe	Edge TTS	Male	Italian	Yes (IT/EN)	Yes
isabella	Edge TTS	Female	Italian	No	Yes
elsa	Edge TTS	Female	Italian	No	Yes
diego	Edge TTS	Male	Italian	No	Yes
andrew	Edge TTS	Male	English	Yes (EN/IT)	Yes
ava	Edge TTS	Female	English	Yes (EN/IT)	Yes
ryan	Edge TTS	Male	English	No	Yes
paola	Piper TTS	Female	Italian	No	No (offline)

Giuseppe is recommended for technical texts with English terms. Paola works without an internet connection (the model is automatically downloaded on first use, approximately 60 MB).

Reading Styles

The web interface offers 4 reading styles (Edge TTS voices only):

Style	Speed	Pitch	Best For
Neutral	Normal	Normal	General reading
Newscast	+13%	+5Hz	News, articles, fast-paced content
Audiobook	-8%	-3Hz	Books, relaxed listening
Slow	-20%	Normal	Study, comprehension, language learning

The style is automatically suggested based on file format (e.g., EPUB → Audiobook, MD → Newscast).

Security

This project implements multiple security layers. See SECURITY.md for a detailed overview of all mechanisms in place.

Project Structure

text-to-speech/
├── app.py              # Flask web server
├── tts_engine.py       # TTS engine with cache and prefetch
├── synthesis.py        # Speech synthesis functions (Piper, Edge)
├── config.py           # Voice configuration, model paths, constants
├── translations.py     # Backend translations (API messages, styles)
├── leggi.py            # CLI: terminal reading
├── converters.py       # Format converters → plain text
├── static/
│   ├── style.css       # Design system (Ink & Amber)
│   ├── player.js       # JavaScript audio player
│   └── i18n.js         # Frontend i18n system (IT/EN)
├── templates/
│   └── index.html      # Web interface (HTML only)
├── tests/
│   ├── conftest.py     # Shared fixtures (client, engine)
│   └── test_*.py       # Test suite (pytest)
├── scripts/
│   ├── setup.sh       # Automated setup Linux/macOS
│   └── setup.ps1      # Automated setup Windows
├── data/
│   ├── input/          # Source files to read
│   └── output/         # Audio generated with --salva
├── requirements.txt    # Python dependencies
└── README.md

License

This project is released under the GPL-3.0 license. See the LICENSE file for details.

Disclaimers

Edge TTS voices (giuseppe, isabella, elsa, diego, andrew, ava, ryan) use an unofficial Microsoft Edge "Read Aloud" API. This service is not guaranteed and may stop working at any time. It is not authorized for commercial use. For commercial applications, consider Azure AI Speech.

Single-user: the web application uses a single shared TTS engine instance across all requests. It is not designed for simultaneous multi-user access. If multiple users upload files concurrently, data will be overwritten.

The Paola voice (Piper TTS) is fully offline and free from restrictions (training dataset under CC0 public domain license).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

TTS Reader

Features

System Requirements

Installation

System Dependencies

Automated Setup (Recommended)

Manual Setup

Usage

Web Interface

Command Line

Available Voices

Reading Styles

Security

Project Structure

License

Disclaimers

About

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 28 Commits
.github/workflows		.github/workflows
data		data
scripts		scripts
src		src
static		static
templates		templates
tests		tests
.env.example		.env.example
.gitignore		.gitignore
LICENSE		LICENSE
README.it.md		README.it.md
README.md		README.md
SECURITY.it.md		SECURITY.it.md
SECURITY.md		SECURITY.md
app.py		app.py
leggi.py		leggi.py
requirements.txt		requirements.txt
ruff.toml		ruff.toml

Folders and files

Latest commit

History

Repository files navigation

TTS Reader

Features

System Requirements

Installation

System Dependencies

Automated Setup (Recommended)

Manual Setup

Usage

Web Interface

Command Line

Available Voices

Reading Styles

Security

Project Structure

License

Disclaimers

About

Topics

Resources

License

Security policy

Uh oh!

Stars

Watchers

Forks

Contributors

Uh oh!

Languages