v0.3.0 by erogol · Pull Request #803 · coqui-ai/TTS

erogol · 2021-09-13T07:39:11Z

🐸 v0.3.0

New `ForwardTTS` implementation.

This version implements a new `ForwardTTS` interface that can be configured as any feed-forward TTS model that uses a duration predictor at inference time. Three pre-configured models are currently provided, and a fourth is planned:

- SpeedySpeech
- FastSpeech
- FastPitch
- FastSpeech 2 (TODO)
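
What these models have in common is that predicted per-token durations decide how many output frames each input token covers. A minimal sketch of that expansion step (often called length regulation) is shown here in plain Python. The function name `length_regulate` and the toy inputs are illustrative assumptions, not code from this PR or from the `ForwardTTS` class.

```python
from typing import List, Sequence


def length_regulate(token_states: Sequence[str], durations: Sequence[int]) -> List[str]:
    """Repeat each token-level state by its duration, measured in frames.

    Hypothetical helper for illustration only; a real model would repeat
    encoder output vectors rather than strings.
    """
    if len(token_states) != len(durations):
        raise ValueError("exactly one duration per token is required")
    frames: List[str] = []
    for state, duration in zip(token_states, durations):
        if duration < 0:
            raise ValueError("durations must be non-negative")
        # A token with duration d contributes d consecutive frames.
        frames.extend([state] * duration)
    return frames


# Toy example: four tokens with made-up durations.
tokens = ["h", "e", "l", "o"]
durations = [2, 3, 1, 4]
frames = length_regulate(tokens, durations)
```

The number of frames equals the sum of the durations, which is why a duration predictor alone is enough to fix the output length at inference time.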

Through this API, any model can be trained in one of two ways: with pre-computed durations from a pre-trained Tacotron model, or with an alignment network that learns durations from the dataset. The alignment network is used only during training and is discarded at inference. You select the mode by setting the `use_aligner` field in the configuration.
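
The two training modes can be pictured with a small, self-contained sketch. Only the field name `use_aligner` comes from the description above. The `SketchConfig` class, the `training_durations` function, and the `uniform_aligner` stand-in are hypothetical and do not reflect the actual config classes or aligner in the library.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class SketchConfig:
    # Mirrors the `use_aligner` field named above; the class itself is hypothetical.
    use_aligner: bool = True


def uniform_aligner(num_tokens: int, num_frames: int) -> List[int]:
    """Toy stand-in for a learned aligner: spread frames evenly over tokens."""
    base, extra = divmod(num_frames, num_tokens)
    return [base + (1 if i < extra else 0) for i in range(num_tokens)]


def training_durations(
    config: SketchConfig,
    num_tokens: int,
    num_frames: int,
    precomputed: Optional[Sequence[int]] = None,
    aligner: Optional[Callable[[int, int], List[int]]] = None,
) -> List[int]:
    """Pick the duration targets for one training sample based on the mode."""
    if config.use_aligner:
        if aligner is None:
            raise ValueError("use_aligner=True requires an aligner")
        durations = aligner(num_tokens, num_frames)
    else:
        if precomputed is None:
            raise ValueError("use_aligner=False requires pre-computed durations")
        durations = list(precomputed)
    # Either way, durations must cover every token and account for every frame.
    if len(durations) != num_tokens or sum(durations) != num_frames:
        raise ValueError("durations must match the token count and sum to the frame count")
    return durations


# Mode 1: durations come from an aligner during training.
learned = training_durations(SketchConfig(use_aligner=True), 4, 10, aligner=uniform_aligner)

# Mode 2: durations were extracted beforehand (for example from a teacher model).
loaded = training_durations(SketchConfig(use_aligner=False), 4, 10, precomputed=[2, 3, 1, 4])
```

In both branches the model is trained against the same kind of target, one frame count per token, so the inference path does not need to know which mode was used.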

This new API will help us design a more efficient inference runtime for all of these models using ONNX-like runtime optimizers.

The old FastPitch and SpeedySpeech implementations are deprecated in favor of this new implementation.

Fine-Tuning Documentation

This version introduces documentation for model fine-tuning. It will be available at https://tts.readthedocs.io/ once this PR is merged.

erogol added 30 commits September 7, 2021 08:01

Update README.md with new models

bb2c3df

Update .gitignore

674c72b

Update comment and add a warning

e20ea57

Add FastPitch documentation

b0b96b4

Fix imports

4761853

Stage TTS.tts.utils.helpers

537c857

Fix extract_tts_spectrograms.py model init

807f1d3

Fix trainer's scheduler restoring

6c4c106

Fix logging current learning rate in trainer

abf5e48

Fix best_model_path init if no best_mode

2dfc5bd

Move MAS to TTS.tts.utils.helpers

bfc6cea

Update notebook compat

1de010a

Style extract_tts_spectrogram.py

3c740d4

Implement forward_tts

8b7e094

- Generic API for feed-forward TTS models (FastPitch, SpeedySpeech)
- Tests for `forward-tts`
- Edit FastPitchConfig and SpeedySpeechConfig to use `forward_tts`

Test TTS.tts.utils.helpers

ed4b1d8

Warn user if nan in GL

742f9c5

Fix Vits imports

3c16013

Remove fastpitch.py and speedy_speech.py

0541a25

Fix GPU init in tests

3abc3a1

Implement ForwardTTSLoss

570d597

Fix glow_tts imports

a89eb12

Style update

d6e29ef

Add LJSpeech SpeedySpeech recipe

22822cd

Add base_model field to forward_tts configs

6673202

Edit AlignTTS

ab37fa9

Update tacotron r init

d5f256b

Use glow-tts in synthesis tests

7d8f773

Remove speedy_speech implementation

1ebf9ec

Skip TF tests on GPU

7ec23e6

Remove unused import

d979526

erogol and others added 8 commits September 10, 2021 17:47

Remove SpeedySpeech from .models.json

26f76fc

Test FastPitch train

1e7db32

Add FastSpeechConfig

cbbc9e0

Add docs to Makefile

bb69e71

Update SpeedySpeech config

1ea0115

Add forward_tts docs

edc8d4d

Add fine-tunning documentation

69dd36e

Merge pull request #800 from coqui-ai/forward_tts

aed9a32

Forward TTS implementation

erogol added the 🚀 new version label Sep 13, 2021

erogol added 3 commits September 13, 2021 08:22

Add new models to .models.json

91bebeb

- SpeedySpeech model using `ForwardTTS`
- UnivNet model fine-tuned on TacotronDDC_ph spectrograms

Fix trainer malformatted print

a97dc8d

Bump up to v0.3.0

f563415

erogol merged commit 0592a58 into main Sep 13, 2021

erogol merged 41 commits into main from dev
