
improve the librispeech recipe#354

Merged
sw005320 merged 5 commits into espnet:master from sw005320:improve_librispeech
Aug 24, 2018

Conversation

Contributor

@sw005320 sw005320 commented Aug 14, 2018

I'm now improving the librispeech recipe, motivated by the RWTH setup (thanks to Rohit Prabhavalkar and Kazuki Irie):

  • sentencepiece model as the default
  • VGG-BLSTM encoder
  • shallow and wide network (3-layer BLSTM with 1024 units for the encoder, 1024-dim attention, unidirectional LSTM with 1024 units for the decoder)
  • fast convergence (the maximum number of epochs is reduced from 15 to 10)
  • significant WER improvement (from 7.2 to 5.1 on test_clean)
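For quick reference, the new defaults above can be summarized as a plain configuration sketch (the dict keys below are illustrative for readability only, not the actual ESPnet flag names):

```python
# Illustrative summary of the new librispeech recipe defaults listed above.
# NOTE: these key names are made up for this sketch; they are NOT the real
# ESPnet configuration flags.
config = {
    "token_unit": "sentencepiece",  # subword units as the default
    "encoder": {"type": "vgg-blstm", "layers": 3, "units": 1024},
    "attention_dim": 1024,
    "decoder": {"type": "unidirectional-lstm", "layers": 1, "units": 1024},
    "max_epochs": 10,  # reduced from 15 for faster convergence
}

print(config["encoder"]["type"], config["max_epochs"])
```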

TODO

  • check other configurations (adim and eprojs may not have to be so large?)
  • increase the number of encoder layers from 3 to x
  • increase the number of decoder layers from 1 to 2
  • tune some search parameters (rnnlm weight)
  • upload a model (Is there any already trained model? #322)

This was referenced Aug 14, 2018
@chenzhehuai

It is consistent with my observation except that adding more layers up to 6 with 1024 units still obtains some improvement.

@sw005320
Contributor Author

Thanks. I will follow your suggestions (when GPUs are available). Could you share your WER results? If they are better than https://github.com/espnet/espnet/blob/d51e76c0baa556e28a3e090335944478828fbc65/egs/librispeech/asr1/RESULTS, then I may follow your network architecture or ask you to make a PR.

@sw005320
Contributor Author

The trained models can be provided through the release (e.g., https://github.com/espnet/espnet/releases/download/untagged-f5ccde023841a43380a9/librispeech_asr1.tgz)

@chenzhehuai

No, my observation is from train_100. I think your system is the best so far.

@sw005320 sw005320 changed the title [WIP] improve the librispeech recipe improve the librispeech recipe Aug 24, 2018
@sw005320 sw005320 merged commit 168a9e9 into espnet:master Aug 24, 2018
@ruizhilijhu

@sw005320 Hi Shinji, is there any published paper on the RWTH setup? I couldn't find one online.

@sw005320
Contributor Author

sw005320 commented Sep 7, 2018

https://arxiv.org/pdf/1805.03294
This is not exactly the same as what we're now using, but our setup is based on discussions with them about what would be most effective given our current implementation.

@ruizhilijhu

From this paper, their system experimented with a subsampling factor of 32 in pretraining and 8 for fine-tuning, and they showed some improvement.

In our babel-10 setting, we used a subsampling factor of 4, and we allowed the minimum number of frames in an utterance to be 10.

Is there any specific reason to use our current setting?

@sw005320
Contributor Author

We don't have a specific reason, and we may test more aggressive subsampling like the RWTH paper, but I internally found that further subsampling (a factor of 8) slightly degrades performance on the Librispeech task.
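The trade-off discussed here is easy to see with a back-of-the-envelope frame count (a simplified model that just keeps every n-th frame; the actual VGG front end subsamples via strided convolution/pooling, but the arithmetic is the same):

```python
import math

def subsampled_length(num_frames, factor):
    """Encoder time steps remaining after frame-rate reduction,
    in a simplified keep-every-`factor`-th-frame model."""
    return math.ceil(num_frames / factor)

# With the 10-frame minimum utterance length mentioned for the babel-10
# setting, a larger factor leaves very few encoder steps:
print(subsampled_length(10, 4))    # → 3
print(subsampled_length(10, 8))    # → 2
print(subsampled_length(1000, 8))  # → 125 for a ~10 s utterance
```

This illustrates why aggressive subsampling can hurt short utterances even when it speeds up training overall.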
