I have tried training with features other than cpc on my prepared corpus.
However, the training script fails when the loss is computed (train.py, line 69).
I found that the size of the output vector out is hard-coded, so it does not match the size of the target Mel spectrogram for other features.
The sizes of the relevant vectors in the model are:
- apc case:
Input dim: 512, Reference dim: 512, Target dim: 240
- cpc case:
Input dim: 256, Reference dim: 256, Target dim: 80
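To illustrate the mismatch, here is a minimal sketch that is not taken from the repository. It assumes an L1 loss and a (batch, channels, frames) tensor layout, both of which are my assumptions; the batch size and frame count are arbitrary.

```python
import warnings

import torch
import torch.nn.functional as F

# Assumed layout: (batch, channels, frames). Batch and frame counts are arbitrary.
out = torch.zeros(2, 80, 50)      # model output with the hard-coded size 80
target = torch.zeros(2, 240, 50)  # apc target with Target dim 240

mismatch_error = None
try:
    with warnings.catch_warnings():
        # PyTorch first warns about differing sizes, then fails to broadcast.
        warnings.simplefilter("ignore")
        F.l1_loss(out, target)
except RuntimeError as err:
    mismatch_error = err

print("loss failed:", mismatch_error is not None)
```

With a target of size 80 (the cpc case) the same call succeeds, which would explain why only the other features fail.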
I prepared the input feature vectors with preprocess.py, e.g. python .\preprocess.py (my own corpus) apc .\checkpoints\wav2vec_small.pt processed/apc.
I have modified the model by changing the size of the vectors and can run train.py now.
In model.py, in __init__() of S2VC, I replaced the hard-coded 80 with a function argument and pass in the size of the Mel vector.
However, I cannot tell whether this modification is appropriate, as I am not familiar with NLP.
convert_batch.py with the pre-trained models works well, as you described in README.md.
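The kind of change I mean can be sketched as follows. This is not the actual S2VC code: TinyDecoder, d_model and out_dim are hypothetical names, the layers are placeholders, and the L1 loss and the (batch, channels, frames) layout are assumptions.

```python
import torch
import torch.nn as nn


class TinyDecoder(nn.Module):
    """Hypothetical stand-in for the model; not the real S2VC architecture."""

    def __init__(self, input_dim: int, d_model: int = 64, out_dim: int = 80):
        super().__init__()
        self.encoder = nn.Conv1d(input_dim, d_model, kernel_size=1)
        # The output size used to be the literal 80; it now comes from out_dim.
        self.proj = nn.Conv1d(d_model, out_dim, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, input_dim, frames) -> (batch, out_dim, frames)
        return self.proj(torch.relu(self.encoder(x)))


criterion = nn.L1Loss()  # assumed loss, for illustration only
for name, input_dim, target_dim in [("apc", 512, 240), ("cpc", 256, 80)]:
    model = TinyDecoder(input_dim, out_dim=target_dim)
    feats = torch.randn(2, input_dim, 50)
    target = torch.randn(2, target_dim, 50)
    out = model(feats)
    loss = criterion(out, target)  # shapes now agree for both features
    print(name, tuple(out.shape))
```

Keeping 80 as the default value means the cpc case and the pre-trained checkpoints would behave as before, while other features pass their own target size.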
Other details of my situation are:
- Windows 10, PowerShell
- pytorch 1.7.1 + cu110
- torchaudio 0.7.1
- sox 1.4.1
- tqdm 4.42.0
- librosa 0.8.1