README-zh.md

NATSpeech: A Non-Autoregressive Text-to-Speech Framework

This repository contains the official PyTorch implementations of the following works:

PortaSpeech: Portable and High-Quality Generative Text-to-Speech (NeurIPS 2021)
Demo page | HuggingFace🤗 Demo
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (DiffSpeech) (AAAI 2022)
Demo page | Project page | HuggingFace🤗 Demo

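Key Features

This framework implements the following features:

A data processing pipeline for non-autoregressive text-to-speech, based on the Montreal Forced Aligner;
An easy-to-use and extensible training and testing framework;
A simple but efficient random-access dataset class.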
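To illustrate what a random-access dataset class can look like, here is a minimal sketch. It is not the implementation shipped in this repository: the class names `IndexedDatasetBuilder` and `IndexedDataset`, the `.data`/`.idx` file suffixes, and the use of `pickle` are all assumptions made for the example. The idea is to append serialized items to one file while recording byte offsets, so that any item can later be read with a single seek.

```python
import os
import pickle
import tempfile


class IndexedDatasetBuilder:
    """Hypothetical writer: appends pickled items to <prefix>.data
    and stores their byte offsets in <prefix>.idx."""

    def __init__(self, prefix):
        self.prefix = prefix
        self.data_file = open(prefix + ".data", "wb")
        self.offsets = [0]  # offsets[i] is where item i starts

    def add_item(self, item):
        payload = pickle.dumps(item)
        self.data_file.write(payload)
        self.offsets.append(self.offsets[-1] + len(payload))

    def finalize(self):
        self.data_file.close()
        with open(self.prefix + ".idx", "wb") as f:
            pickle.dump(self.offsets, f)


class IndexedDataset:
    """Hypothetical reader: fetches item i with one seek,
    without loading the whole data file into memory."""

    def __init__(self, prefix):
        with open(prefix + ".idx", "rb") as f:
            self.offsets = pickle.load(f)
        self.data_file = open(prefix + ".data", "rb")

    def __len__(self):
        return len(self.offsets) - 1

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        start, end = self.offsets[i], self.offsets[i + 1]
        self.data_file.seek(start)
        return pickle.loads(self.data_file.read(end - start))

    def close(self):
        self.data_file.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        prefix = os.path.join(tmp, "train")
        builder = IndexedDatasetBuilder(prefix)
        for n in range(3):
            # Made-up items standing in for processed utterances.
            builder.add_item({"item_name": "utt%d" % n, "mel_len": 100 + n})
        builder.finalize()

        ds = IndexedDataset(prefix)
        print(len(ds), ds[2]["item_name"])
        ds.close()
```

Because every read is an independent seek, a class like this can be wrapped in a `torch.utils.data.Dataset` and shuffled freely. With multiple data-loading workers, each worker would typically open its own file handle.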

Installation

## Tested on Linux/Ubuntu 18.04
## Requires Python 3.6+ (Anaconda recommended)

export PYTHONPATH=.
# Create a virtual environment (recommended).
python -m venv venv
source venv/bin/activate
# Install dependencies
pip install -U pip
pip install Cython numpy==1.19.1
pip install torch==1.9.0 # torch >= 1.9.0 recommended
pip install -r requirements.txt
sudo apt install -y sox libsox-fmt-mp3
bash mfa_usr/install_mfa.sh # Install the forced alignment tool
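After installation, a quick sanity check can confirm that the interpreter and PyTorch meet the versions mentioned above. The following is a minimal sketch and not part of this repository; the helper names `version_tuple` and `check_environment` are hypothetical.

```python
import re
import sys


def version_tuple(version):
    """Turn a version string such as '1.9.0+cu111' into (1, 9, 0).

    Local build suffixes ('+cu111') and pre-release tags ('rc1') are ignored.
    """
    core = version.split("+")[0]
    parts = []
    for piece in core.split(".")[:3]:
        m = re.match(r"\d+", piece)
        parts.append(int(m.group()) if m else 0)
    while len(parts) < 3:  # pad '1.9' to (1, 9, 0)
        parts.append(0)
    return tuple(parts)


def check_environment(min_python=(3, 6), min_torch=(1, 9, 0)):
    """Return a list of human-readable problems; empty means all checks passed."""
    problems = []
    if sys.version_info[:2] < min_python:
        problems.append("Python %d.%d+ is required" % min_python)
    try:
        import torch
    except ImportError:
        problems.append("torch is not installed")
    else:
        if version_tuple(torch.__version__) < min_torch:
            problems.append(
                "torch %s is older than the recommended 1.9.0" % torch.__version__
            )
    return problems


if __name__ == "__main__":
    for problem in check_environment() or ["environment looks fine"]:
        print(problem)
```

The check only covers the Python and torch versions named in the installation steps; it does not verify `sox` or the forced aligner.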

Documentation

Citation

If this repository is useful for your research or work, please cite the following papers:

PortaSpeech

@article{ren2021portaspeech,
  title={PortaSpeech: Portable and High-Quality Generative Text-to-Speech},
  author={Ren, Yi and Liu, Jinglin and Zhao, Zhou},
  journal={Advances in Neural Information Processing Systems},
  volume={34},
  year={2021}
}

DiffSpeech

@article{liu2021diffsinger,
  title={Diffsinger: Singing voice synthesis via shallow diffusion mechanism},
  author={Liu, Jinglin and Li, Chengxi and Ren, Yi and Chen, Feiyang and Liu, Peng and Zhao, Zhou},
  journal={arXiv preprint arXiv:2105.02446},
  volume={2},
  year={2021}
}

Acknowledgements

Our code is inspired by the following code and repositories:
