This project is a PyTorch reimplementation of the paper CycleGAN-VC2: Improved CycleGAN-based Non-parallel Voice Conversion, a model that performs strongly on timbre conversion and voice cloning.
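The project uses CycleGAN for voice conversion (VC): converting one person's speech into another person's voice, for example a male voice into a female voice or the reverse. CycleGAN is a model based on generative adversarial networks (GANs) that learns to translate between two data domains without paired examples, such as turning photographs into artwork. Here it learns the mapping between the speech of two different speakers, which is what makes the conversion possible. The implementation is built on PyTorch and uses Mel-spectrogram feature extraction together with a WaveNet vocoder to generate the converted speech.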
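The cycle-consistency idea behind CycleGAN can be sketched in a few lines. This is a toy illustration in plain Python, not the project's training code: the "generators" below are hypothetical hand-written functions standing in for the neural networks, and the feature sequences are made-up numbers.

```python
# Toy illustration of cycle consistency. In the real model the generators
# are neural networks; here they are hypothetical stand-in functions.

def l1_distance(xs, ys):
    """Mean absolute difference between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)


def cycle_consistency_loss(real_a, real_b, g_a2b, g_b2a):
    """Penalty for failing to recover the input after a round trip
    A -> B -> A and B -> A -> B."""
    cycled_a = g_b2a(g_a2b(real_a))
    cycled_b = g_a2b(g_b2a(real_b))
    return l1_distance(real_a, cycled_a) + l1_distance(real_b, cycled_b)


def g_a2b(features):
    """Stand-in generator for speaker A -> speaker B."""
    return [2.0 * f + 1.0 for f in features]


def g_b2a_exact(features):
    """Stand-in generator for B -> A that exactly inverts g_a2b."""
    return [(f - 1.0) / 2.0 for f in features]


def g_b2a_biased(features):
    """Stand-in generator for B -> A with a constant error of 0.5."""
    return [(f - 1.0) / 2.0 + 0.5 for f in features]


real_a = [0.0, 1.0, 2.0]  # made-up features for speaker A
real_b = [1.0, 3.0, 5.0]  # made-up features for speaker B

loss_exact = cycle_consistency_loss(real_a, real_b, g_a2b, g_b2a_exact)
loss_biased = cycle_consistency_loss(real_a, real_b, g_a2b, g_b2a_biased)
print(loss_exact, loss_biased)
```

A pair of generators that invert each other incurs no penalty, while any mismatch on the round trip raises the loss. This constraint is what lets the model train on non-parallel recordings.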
- Dataset
- VC
- Chinese male speakers (S0913 from AISHELL-Speech and GaoXiaoSong, a Chinese celebrity)
- Usage
- Training
- Example
- Demo
To advance the research on non-parallel VC, we propose CycleGAN-VC2, which is an improved version of CycleGAN-VC incorporating three new techniques: an improved objective (two-step adversarial losses), improved generator (2-1-2D CNN), and improved discriminator (Patch GAN).
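As a rough sketch of what "two-step adversarial losses" means for the generator: one adversarial term is applied to the converted features, and a second term, judged by an additional discriminator, is applied to the cyclically reconstructed features. The least-squares form of the loss and all function and parameter names below are assumptions made for illustration. Discriminator outputs are represented as plain lists of scores rather than network outputs.

```python
# Sketch of a two-step adversarial objective for the generator.
# Assumption: least-squares GAN losses; names are hypothetical.

def lsgan_generator_loss(scores_on_fake):
    """Generator term: push discriminator scores on fakes toward 1."""
    return sum((s - 1.0) ** 2 for s in scores_on_fake) / len(scores_on_fake)


def lsgan_discriminator_loss(scores_on_real, scores_on_fake):
    """Discriminator term: score real samples near 1 and fakes near 0."""
    real_term = sum((s - 1.0) ** 2 for s in scores_on_real) / len(scores_on_real)
    fake_term = sum(s ** 2 for s in scores_on_fake) / len(scores_on_fake)
    return 0.5 * (real_term + fake_term)


def two_step_generator_loss(scores_on_converted, scores_on_cycled):
    """Step 1 scores G_A2B(a) with the target-domain discriminator.
    Step 2 scores G_B2A(G_A2B(a)) with an additional discriminator
    in the source domain."""
    step_one = lsgan_generator_loss(scores_on_converted)
    step_two = lsgan_generator_loss(scores_on_cycled)
    return step_one + step_two


# Made-up discriminator scores for a batch of two samples.
total = two_step_generator_loss([0.0, 1.0], [0.5, 0.5])
print(total)
```

The second term is aimed at the over-smoothing that a cycle-consistency loss alone tends to leave in the reconstructed features.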
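This project includes:
- Model code: a reimplementation of the model described in the paper.
- Speech preprocessing: prepares the training data.
- Training code: trains the model.
- Examples of Voice Conversion: samples converted by the trained model.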
pip install -r requirements.txt
python preprocess_training.py
Run with custom arguments:
python preprocess_training.py --train_A_dir ./data/S0913/ --train_B_dir ./data/gaoxiaosong/ --cache_folder ./cache/
python train.py
Run with custom arguments:
python train.py --logf0s_normalization ./cache/logf0s_normalization.npz --mcep_normalization ./cache/mcep_normalization.npz --coded_sps_A_norm ./cache/coded_sps_A_norm.pickle --coded_sps_B_norm ./cache/coded_sps_B_norm.pickle --model_checkpoint ./model_checkpoint/ --resume_training_at ./model_checkpoint/_CycleGAN_CheckPoint --validation_A_dir ./data/S0913/ --output_A_dir ./converted_sound/S0913 --validation_B_dir ./data/gaoxiaosong/ --output_B_dir ./converted_sound/gaoxiaosong/
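Because the training command takes many path arguments, it can help to assemble it programmatically. The sketch below builds the argument list from a cache folder and a checkpoint folder. The flag names and file names are taken from the command above; the helper `build_train_command` and its parameters are hypothetical and not part of the repository.

```python
import shlex


def build_train_command(cache_dir="./cache",
                        checkpoint_dir="./model_checkpoint",
                        resume_from=None):
    """Assemble the train.py argument list from the preprocessing cache.

    Hypothetical helper: only the flags and file names come from the
    README; everything else is an illustrative assumption.
    """
    cache = cache_dir.rstrip("/")
    command = [
        "python", "train.py",
        "--logf0s_normalization", f"{cache}/logf0s_normalization.npz",
        "--mcep_normalization", f"{cache}/mcep_normalization.npz",
        "--coded_sps_A_norm", f"{cache}/coded_sps_A_norm.pickle",
        "--coded_sps_B_norm", f"{cache}/coded_sps_B_norm.pickle",
        "--model_checkpoint", checkpoint_dir.rstrip("/") + "/",
    ]
    if resume_from is not None:
        # Only pass the resume flag when a checkpoint path is given.
        command += ["--resume_training_at", resume_from]
    return command


command = build_train_command(
    resume_from="./model_checkpoint/_CycleGAN_CheckPoint")
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the repository root once `preprocess_training.py` has written the cache files.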
A pretrained model that converts between S0913 and GaoXiaoSong is available.
Download from Google Drive (735 MB)
Samples converted with the pretrained model:
Speaker A: S0913 (./data/S0913/BAC009S0913W0351.wav)
Speaker B: GaoXiaoSong (./data/gaoxiaosong/gaoxiaosong_1.wav)
Speaker A's speech converted to Speaker B's voice, from S0913 to GaoXiaoSong (./converted_sound/S0913/BAC009S0913W0351.wav)
- CycleGAN-VC2: Improved CycleGAN-based Non-parallel Voice Conversion. Paper, Project
- Parallel-Data-Free Voice Conversion Using Cycle-Consistent Adversarial Networks. Paper, Project
- Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. Paper, Project, Code
- Image-to-Image Translation with Conditional Adversarial Nets. Paper, Project, Code
If this project saves you development time, you can buy me a cup of coffee :)
AliPay
WeChat Pay
MIT © Kun