CycleGAN-VC2-PyTorch

This project is a PyTorch reproduction of the paper CycleGAN-VC2: Improved CycleGAN-based Non-parallel Voice Conversion, a strong model for timbre conversion and voice cloning.

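This project uses CycleGAN for voice conversion (VC): converting one speaker's voice into another speaker's, for example a male voice into a female voice or vice versa. CycleGAN is a model based on generative adversarial networks (GANs) that automatically learns to translate between two different data domains, for example turning photos into artwork. Here it learns the mapping between the voices of two different speakers. The implementation is built on PyTorch and uses Mel-spectrogram feature extraction and a WaveNet vocoder to generate the converted speech.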
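
The cycle-consistency idea behind this mapping can be illustrated with a toy sketch. It is written in plain Python rather than PyTorch so that it runs without dependencies, and the two "generators" (`g_ab`, `g_ba_good`, `g_ba_bad`) are hypothetical hand-written functions, not the networks in this repository. The point is only that converting A to B and back should reproduce the input, and the cycle loss measures how far off the round trip is.

```python
from typing import Callable, List

Features = List[float]


def l1_distance(x: Features, y: Features) -> float:
    """Mean absolute difference between two equal-length feature vectors."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


def cycle_consistency_loss(
    x_a: Features,
    g_ab: Callable[[Features], Features],
    g_ba: Callable[[Features], Features],
) -> float:
    """L1 distance between x_a and its round trip A -> B -> A."""
    return l1_distance(x_a, g_ba(g_ab(x_a)))


# Hypothetical stand-ins for trained generators: speaker B's features are
# modelled as speaker A's features shifted by a constant offset.
def g_ab(x: Features) -> Features:
    return [v + 2.0 for v in x]


def g_ba_good(y: Features) -> Features:
    return [v - 2.0 for v in y]  # exact inverse of g_ab


def g_ba_bad(y: Features) -> Features:
    return [v - 1.5 for v in y]  # leaves a residual of 0.5 per element


x_a = [0.0, 1.0, -1.0, 0.5]
loss_good = cycle_consistency_loss(x_a, g_ab, g_ba_good)
loss_bad = cycle_consistency_loss(x_a, g_ab, g_ba_bad)
```

A generator pair that inverts each other gives a zero cycle loss, while a mismatched pair is penalized in proportion to the round-trip error.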

Dataset
- VC
- Chinese male speakers (S0913 from AISHELL-Speech and GaoXiaoSong, a Chinese celebrity)
Usage
- Training
- Example
Demo

CycleGAN-VC2

Paper, Project page

To advance the research on non-parallel VC, we propose CycleGAN-VC2, which is an improved version of CycleGAN-VC incorporating three new techniques: an improved objective (two-step adversarial losses), improved generator (2-1-2D CNN), and improved discriminator (Patch GAN).
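
The two-step adversarial loss can be sketched in a few lines. In the one-step form, a discriminator only judges directly converted features; the second step adds an extra discriminator that judges cyclically converted features (A to B and back to A) against real A features. The sketch below uses plain Python, a least-squares GAN objective, and an average over a list of per-patch scores to mimic a PatchGAN-style output. All function names, the toy score values, and the choice of least-squares loss are assumptions for illustration and are not taken from this repository's code.

```python
from typing import List

PatchScores = List[float]  # one realness score per patch, PatchGAN style


def lsgan_d_loss(real_scores: PatchScores, fake_scores: PatchScores) -> float:
    """Least-squares discriminator loss: push real scores to 1, fake to 0."""
    real_term = sum((s - 1.0) ** 2 for s in real_scores) / len(real_scores)
    fake_term = sum(s ** 2 for s in fake_scores) / len(fake_scores)
    return 0.5 * (real_term + fake_term)


def lsgan_g_loss(fake_scores: PatchScores) -> float:
    """Least-squares generator loss: push fake scores towards 1."""
    return sum((s - 1.0) ** 2 for s in fake_scores) / len(fake_scores)


def two_step_generator_loss(
    scores_converted: PatchScores,  # D_B applied to G_AB(x_a)
    scores_cycled: PatchScores,     # D'_A applied to G_BA(G_AB(x_a))
) -> float:
    """Step one judges the A->B output; step two judges the A->B->A round trip."""
    return lsgan_g_loss(scores_converted) + lsgan_g_loss(scores_cycled)


# Toy patch scores (hypothetical values a discriminator might emit).
step_one = [1.0, 1.0, 0.5, 0.5]  # D_B(G_AB(x_a))
step_two = [0.0, 0.0, 1.0, 1.0]  # D'_A(G_BA(G_AB(x_a)))

g_loss = two_step_generator_loss(step_one, step_two)
d_loss = lsgan_d_loss(real_scores=[1.0, 1.0], fake_scores=step_two)
```

Averaging over patch scores rather than a single scalar is what lets a patch-based discriminator penalize local artifacts independently.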

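This project includes:

- Model code: reproduces the model from the paper.
- Speech preprocessing: prepares the training data.
- Training code: trains the model.
- Voice conversion examples: converted samples produced by the trained model.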

Dependencies

pip install -r requirements.txt

Usage

Preprocessing

python preprocess_training.py

Run with custom arguments:

python preprocess_training.py --train_A_dir ./data/S0913/ --train_B_dir ./data/gaoxiaosong/ --cache_folder ./cache/

Training

python train.py

Run with custom arguments:

python train.py --logf0s_normalization ./cache/logf0s_normalization.npz --mcep_normalization ./cache/mcep_normalization.npz --coded_sps_A_norm ./cache/coded_sps_A_norm.pickle --coded_sps_B_norm ./cache/coded_sps_B_norm.pickle --model_checkpoint ./model_checkpoint/ --resume_training_at ./model_checkpoint/_CycleGAN_CheckPoint --validation_A_dir ./data/S0913/ --output_A_dir ./converted_sound/S0913 --validation_B_dir ./data/gaoxiaosong/ --output_B_dir ./converted_sound/gaoxiaosong/
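
Because `train.py` reads the files that preprocessing writes into the cache folder, it can help to check that they exist before starting a long training run. The helper below is a hypothetical convenience sketch, not part of this repository. The four file names are taken from the command above, and the sketch assumes they all live in a single cache directory.

```python
from pathlib import Path
from typing import List

# File names as they appear in the train.py command above.
EXPECTED_CACHE_FILES = (
    "logf0s_normalization.npz",
    "mcep_normalization.npz",
    "coded_sps_A_norm.pickle",
    "coded_sps_B_norm.pickle",
)


def missing_cache_files(cache_dir: str) -> List[str]:
    """Return the expected cache file names that are absent from cache_dir."""
    root = Path(cache_dir)
    return [name for name in EXPECTED_CACHE_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_cache_files("./cache/")
    if missing:
        print("Run preprocess_training.py first; missing:", ", ".join(missing))
    else:
        print("Cache looks complete; ready to run train.py")
```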

Pretrained model

A pretrained model that converts between S0913 and GaoXiaoSong.

Download from Google Drive (735 MB)

Demo

Samples converted with the pretrained model:

Speaker A: S0913 (./data/S0913/BAC009S0913W0351.wav)

Speaker B: GaoXiaoSong (./data/gaoxiaosong/gaoxiaosong_1.wav)

Speaker A's speech converted to speaker B's timbre: S0913 to GaoXiaoSong (./converted_sound/S0913/BAC009S0913W0351.wav)
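
Judging from the sample paths listed here, a converted file keeps the name of its source file and is written to the directory passed as `--output_A_dir`. The sketch below expresses that mapping with `pathlib`. The function name `converted_path` is hypothetical, and the naming rule is inferred from the listed sample rather than confirmed against the training code.

```python
from pathlib import Path


def converted_path(source_wav: str, output_dir: str) -> Path:
    """Map a source wav to the same file name inside output_dir."""
    return Path(output_dir) / Path(source_wav).name


result = converted_path(
    "./data/S0913/BAC009S0913W0351.wav",
    "./converted_sound/S0913",
)
print(result.as_posix())
```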

References

CycleGAN-VC2: Improved CycleGAN-based Non-parallel Voice Conversion. Paper, Project
Parallel-Data-Free Voice Conversion Using Cycle-Consistent Adversarial Networks. Paper, Project
Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. Paper, Project, Code
Image-to-Image Translation with Conditional Adversarial Nets. Paper, Project, Code

Donation

If this project helps you save development time, you can buy me a cup of coffee :)

Alipay (支付宝)

WeChat Pay (微信)

License

MIT © Kun
