This implementation is based on CUT; thanks to Taesung and Junyan for sharing their code.
We provide a PyTorch implementation of non-parallel voice conversion based on patch-wise contrastive learning and adversarial learning. Compared to the CycleGAN-VC baseline, CVC requires only one-way GAN training for non-parallel one-to-one voice conversion, while improving speech quality and reducing training time.
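The patch-wise contrastive objective mentioned above can be sketched as an InfoNCE-style loss: each patch feature from the converted speech is pulled toward the feature of the corresponding source patch (its positive) and pushed away from the other patches (negatives). The following is a minimal NumPy sketch under that description, not the repository's actual implementation; the function name and temperature value are illustrative.

```python
import numpy as np

def patch_nce_loss(queries, keys, tau=0.07):
    """InfoNCE-style patch contrastive loss (illustrative sketch).

    queries, keys: (N, D) arrays of patch features; the i-th key is the
    positive for the i-th query, all other keys act as negatives.
    """
    # L2-normalise so the dot product is cosine similarity
    queries = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    keys = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    logits = queries @ keys.T / tau              # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # positives sit on the diagonal; minimise their negative log-likelihood
    return -np.mean(np.diag(log_prob))
```

Matched query/key pairs give a near-zero loss, while unrelated features give a loss near log(N), which is what drives the converted patches to stay faithful to the source content.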
Prerequisites
Linux or macOS
Python 3
CPU or NVIDIA GPU + CUDA CuDNN
Kick Start
Clone this repo:
git clone https://github.com/Tinglok/CVC
cd CVC
Install PyTorch 1.6 and the other dependencies.
For pip users, run `pip install -r requirements.txt`.
For Conda users, create a new Conda environment with `conda env create -f environment.yaml`.
Download the pre-trained Parallel WaveGAN vocoder to ./checkpoints/vocoder.
The converted utterances will be saved at ./checkpoints/CycleGAN/converted_sound.
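The converted utterances are standard WAV audio, so a quick sanity check needs only the standard library. The helper below is an illustrative sketch (not part of the repo) that lists each file in an output folder together with its duration, assuming `.wav` output:

```python
import glob
import wave

def summarize_wavs(folder):
    """Return (path, duration_in_seconds) for every .wav file in `folder`."""
    out = []
    for path in sorted(glob.glob(f"{folder}/*.wav")):
        with wave.open(path, "rb") as w:
            out.append((path, w.getnframes() / w.getframerate()))
    return out

# e.g. summarize_wavs("./checkpoints/CycleGAN/converted_sound")
```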
Pre-trained CVC Model
Pre-trained models for p270-to-p256 and many-to-p249 conversion are available at this URL.
TensorBoard Visualization
To view loss plots, run `tensorboard --logdir=./checkpoints` and open the URL http://localhost:6006/.
Citation
If you use this code for your research, please cite our paper.
@inproceedings{li2021cvc,
author={Tingle Li and Yichen Liu and Chenxu Hu and Hang Zhao},
title={{CVC: Contrastive Learning for Non-Parallel Voice Conversion}},
year=2021,
booktitle={Proc. Interspeech 2021},
pages={1324--1328}
}
About
CVC: Contrastive Learning for Non-parallel Voice Conversion (INTERSPEECH 2021, in PyTorch)