
SpeechTripleNet

Implementation of the paper "SpeechTripleNet: End-to-End Disentangled Speech Representation Learning for Content, Timbre and Prosody".

Environment setup

conda env create -f environment.yml
conda activate speechtriplenet-env
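
After activating the environment, a quick check along these lines can confirm that PyTorch sees a GPU (this assumes the environment ships PyTorch, which the training command below implies):

import torch
print("torch", torch.__version__, "| CUDA available:", torch.cuda.is_available())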

Quick try of speech editing with a pretrained model

# Download the pretrained model from https://drive.google.com/file/d/1dAdPXtENtACtVokBZyzn32DlWd1Zk3Yy/view?usp=sharing;
# Put it under output-CCDPJ-c_100.0_1.3-s_10.0_60.0-p_10.0_3.0/ckpt/VCTK/
jupyter notebook speech_editing.ipynb
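
If the notebook complains about a missing checkpoint, a small sanity check like the sketch below can verify the placement. It assumes the checkpoint is a standard PyTorch file; the exact filename inside ckpt/VCTK/ is not specified here, so it simply takes the newest file in that directory.

from pathlib import Path
import torch

ckpt_dir = Path("output-CCDPJ-c_100.0_1.3-s_10.0_60.0-p_10.0_3.0/ckpt/VCTK")
candidates = sorted(ckpt_dir.glob("*"))            # whatever was downloaded and placed here
assert candidates, f"no checkpoint found under {ckpt_dir}"

state = torch.load(candidates[-1], map_location="cpu")   # load on CPU just to inspect
print("loaded:", candidates[-1].name, "| type:", type(state).__name__)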

Data feature extraction for training

python preprocess.py --config ./configs/VCTK/preprocess.yaml
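
Before running the script, it can help to inspect the preprocessing settings. The sketch below only loads the YAML config and prints its top-level sections; it assumes plain YAML and makes no claims about the specific key names inside.

import yaml

with open("./configs/VCTK/preprocess.yaml") as f:
    cfg = yaml.safe_load(f)

for section, value in cfg.items():   # e.g. dataset paths, audio/feature parameters
    print(section, "->", value)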

Training

CUDA_VISIBLE_DEVICES=0 python train.py --mdl CCDPJ -p ./configs/VCTK/preprocess.yaml -t ./configs/VCTK/train.yaml -m ./configs/VCTK/model.yaml
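
When scripting several runs, the same command can be launched from Python. The sketch below is a hypothetical launcher, not part of the repository; it only mirrors the flags shown above.

import os
import subprocess

env = dict(os.environ, CUDA_VISIBLE_DEVICES="0")   # pin the GPU, as in the shell command
cmd = [
    "python", "train.py",
    "--mdl", "CCDPJ",
    "-p", "./configs/VCTK/preprocess.yaml",
    "-t", "./configs/VCTK/train.yaml",
    "-m", "./configs/VCTK/model.yaml",
]
subprocess.run(cmd, env=env, check=True)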

Inference

See speech_editing.ipynb.
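
To run the notebook non-interactively (for example on a headless server), jupyter nbconvert can execute it and save the results. The snippet below is one way to drive that from Python; the output filename is arbitrary.

import subprocess

subprocess.run([
    "jupyter", "nbconvert", "--to", "notebook", "--execute",
    "--output", "speech_editing_executed.ipynb", "speech_editing.ipynb",
], check=True)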
