Skip to content

Commit

Permalink
Merge pull request #3405 from coqui-ai/studio_speakers
Browse files Browse the repository at this point in the history
Add studio speakers to open source XTTS!
  • Loading branch information
erogol authored Dec 12, 2023
2 parents 934b87b + 8e6a7cb commit 8c1a8b5
Show file tree
Hide file tree
Showing 18 changed files with 182 additions and 895 deletions.
53 changes: 0 additions & 53 deletions .github/workflows/api_tests.yml

This file was deleted.

52 changes: 0 additions & 52 deletions .github/workflows/zoo_tests_tortoise.yml

This file was deleted.

3 changes: 0 additions & 3 deletions Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -35,9 +35,6 @@ test_zoo: ## run zoo tests.
inference_tests: ## run inference tests.
nose2 -F -v -B --with-coverage --coverage TTS tests.inference_tests

api_tests: ## run api tests.
nose2 -F -v -B --with-coverage --coverage TTS tests.api_tests

data_tests: ## run data tests.
nose2 -F -v -B --with-coverage --coverage TTS tests.data_tests

Expand Down
31 changes: 0 additions & 31 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,8 +7,6 @@
- 📣 [🐶Bark](https://github.com/suno-ai/bark) is now available for inference with unconstrained voice cloning. [Docs](https://tts.readthedocs.io/en/dev/models/bark.html)
- 📣 You can use [~1100 Fairseq models](https://github.com/facebookresearch/fairseq/tree/main/examples/mms) with 🐸TTS.
- 📣 🐸TTS now supports 🐢Tortoise with faster inference. [Docs](https://tts.readthedocs.io/en/dev/models/tortoise.html)
- 📣 **Coqui Studio API** is landed on 🐸TTS. - [Example](https://github.com/coqui-ai/TTS/blob/dev/README.md#-python-api)
- 📣 [**Coqui Studio API**](https://docs.coqui.ai/docs) is live.
- 📣 Voice generation with prompts - **Prompt to Voice** - is live on [**Coqui Studio**](https://app.coqui.ai/auth/signin)!! - [Blog Post](https://coqui.ai/blog/tts/prompt-to-voice)
- 📣 Voice generation with fusion - **Voice fusion** - is live on [**Coqui Studio**](https://app.coqui.ai/auth/signin).
- 📣 Voice cloning is live on [**Coqui Studio**](https://app.coqui.ai/auth/signin).
Expand Down Expand Up @@ -253,29 +251,6 @@ tts.tts_with_vc_to_file(
)
```

#### Example using [🐸Coqui Studio](https://coqui.ai) voices.
You access all of your cloned voices and built-in speakers in [🐸Coqui Studio](https://coqui.ai).
To do this, you'll need an API token, which you can obtain from the [account page](https://coqui.ai/account).
After obtaining the API token, you'll need to configure the COQUI_STUDIO_TOKEN environment variable.

Once you have a valid API token in place, the studio speakers will be displayed as distinct models within the list.
These models will follow the naming convention `coqui_studio/en/<studio_speaker_name>/coqui_studio`

```python
# XTTS model
models = TTS(cs_api_model="XTTS").list_models()
# Init TTS with the target studio speaker
tts = TTS(model_name="coqui_studio/en/Torcull Diarmuid/coqui_studio", progress_bar=False)
# Run TTS
tts.tts_to_file(text="This is a test.", language="en", file_path=OUTPUT_PATH)

# V1 model
models = TTS(cs_api_model="V1").list_models()
# Run TTS with emotion and speed control
# Emotion control only works with V1 model
tts.tts_to_file(text="This is a test.", file_path=OUTPUT_PATH, emotion="Happy", speed=1.5)
```

#### Example text to speech using **Fairseq models in ~1100 languages** 🤯.
For Fairseq models, use the following name format: `tts_models/<lang-iso_code>/fairseq/vits`.
You can find the language ISO codes [here](https://dl.fbaipublicfiles.com/mms/tts/all-tts-languages.html)
Expand Down Expand Up @@ -351,12 +326,6 @@ If you don't specify any models, then it uses LJSpeech based English model.
$ tts --text "Text for TTS" --pipe_out --out_path output/path/speech.wav | aplay
```

- Run TTS and define speed factor to use for 🐸Coqui Studio models, between 0.0 and 2.0:

```
$ tts --text "Text for TTS" --model_name "coqui_studio/<language>/<dataset>/<model_name>" --speed 1.2 --out_path output/path/speech.wav
```

- Run a TTS model with its default vocoder model:

```
Expand Down
9 changes: 5 additions & 4 deletions TTS/.models.json
Original file line number Diff line number Diff line change
Expand Up @@ -3,12 +3,13 @@
"multilingual": {
"multi-dataset": {
"xtts_v2": {
"description": "XTTS-v2.0.2 by Coqui with 16 languages.",
"description": "XTTS-v2.0.3 by Coqui with 17 languages.",
"hf_url": [
"https://coqui.gateway.scarf.sh/hf-coqui/XTTS-v2/main/model.pth",
"https://coqui.gateway.scarf.sh/hf-coqui/XTTS-v2/main/config.json",
"https://coqui.gateway.scarf.sh/hf-coqui/XTTS-v2/main/vocab.json",
"https://coqui.gateway.scarf.sh/hf-coqui/XTTS-v2/main/hash.md5"
"https://coqui.gateway.scarf.sh/hf-coqui/XTTS-v2/main/hash.md5",
"https://coqui.gateway.scarf.sh/hf-coqui/XTTS-v2/main/speakers_xtts.pth"
],
"model_hash": "10f92b55c512af7a8d39d650547a15a7",
"default_vocoder": null,
Expand Down Expand Up @@ -45,7 +46,7 @@
"hf_url": [
"https://coqui.gateway.scarf.sh/hf/bark/coarse_2.pt",
"https://coqui.gateway.scarf.sh/hf/bark/fine_2.pt",
"https://app.coqui.ai/tts_model/text_2.pt",
"https://coqui.gateway.scarf.sh/hf/text_2.pt",
"https://coqui.gateway.scarf.sh/hf/bark/config.json",
"https://coqui.gateway.scarf.sh/hf/bark/hubert.pt",
"https://coqui.gateway.scarf.sh/hf/bark/tokenizer.pth"
Expand Down Expand Up @@ -270,7 +271,7 @@
"tortoise-v2": {
"description": "Tortoise tts model https://github.com/neonbjb/tortoise-tts",
"github_rls_url": [
"https://app.coqui.ai/tts_model/autoregressive.pth",
"https://coqui.gateway.scarf.sh/v0.14.1_models/autoregressive.pth",
"https://coqui.gateway.scarf.sh/v0.14.1_models/clvp2.pth",
"https://coqui.gateway.scarf.sh/v0.14.1_models/cvvp.pth",
"https://coqui.gateway.scarf.sh/v0.14.1_models/diffusion_decoder.pth",
Expand Down
Loading

0 comments on commit 8c1a8b5

Please sign in to comment.