I'm trying to fine-tune a model to produce a British accent. I could only gather about 3 minutes and 20 seconds of audio for the dataset. Here's the input file.
https://vocaroo.com/1gY2BK8MTiTF
Here's the reference file.
https://vocaroo.com/1bTycJauDp9i
Output file from the pretrained model
https://vocaroo.com/1cdag3gy5958
Output file from the fine-tuned model
https://vocaroo.com/1fvqjU7YBGrI
Edit: Never mind, I got the fine-tuned model to work by following the instructions in this comment:
#131 (comment)
In the config_dit_mel_seed_uvit_whisper_small_wavenet.yml config file, I changed the in_channels value from 768 to 1280.
Then I replaced the whisper-small entry with whisper-large-v3-turbo. I suggest using whisper-large-v3 if you're having trouble.
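For anyone repeating this, here is a rough Python sketch of the same two edits applied to a config that has already been loaded into a dict. The reason in_channels has to change is that 768 is the encoder hidden size of whisper-small, while whisper-large-v3 and whisper-large-v3-turbo both use 1280. The nested layout in `example` (the `model_params`, `speech_tokenizer`, and `length_regulator` keys, and the `channels` value) and the `swap_whisper` helper are assumptions made up for illustration, not the project's confirmed schema or API, so check them against your own copy of the file.

```python
# Encoder hidden sizes of the Whisper checkpoints mentioned in this thread.
WHISPER_HIDDEN_SIZE = {
    "whisper-small": 768,
    "whisper-large-v3": 1280,
    "whisper-large-v3-turbo": 1280,
}


def swap_whisper(config, old="whisper-small", new="whisper-large-v3-turbo"):
    """Return a patched copy of `config` (hypothetical helper, not part of the repo).

    Every string ending in `old` is rewritten to end in `new`, and every
    `in_channels` entry equal to the old hidden size is set to the new one.
    """
    old_dim = WHISPER_HIDDEN_SIZE[old]
    new_dim = WHISPER_HIDDEN_SIZE[new]

    def walk(node):
        if isinstance(node, dict):
            return {
                key: new_dim if key == "in_channels" and value == old_dim else walk(value)
                for key, value in node.items()
            }
        if isinstance(node, list):
            return [walk(item) for item in node]
        if isinstance(node, str) and node.endswith(old):
            return node[: -len(old)] + new
        return node

    return walk(config)


# Assumed layout, for illustration only; the real YAML may be organised differently.
example = {
    "model_params": {
        "speech_tokenizer": {"type": "whisper", "name": "openai/whisper-small"},
        "length_regulator": {"in_channels": 768, "channels": 512},
    }
}

patched = swap_whisper(example)
print(patched["model_params"]["speech_tokenizer"]["name"])
print(patched["model_params"]["length_regulator"]["in_channels"])
```

The helper builds a new dict instead of editing in place, so the original config stays available for comparison. To try whisper-large-v3 instead, pass `new="whisper-large-v3"`; in_channels still ends up at 1280.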
Here is an output file from the improved fine-tuned model
https://vocaroo.com/17qPajlqbHnK