WOW Great extension! The best TTS extension out there! Here are some code fixes for auto play and installation! #3
Comments
Thanks! I had to make a change to get it to work right on Linux for me. I changed: "controls autoplay style="height: 30px;">", "controls style="height: 30px;">") to: 'controls autoplay style="height: 30px;">', 'controls style="height: 30px;">' I used ChatGPT to help me make the fix. It works for me, but I don't know how correct this change is. |
What that bit of code is doing is replacing the strings inside the log file and removing the "autoplay" attribute. Your code embeds the source location of the .wav files slightly differently from the original barkTTS code, as you can see if you look at your format_html function: def format_html(audiofiles):
So we are making sure we change this from "controls autoplay style="height: 30px;">" to "controls style="height: 30px;">" in the history of the conversation with the AI, so that it doesn't keep autoplaying. |
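To make the replacement described above concrete, here is a minimal sketch run against a made-up history. The two attribute strings are the ones quoted in this thread; the sample history dict, the audio path, and the helper name strip_autoplay are assumptions for illustration only, not the extension's actual code.

```python
# Attribute strings quoted in this thread. Single quotes on the outside let the
# double quotes of the HTML attribute appear inside the Python string unescaped.
AUTOPLAY = 'controls autoplay style="height: 30px;">'
NO_AUTOPLAY = 'controls style="height: 30px;">'


def strip_autoplay(history):
    """Remove the autoplay attribute from the most recent visible reply."""
    if len(history["internal"]) > 0:
        user_msg, bot_msg = history["visible"][-1]
        history["visible"][-1] = [user_msg, bot_msg.replace(AUTOPLAY, NO_AUTOPLAY)]
    return history


# Hypothetical history with one exchange; the audio path is invented.
history = {
    "internal": [["hi", "hello"]],
    "visible": [
        ["hi", '<audio src="file/example.wav" controls autoplay style="height: 30px;"></audio>']
    ],
}
cleaned = strip_autoplay(history)
print(cleaned["visible"][-1][1])
```

Because only the last visible entry is rewritten, the clip still autoplays once when it is generated, but it is stored in the history without the attribute.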
I edited the .py file in my fork for you to reference if you need it: https://github.com/RandomInternetPreson/text_generation_webui_xtt_Alts/blob/main/script.py Wow, this works so incredibly well! |
Sorry to keep peppering you here in this issue, but just wanted to let you know that I'd be okay if you wanted to reference my fork here: https://github.com/RandomInternetPreson/text_generation_webui_xtt_Alts |
I'll close my other issue on here, but I can confirm that on a 100% fresh install of Text-Gen-WebUI on Windows, I did the following: Run a command prompt. cd back up to the text-generation-webui folder. Agree to the license and let it download the other files it needs. With all that done, it's running fine! :) No audio repeats etc. One thing I do notice: it keeps the generated audio in \text-generation-webui\extensions\text_generation_webui_xtt_Alts\generated so that may need cleaning up from time to time. I'm sure the changes will get merged back into the original on here at some point! Thanks for everyone's help and work on this! |
A quick note on speed vs quality etc., as it's not mentioned anywhere else. I notice the sample audio voice file used to generate audio is about 7 seconds long, mono (not stereo), PCM S16 LE, with a sample rate of 22050Hz and 16 bits per sample. I'm guessing there are a few factors that may speed up processing.
I tried a very simple test using a 22050Hz sample voice and a 44100Hz sample voice (9 second mono sample). 22050Hz > Processing time: 59.185802936553955 This was generating the same amount of speech. It's not highly scientific or run over 1000s of tests. But it would appear that if you want to use your favourite celebrity voice, you should get a high quality sample, make it mono, drop its sample rate to 22050Hz and keep it around the 4-9 second mark. (I suspect a shorter voice sample will probably be faster.) |
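If you want to check a voice sample against the properties listed above (mono, 16 bits per sample, sample rate, length), a minimal sketch using only Python's standard wave module could look like this. The function name describe_wav is made up for illustration, and the sketch only reads plain PCM WAV files.

```python
import wave


def describe_wav(path):
    """Return channel count, bits per sample, sample rate, and duration of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),             # 1 means mono
            "bits_per_sample": wav.getsampwidth() * 8,  # sample width is reported in bytes
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }
```

Calling describe_wav on your own sample file returns a small dict you can compare with the figures quoted in this comment.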
I followed the steps, but it still gives me an ERROR:Failed to load the extension "text_generation_webui_xtt_Alts" when restarting the webui after activating it in the session tab. |
If you are using Windows, follow these instructions; I've made a video to go with them. These instructions will show you how to install TTS. |
Thanks for your help!
I added an option to delete old files on startup in the config.json |
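For anyone who wants to clear the generated folder by hand instead, here is a rough sketch of what such a cleanup could look like. It is not the extension's actual implementation and does not read config.json; the function name delete_old_wavs and the max_age_seconds parameter are invented for this example.

```python
import time
from pathlib import Path


def delete_old_wavs(folder, max_age_seconds=0):
    """Delete .wav files in `folder` at least `max_age_seconds` old; return how many were removed."""
    folder = Path(folder)
    if not folder.is_dir():
        return 0
    cutoff = time.time() - max_age_seconds
    removed = 0
    for wav in folder.glob("*.wav"):
        # Only touch .wav files whose last modification is at or before the cutoff.
        if wav.stat().st_mtime <= cutoff:
            wav.unlink()
            removed += 1
    return removed
```

Pointing it at the generated folder mentioned earlier in this thread with, say, max_age_seconds=86400 would remove clips older than a day and leave everything else alone.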
The model outputs 24 kHz mono files, so I presume that is the ideal format for samples as well. I could potentially write code to automatically resample the input files. |
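As a rough idea of what automatic resampling could look like, here is a standard-library-only sketch that converts a 16-bit mono WAV to 24 kHz using plain linear interpolation. This is an assumption about one possible approach, not code from the extension: the function names are made up, and linear interpolation applies no anti-aliasing filter, so a dedicated audio library would give better quality when downsampling.

```python
import sys
import wave
from array import array


def resample_mono16(samples, src_rate, dst_rate):
    """Linearly interpolate a sequence of 16-bit samples from src_rate to dst_rate."""
    if src_rate == dst_rate or len(samples) < 2:
        return array("h", samples)
    out_len = int(len(samples) * dst_rate / src_rate)
    out = array("h")
    for i in range(out_len):
        pos = i * src_rate / dst_rate  # fractional position in the source signal
        left = int(pos)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        out.append(int(round(samples[left] * (1 - frac) + samples[right] * frac)))
    return out


def resample_wav(src_path, dst_path, dst_rate=24000):
    """Read a 16-bit mono WAV and write a copy resampled to dst_rate."""
    with wave.open(src_path, "rb") as src:
        if src.getnchannels() != 1 or src.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit mono WAV files")
        src_rate = src.getframerate()
        samples = array("h", src.readframes(src.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    out = resample_mono16(samples, src_rate, dst_rate)
    if sys.byteorder == "big":
        out.byteswap()
    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(dst_rate)
        dst.writeframes(out.tobytes())
```

Stereo or non-16-bit input is rejected rather than converted, to keep the sketch short.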
Yeass! You got the repo fixed up, thank you again for making this. It is one of the last missing pieces for AI interactions; the speed and quality are above everything else. |
Alright I got it to work! The problem was I installed TTS in textgen and not in the base environment |
As long as you have textgen activated when running the webui that shouldn't be an issue |
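A quick way to see which environment you are actually in is to ask the running interpreter. The sketch below assumes the package is imported under the name TTS; the helper name is_installed is made up. Run it with the same Python that launches the webui.

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if `module_name` can be found by the interpreter running this script."""
    return importlib.util.find_spec(module_name) is not None


# sys.executable shows which environment's Python is in use.
print("Python interpreter:", sys.executable)
print("TTS importable:", is_installed("TTS"))
```

If the interpreter path does not point inside the textgen environment, or the second line prints False, the package was installed somewhere other than the environment the webui runs in.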
Firstly, thank you for taking the time to do this!!! OMG it's fast, does perfect inflections, this is ElevenLabs quality on my local machine. AMAZING!!!!!
Here is some information to make the extension work a bit better. I'm on a Windows machine, so my experience might be unique to that.
def history_modifier(history):
    if len(history["internal"]) > 0:
        history["visible"][-1] = [
            history["visible"][-1][0],
            history["visible"][-1][1].replace(
                "controls autoplay>", "controls>")
        ]
    return history
to this:
def history_modifier(history):
    if len(history["internal"]) > 0:
        history["visible"][-1] = [
            history["visible"][-1][0],
            # Single-quoted strings so the double quotes inside the HTML attribute are valid Python
            history["visible"][-1][1].replace(
                'controls autoplay style="height: 30px;">', 'controls style="height: 30px;">')
        ]
    return history
text-generation-webui-xtts
to:
text_generation_webui_xtts
Seriously amazing stuff, thank you again for integrating this into oobabooga. I will do a PR just to have a copy to mess around with, but I'll direct people to this repo.