Since coqui-ai TTS isn't in the AUR I have to install it manually.

When I install it directly with

pip install TTS

it installs but at the end of the installation I get the error

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
typer 0.3.2 requires click<7.2.0,>=7.1.1, but you have click 8.1.3 which is incompatible.
google-api-core 2.10.0 requires protobuf<5.0.0dev,>=3.20.1, but you have protobuf 3.19.6 which is incompatible.

To avoid this error I think I should install it in a virtual environment, but I want to be able to use it like

ttst text wav

using a function like

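ttst() {
  # Split the file into sentences and pass the whole result to tts as a single --text argument
  cat "$1" | sentences | xargs -0 tts --model_name "tts_models/en/ljspeech/tacotron2-DDC" --out_path "${2:-out.wav}" --text
}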

and I don't know how to do that if I install it in a virtual environment.

If there is a simpler way just forget what I said. What I want to know is the answer to the title.

sentences refers to the sentences-bin package, which is needed to split the text into sentences, because coqui TTS only works on individual sentences.


1 Answer

Best answer

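I can use pipx for this. It installs TTS into its own virtual environment but still puts the tts command on PATH, so a shell function like ttst can call it.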

python3 -m pip install --user pipx
python3 -m pipx ensurepath
pipx install TTS

Very long and very short sentences both give errors. For long sentences, max_decoder_steps: 20000 can be added to /home/user/.local/share/tts/tts_models--en--ljspeech--tacotron2-DDC/config.json. Short sentences need to be removed before calling the model. Alternatively, use a script that converts each sentence on its own and then concatenates all the audio outputs, like sapo or this one.
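
The per-sentence approach could look roughly like the sketch below. It is only an illustration under some assumptions: ttst_each is a made-up name, sentences is assumed to print one sentence per line, sox is assumed to be installed for joining the clips, and the MIN_CHARS threshold of 10 characters is an arbitrary guess rather than a documented limit of the model.

```shell
# Hypothetical helper: synthesize each sentence separately, then join the clips.
# Assumes `sentences` prints one sentence per line and `sox` is installed.
ttst_each() (
  input=$1
  output=${2:-out.wav}
  min_chars=${MIN_CHARS:-10}   # arbitrary cutoff for "too short", not a TTS constant
  model="tts_models/en/ljspeech/tacotron2-DDC"

  tmpdir=$(mktemp -d) || exit 1
  sentences < "$input" > "$tmpdir/sentences.txt" || { rm -rf "$tmpdir"; exit 1; }

  i=0
  while IFS= read -r sentence || [ -n "$sentence" ]; do
    # Skip sentences that are too short for the model.
    [ "${#sentence}" -lt "$min_chars" ] && continue
    i=$((i + 1))
    # Zero-padded names keep the clips in order when the glob is expanded.
    part=$(printf '%s/%05d.wav' "$tmpdir" "$i")
    # Redirect stdin so tts cannot consume the sentence list.
    tts --text "$sentence" --model_name "$model" --out_path "$part" < /dev/null
  done < "$tmpdir/sentences.txt"

  if [ "$i" -eq 0 ]; then
    echo "no sentence long enough to synthesize" >&2
    rm -rf "$tmpdir"
    exit 1
  fi

  # sox concatenates its input files, in order, into the last file named.
  sox "$tmpdir"/*.wav "$output"
  status=$?
  rm -rf "$tmpdir"
  exit "$status"
)
```

It would be called the same way as the original function, for example ttst_each text out.wav. The body runs in a subshell, so its variables do not leak into the interactive shell.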

Contributions licensed under CC0