Replies: 12 comments 15 replies
-
Thanks, it's on my radar. Is it finished? I remember when I first looked at it, it was still not in a usable state. I might be able to work on it next month.
-
I came here to ask this.
-
It works, but it has high memory requirements for training. It also didn't split the input by token limit, so some length limiting was needed.
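On the splitting point above: a minimal sketch of the kind of pre-chunking that was needed, assuming a simple character budget per chunk. The `split_by_limit` helper and the 400-character limit are illustrative placeholders, not anything from the StyleTTS2 code.

```python
# Hypothetical pre-chunking helper: StyleTTS2 (as reported above) doesn't split
# long inputs by token limit itself, so something like this had to run first.
# The 400-character budget is an illustrative guess, not a StyleTTS2 value.
import re

def split_by_limit(text: str, max_chars: int = 400) -> list[str]:
    """Greedily pack whole sentences into chunks no longer than max_chars."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + len(sentence) + 1 > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks

# Each chunk would then be synthesized separately and the audio concatenated.
```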
-
Thanks for the insight! Quick question that I haven't fully understood about StyleTTS: how often would you need to be training it?
-
You have to tune it for each voice.
-
That's a solid amount of VRAM. I'm guessing even for inference you'd need a similar amount? In that case, it sounds like the way to go for most people would be CPU inference. And it does sound like a miss that even with optimizations it's the same size.
On Sat, Dec 2, 2023, 10:02 PM 78Alpha wrote:
> As it is, there's a concept model but no real base model. It would be training a base from scratch and then fine-tuning after that. I only did the fine-tuning, but it still used over 24 GB of VRAM at batch size 2 (it was strongly recommended to never use 1).
> It starts at 19 GB but then grows when the additional features kick in. They added accelerate, but even with mixed precision, it was still 19 GB. Maybe just a miss somewhere?
> Inference is fast, though; the longest part was loading the models.
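For context on the mixed-precision point: enabling fp16 via Hugging Face Accelerate usually looks roughly like the sketch below. This is a generic illustration, not the actual StyleTTS2 fine-tuning script; the model, optimizer, and loop here are placeholders.

```python
# Generic illustration of mixed precision with Hugging Face Accelerate; not the
# StyleTTS2 training code, just the usual pattern being discussed above.
import torch
from accelerate import Accelerator

accelerator = Accelerator(mixed_precision="fp16")  # or "bf16" on supported GPUs

model = torch.nn.Linear(512, 512)                  # placeholder for the TTS model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# prepare() wraps the model/optimizer for the chosen precision and device
model, optimizer = accelerator.prepare(model, optimizer)

for _ in range(10):                                # placeholder training loop
    x = torch.randn(2, 512, device=accelerator.device)  # batch size 2, as in the report
    loss = model(x).pow(2).mean()
    optimizer.zero_grad()
    accelerator.backward(loss)                     # handles gradient scaling for fp16
    optimizer.step()
```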
-
Was just going through the repo. It has unusually high requirements; there are some VRAM and usage tips given in this ticket.
-
Interesting. It sounds like it could be workable. In this case, we'd need at least a notebook for training, and maybe some more integrations to make it easy to rent a GPU for the training. My personal workstation has only 8 GB, so I'm looking at renting myself. Meanwhile, maybe StyleTTS can optimize memory.
On Wed, Dec 6, 2023, 10:49 AM 78Alpha wrote:
> Inference only needs about 3 GB of VRAM. The training, and the need for training, is what kills the dream.
-
So here's what's putting a big freeze on it: software licenses. It wasn't obvious before, but now it is: StyleTTS2 "as demonstrated" relies on phonemizer, which is GPL. Although there are discussions and ways to sidestep that, it hasn't been resolved.
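For reference, the GPL dependency in question is the phonemizer package, which typically drives the eSpeak NG backend. A minimal sketch of the kind of call involved, which may not match exactly how StyleTTS2 invokes it:

```python
# Sketch of the phonemizer usage that creates the GPL dependency; the exact
# arguments StyleTTS2 uses may differ from this illustration.
from phonemizer import phonemize  # GPL-licensed package

text = "How much VRAM does training need?"
phonemes = phonemize(
    text,
    language="en-us",
    backend="espeak",          # wraps the eSpeak NG library
    strip=True,
    preserve_punctuation=True,
    with_stress=True,
)
print(phonemes)
```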
-
I also came across this new OpenVoice going around. Is the license feasible for integration into the WebUI?
-
@ehartford myshell-ai/OpenVoice#114 (comment) Edit: ah, GitHub did the bug where quoting something doesn't make it a reply.
-
StyleTTS2 has been upgraded to a Gradio demo.
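For anyone curious what that entails, a minimal sketch of a Gradio TTS demo; the `synthesize` function here is a placeholder that returns silence, not the real StyleTTS2 inference code.

```python
# Minimal Gradio TTS demo sketch; `synthesize` is a placeholder standing in for
# the actual StyleTTS2 inference call, which is not shown here.
import numpy as np
import gradio as gr

SAMPLE_RATE = 24000  # assumed output rate; the real model's may differ

def synthesize(text: str):
    # Placeholder: return one second of silence instead of real speech.
    return SAMPLE_RATE, np.zeros(SAMPLE_RATE, dtype=np.float32)

demo = gr.Interface(
    fn=synthesize,
    inputs=gr.Textbox(label="Text to speak"),
    outputs=gr.Audio(label="Synthesized speech"),
    title="StyleTTS2 demo (sketch)",
)

if __name__ == "__main__":
    demo.launch()
```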
-
I came across this today thanks to @aedocw of epub2tts:
https://github.com/yl4579/StyleTTS2
This looks promising. I wonder if we can integrate it into the WebUI in the future.