Is there (a plan for) a way to use the OpenAI servers for STT/TTS? They are fairly slow, unfortunately, but they might be a good option for some people.
It's not impossible, but it hasn't been a focus because, as you say, it's quite slow, to the point of going against our mission of an Alexa-competitive voice interface.
Willow has a fairly unique method of streaming audio to WIS. I'm not completely familiar with the OpenAI speech API, but you'd almost certainly need a proxy of some sort at minimum, and if you were doing advanced things like audio compression (AMR) you'd need to do more.
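To make the proxy idea concrete, here is a rough sketch of the piece that would sit between Willow and OpenAI: it buffers the streamed audio, wraps it as a WAV file, and posts it to OpenAI's `/v1/audio/transcriptions` endpoint. It assumes the device sends raw 16 kHz mono 16-bit PCM (not AMR), which may not match the actual Willow stream, and the function names `pcm_to_wav`, `build_multipart`, and `transcribe` are hypothetical. It is not how WIS works, only an illustration of the conversion a proxy would have to do.

```python
import io
import json
import urllib.request
import uuid
import wave

OPENAI_STT_URL = "https://api.openai.com/v1/audio/transcriptions"


def pcm_to_wav(pcm: bytes, sample_rate: int = 16000) -> bytes:
    """Wrap raw PCM in a WAV container (assumed: mono, 16-bit samples)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)   # assumption: mono
        w.setsampwidth(2)   # assumption: 16-bit samples
        w.setframerate(sample_rate)
        w.writeframes(pcm)
    return buf.getvalue()


def build_multipart(wav: bytes, model: str = "whisper-1"):
    """Build the multipart/form-data body with the 'model' and 'file' fields."""
    boundary = uuid.uuid4().hex
    parts = [
        (f"--{boundary}\r\n"
         'Content-Disposition: form-data; name="model"\r\n\r\n'
         f"{model}\r\n").encode(),
        (f"--{boundary}\r\n"
         'Content-Disposition: form-data; name="file"; filename="audio.wav"\r\n'
         "Content-Type: audio/wav\r\n\r\n").encode(),
        wav,
        f"\r\n--{boundary}--\r\n".encode(),
    ]
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(chunks, api_key: str) -> str:
    """Join the buffered audio chunks and send them as one request."""
    wav = pcm_to_wav(b"".join(chunks))
    body, content_type = build_multipart(wav)
    req = urllib.request.Request(
        OPENAI_STT_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": content_type,
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["text"]
```

Note that this has to wait for the whole utterance before uploading, which is part of why the round trip ends up slow compared with streaming.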
I'd like to revisit this now that GPT-4o is out. The multimodal functionality of sending audio directly to the model and getting audio back might be interesting. Are there any plans for WIS to send the audio to the REST endpoint directly and receive audio back?