OpenAI APIs for TTS/STT? #348

Open
skorokithakis opened this issue Dec 20, 2023 · 3 comments

Comments

@skorokithakis

Is there (a plan for) a way to use the OpenAI servers for STT/TTS? They are fairly slow, unfortunately, but they might be a good option for some people.

@kristiankielhofner
Contributor

It's not exactly impossible, but it hasn't been a focus because, as you say, it's quite slow, to the point of going against our mission of an Alexa-competitive voice interface.

Willow has a fairly unique streaming method to WIS. I'm not completely familiar with the OpenAI speech API but at best you'd almost certainly need a proxy of some sort, and if you were doing advanced things like audio compression (AMR) you'd need to do more.
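
To illustrate the kind of proxy this would involve, here is a minimal sketch of the upload half only. It assumes the proxy has already collected the streamed audio as raw 16-bit mono PCM at 16 kHz (an assumption, not a description of Willow's actual wire format), wraps it in a WAV container, and builds a multipart request for OpenAI's `/v1/audio/transcriptions` endpoint. The helper names `pcm_to_wav` and `build_stt_request` are hypothetical, and AMR decoding or true streaming is not covered.

```python
import io
import uuid
import wave
import urllib.request

# OpenAI's speech-to-text endpoint; everything else here is illustrative.
OPENAI_STT_URL = "https://api.openai.com/v1/audio/transcriptions"


def pcm_to_wav(pcm: bytes, sample_rate: int = 16000, channels: int = 1) -> bytes:
    """Wrap raw 16-bit PCM in a WAV container (the input format is an assumption)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()


def build_stt_request(
    wav_bytes: bytes, api_key: str, model: str = "whisper-1"
) -> urllib.request.Request:
    """Build, but do not send, a multipart/form-data upload for transcription."""
    boundary = uuid.uuid4().hex
    body = b"".join(
        [
            # Form field: which model to use.
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="model"\r\n\r\n'
            f"{model}\r\n".encode(),
            # Form field: the audio file itself.
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="file"; filename="audio.wav"\r\n'
            f"Content-Type: audio/wav\r\n\r\n".encode(),
            wav_bytes,
            f"\r\n--{boundary}--\r\n".encode(),
        ]
    )
    return urllib.request.Request(
        OPENAI_STT_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Sending the request with `urllib.request.urlopen(build_stt_request(wav, key))` would return a JSON body containing the transcript. Because the whole utterance has to be buffered and uploaded first, a proxy like this adds latency on top of the API's own, which is the concern raised above.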

@skorokithakis
Author

Makes sense, thank you.

@skorokithakis
Author

I'd like to revisit this now that GPT-4o is out: the multimodal functionality of sending audio directly to the model and getting audio back might be interesting. Are there any plans for WIS to send the audio to the REST endpoint directly and receive audio back?
