Workers AI Whisper documentation needs clarification on acceptable audio formats #17916

Open
dharmab opened this issue Nov 1, 2024 · 0 comments
Labels: content:edit (Request for content edits), documentation (Documentation edits), product:workers-ai (Workers AI: https://developers.cloudflare.com/workers-ai/)

dharmab commented Nov 1, 2024

Existing documentation URL(s)

https://developers.cloudflare.com/workers-ai/models/whisper/#API%20Schemas

What changes are you suggesting?

  • One of the examples sends an MP3 file as binary data, but the documentation does not state which codecs, bitrates, or file sizes are supported. The example file is also not included, so the user cannot reproduce the example.
  • The API schema describes "An array of integers that represent the audio data constrained to 8-bit unsigned integer values". This sounds like 8-bit linear PCM, but that is not stated. It is also unclear whether the audio must be mono or whether stereo is supported, or whether the array is instead meant to hold the bytes of an encoded file, as in the MP3 example. No example is given for this schema, so the user cannot construct a known-good request.
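
To illustrate the ambiguity in the second point, here is a minimal sketch of the two readings of "an array of 8-bit unsigned integers". The function names are hypothetical and nothing here calls the Workers AI API. Which reading the Whisper endpoint actually expects is exactly what the documentation should state.

```python
def file_bytes_to_int_array(data: bytes) -> list[int]:
    """Reading A (assumption): the array holds the raw bytes of an
    encoded file such as MP3 or WAV, one integer (0-255) per byte."""
    return list(data)


def float_samples_to_u8_pcm(samples: list[float]) -> list[int]:
    """Reading B (assumption): the array holds mono unsigned 8-bit
    linear PCM samples, where 128 represents silence."""
    out = []
    for s in samples:
        s = max(-1.0, min(1.0, s))       # clip to the valid range
        out.append(128 + round(s * 127))  # map [-1, 1] onto [1, 255]
    return out


# Both readings yield integers in 0..255, so the schema text alone
# cannot tell them apart.
encoded = file_bytes_to_int_array(b"ID3\x04")        # first bytes of an ID3 tag
pcm = float_samples_to_u8_pcm([-1.0, 0.0, 1.0, 2.0])
```

Both results satisfy the "8-bit unsigned integer" constraint, which is why a documented, reproducible example for this schema would help.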

Additional information

No response

@dharmab dharmab added content:edit Request for content edits documentation Documentation edits labels Nov 1, 2024
@github-actions github-actions bot added the product:workers-ai Workers AI: https://developers.cloudflare.com/workers-ai/ label Nov 1, 2024