Release v0.7.0 · argmaxinc/WhisperKit

This is a very exciting release because we're seeing yet another massive speedup in offline throughput thanks to VAD-based chunking 🚀

Highlights

Energy VAD based chunking 🗣️ @jkrukowski
- There is a new decoding option called chunkingStrategy which can significantly speed up your single file transcriptions with minimal WER downsides.
- It works by finding a clip point in the middle of the longest silence (lowest audio energy) in the last 15s of a 30s window, then using those clip points to split up all the audio ahead of time so the chunks can be asynchronously decoded in parallel.
- Here's a video of it in action, comparing the .none chunking strategy with .vad:

vad.chunking.mp4
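
To make the clip-point idea above concrete, here is a minimal, self-contained sketch in plain Swift. It is not WhisperKit's actual implementation: the function names findClipPoint and splitIntoChunks, the 0.1s frame size, and the RMS energy threshold are all assumptions chosen for illustration. It looks for the longest run of low-energy frames in the second half of each window and cuts in the middle of that run.

```swift
/// Hypothetical helper (not part of WhisperKit): returns the sample index in the
/// middle of the longest low-energy run found in the second half of the window.
func findClipPoint(
    samples: [Float],
    sampleRate: Int = 16_000,
    windowSeconds: Int = 30,
    frameSeconds: Double = 0.1,      // assumed frame size for energy measurement
    energyThreshold: Float = 0.02    // assumed RMS level below which a frame is "silent"
) -> Int {
    let windowLength = min(samples.count, windowSeconds * sampleRate)
    let searchStart = windowLength / 2   // only search the last half of the window
    let frameLength = max(1, Int(Double(sampleRate) * frameSeconds))

    var bestStart = 0
    var bestLength = 0
    var runStart = 0
    var runLength = 0
    var frameStart = searchStart

    while frameStart + frameLength <= windowLength {
        let frame = samples[frameStart..<(frameStart + frameLength)]
        let sumOfSquares: Float = frame.reduce(0) { $0 + $1 * $1 }
        let rms = (sumOfSquares / Float(frameLength)).squareRoot()

        if rms < energyThreshold {
            // Extend the current silent run, or start a new one.
            if runLength == 0 { runStart = frameStart }
            runLength += frameLength
            if runLength > bestLength {
                bestStart = runStart
                bestLength = runLength
            }
        } else {
            runLength = 0
        }
        frameStart += frameLength
    }

    // No silence found: fall back to cutting at the end of the window.
    guard bestLength > 0 else { return windowLength }
    return bestStart + bestLength / 2
}

/// Hypothetical helper: splits the whole signal into sample ranges ahead of time,
/// so each range could then be handed to a decoder independently.
func splitIntoChunks(
    samples: [Float],
    sampleRate: Int = 16_000,
    windowSeconds: Int = 30
) -> [Range<Int>] {
    let windowLength = windowSeconds * sampleRate
    var chunks: [Range<Int>] = []
    var start = 0

    while samples.count - start > windowLength {
        let window = Array(samples[start..<(start + windowLength)])
        let clip = findClipPoint(samples: window, sampleRate: sampleRate, windowSeconds: windowSeconds)
        chunks.append(start..<(start + clip))
        start += clip
    }
    // Whatever remains fits inside a single window.
    if start < samples.count { chunks.append(start..<samples.count) }
    return chunks
}
```

Because every range is known before decoding starts, the chunks are independent of each other, which is what allows them to be decoded in parallel. In WhisperKit itself none of this needs to be written by hand; per the notes above, it is enabled through the chunkingStrategy decoding option.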

Detect language helper:
- You can now call detectLanguage with just an audio path as input from the main whisperKit object. It returns a simple language code and probability as a tuple, and has minimal logging/timing.
- Example:

let whisperKit = try await WhisperKit()
let (language, probs) = try await whisperKit.detectLanguage(audioPath: "your/audio/path/spanish.wav")
print(language) // "es"

WhisperKit via Expo @seb-sep
- For anyone who's been wanting to use WhisperKit in React Native, @seb-sep is maintaining a repo that makes it easy, and has also set up an automation that updates it with each new WhisperKit release. Check it out here: https://github.com/seb-sep/whisper-kit-expo

Bug fixes and enhancements:
- @jiangdi0924 and @fengcunhan contributed some nice fixes in this release with #136 and #138 (see below)
- Also moved the decoding progress callback to be fully async so that it doesn't block the decoder thread

What's Changed

Fix language detection by @jkrukowski in #133
Fix the reset operation exception in transcribeFile in the Demo. by @jiangdi0924 in #136
gh action for making pr to whisper-kit-expo on whisperkit release by @seb-sep in #137
add reStartRecordingLive function by @fengcunhan in #138
Added @_disfavoredOverload for deprecated methods by @jkrukowski in #143
VAD audio chunking by @jkrukowski in #135
Async Progress Callback by @ZachNagengast in #145
Detect language helper by @ZachNagengast in #146

New Contributors

@jiangdi0924 made their first contribution in #136
@seb-sep made their first contribution in #137
@fengcunhan made their first contribution in #138

Full Changelog: v0.6.1...v0.7.0
