Bug: Faster-Whisper fails on no-speech audio due to language detection error #1208

Closed
le-luchina opened this issue Dec 18, 2024 · 1 comment


@le-luchina

Description
The transcribe method fails with a ValueError (max() arg is an empty sequence) when processing audio files with no speech, especially after VAD filtering. This issue occurs during language detection when no segments remain after filtering.

Steps to Reproduce

  1. Use an audio file with no discernible speech or where VAD filters remove all speech.
  2. Call transcribe with vad_filter=True.
  3. Observe the following error:
    ValueError: max() arg is an empty sequence
    
    

Expected Behavior
The method should return a transcription object with empty or None fields instead of raising an error.
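Until the library handles this case, a caller can approximate the expected behavior by catching the error. This is a minimal sketch, assuming the failure is raised inside the `transcribe` call itself (as the traceback in this issue indicates). `safe_transcribe` is a hypothetical helper, not part of faster-whisper; it takes the bound `model.transcribe` method as its first argument.

```python
def safe_transcribe(transcribe, audio, **kwargs):
    """Call `transcribe`; return ([], None) when language detection
    fails on audio with no speech. Hypothetical helper."""
    try:
        segments, info = transcribe(audio, **kwargs)
    except ValueError as exc:
        # Swallow only the failure reported in this issue. The exact
        # wording may vary between Python versions, so match loosely.
        if "max()" not in str(exc):
            raise
        return [], None
    # The segments may be a lazy iterable; materialize them so the
    # caller always receives a list.
    return list(segments), info
```

Usage would look like `segments, info = safe_transcribe(model.transcribe, "silence.wav", vad_filter=True)`, where `silence.wav` is a placeholder file name. The caller then has to treat `info is None` as "no speech found".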

Environment
Faster-Whisper Version: 1.0.3
Python Version: 3.10
OS: Ubuntu 22.04
GPU

Suggested Fix
Add a check after VAD filtering to handle empty audio gracefully and return a formatted but empty result.
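One way to picture such a check is sketched below. It is a simplified stand-in, not the library's actual code: the real expression at `transcribe.py` line 419 is truncated in the traceback, so `pick_language` and its input (one detected language code per audio chunk) are hypothetical. It only illustrates that `max()` needs a guard when VAD leaves nothing to vote on.

```python
from collections import Counter
from typing import Optional, Sequence


def pick_language(detected: Sequence[str]) -> Optional[str]:
    """Return the most frequent language code, or None if nothing was detected."""
    # Guard: max() raises ValueError on an empty sequence, which is
    # what happens when every chunk was filtered out as non-speech.
    if not detected:
        return None
    counts = Counter(detected)
    return max(counts, key=counts.get)
```

With this shape, `pick_language([])` returns `None` instead of raising, and the caller can build an empty result from it.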

Error Traceback

  File "/opt/src/audio/transcript_extractor.py", line 280, in _transcribe
    transcription_segments, transcription_info = model.transcribe(
  File "/usr/local/lib/python3.10/dist-packages/faster_whisper/transcribe.py", line 419, in transcribe
    language = max(…
ValueError: max() arg is an empty sequence

@MahmoudAshraf97
Collaborator

This was solved in the latest version. Please update and test again.
