Repetitions and Hallucinations when using prompt feature #1992

vchagari · 2024-02-01T18:27:42Z

Hi,

I see a lot of repetitions and hallucinations in the output when I use the prompt feature. Has anyone figured out how to overcome this behaviour? Please let me know.

JH90iOS · 2024-04-26T02:23:22Z

Same here. The prompt is helpful when transcribing technical terms, but it causes specific hallucinations and repetitions. Is there any way to avoid this problem? Thanks a lot!

rahulbansal16 · 2024-07-17T12:22:57Z

I am also experiencing these. How is Deepgram able to handle it?

rahulbansal16 · 2024-07-17T12:27:56Z

@vchagari @JH90iOS did you find any solution?

toanhuynhnguyen · 2024-10-11T02:27:25Z

I'm facing the same issue. Can anyone help? Thanks.

blackpolarz · 2024-10-11T07:40:28Z

I'm not sure what others did, but I postprocess the output with the following methods:

1. Check each output sentence and filter out common hallucinated terms. You can refer to the hallucination discussion, where users have pointed out some of the commonly hallucinated outputs, such as subtitles, website links, and more.
2. Split the output into sentences, record them, and compare each one with the next to determine whether it is repeated.
3. For prompts showing up as hallucinations, use method 1 with a slight modification so that it only filters out the entire prompt.

These methods reduce some of the hallucinations.
If you want better removal, you might use an LLM to filter each output, but that would be extremely computationally expensive.
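
The three post-processing steps above could be sketched as a small shell filter. This is only an illustration, not blackpolarz's actual code: the function name `filter_transcript`, the blocklist phrases, and the one-sentence-per-line input format are all assumptions.

```shell
#!/usr/bin/env bash
# Sketch only: reads one transcribed sentence per line on stdin and
# writes the kept lines to stdout. filter_transcript and the blocklist
# are hypothetical and not part of Whisper.
filter_transcript() {
  # $1 is the prompt text that was passed to the transcriber.
  PROMPT_TEXT="$1" awk '
    BEGIN {
      prompt = ENVIRON["PROMPT_TEXT"]
      # Illustrative blocklist; extend it with phrases seen in your own output.
      n = split("subtitles by;thanks for watching", bad, ";")
    }
    {
      low = tolower($0)
      # Step 3: drop a line that is exactly the prompt echoed back.
      if ($0 == prompt) next
      # Step 1: drop lines containing a known hallucinated phrase.
      for (i = 1; i <= n; i++)
        if (index(low, bad[i])) next
      # Step 2: drop a line that repeats the previously kept line.
      if (seen && low == prev) next
      prev = low; seen = 1
      print
    }'
}
```

A possible invocation would be `filter_transcript "$PROMPT" < sentences.txt > cleaned.txt`. The comparison is case-insensitive and only looks at the immediately preceding kept line, so a loop that alternates between two sentences would need a longer history.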

1 reply

toanhuynhnguyen Oct 11, 2024

Thanks, friend. But we need to fine-tune the Whisper parameters to fix this issue.

mrfragger · 2024-12-18T23:31:12Z

I split the audio into up to 2,000 chunks of around 2 to 3 minutes each, so if a repetition occurs, its duration is limited. I'm going to use large-v2-q8_0 instead of large-v3-turbo, as it's less prone to hallucinate, even though turbo runs at 5x realtime speed compared to 3x realtime speed with large-v2-q8_0.
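
The splitting command itself is not shown in the post. One way to produce fixed-length chunks is ffmpeg's segment muxer; the sketch below assumes that approach, and the helper name `split_audio`, the 150-second default, and the `chunks` output directory are all made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: split_audio is a hypothetical helper, not the command
# used in the post. FFMPEG can be overridden (for example for a dry
# run); it defaults to ffmpeg.
split_audio() {
  local input="$1" seconds="${2:-150}" outdir="${3:-chunks}"
  mkdir -p "$outdir"
  # -f segment cuts the input roughly every $seconds seconds;
  # -c copy avoids re-encoding, so cuts land on packet boundaries.
  "${FFMPEG:-ffmpeg}" -hide_banner -y -i "$input" \
    -f segment -segment_time "$seconds" -c copy \
    "$outdir/chunk_%04d.${input##*.}"
}
```

For example, `split_audio lecture.wav 150 chunks` would write `chunks/chunk_0000.wav`, `chunks/chunk_0001.wav`, and so on, each of which can then be transcribed separately.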

After stitching all 2,000 subs together into one sub, I fix any overlapping timecodes with this code. I actually have it run automatically each time I stitch them.

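# Convert each VTT to SRT so awk can treat every cue as a blank-line-separated record.
for f in *.vtt
do ffmpeg -y -i "$f" "${f%.*}"_overlapping.srt
done
# If a cue starts before the previous cue ends, pull the previous end time back to that start.
for f in *_overlapping.srt
do awk '
  BEGIN {
    RS = "";              # paragraph mode: one record per cue
    OFS = FS = "\n";      # one field per line: index, timecodes, text
    getline;              # read the first cue so there is a previous one
    n = split($0, prev_rec);
    split($2, prev_time, / --> /);
  }
  {
    split($2, a, / --> /);
    if (a[1] < prev_time[2])
      prev_rec[2] = prev_time[1]" --> "a[1];
    for (i=1;i<=n;i++)
      print prev_rec[i];
    printf("\n");
    n = split($0, prev_rec);
    split($2, prev_time, / --> /)
  }
  END {
    print                 # the last cue has no successor, so print it as is
  }' "$f" > "${f%_overlapping.*}"_overlapfixed.srt
done
# Convert the fixed SRT back to VTT.
for f in *_overlapfixed.srt
do ffmpeg -y -i "$f" "${f%.*}".vtt
done
# Write an HTML report of what changed (requires git and aha).
for f in *_overlapping.srt
do git diff --no-index --word-diff=color --unified=5 --word-diff-regex=. "$f" "${f%_overlapping.*}"_overlapfixed.srt | aha --black -w -y 'font-size:1.3em' -t "Fixed Overlapping Timecodes" > "${f%_overlapping.*}"_diff.html
done
# Clean up the intermediate files, then every remaining .srt in the directory.
rm -f *_overlapping.srt
rm -f *_overlapfixed.srt
rm -f *.srt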

Then I fix proper nouns: any names, countries, cities, etc. that weren't capitalized. I have a huge list that it goes through. Finally, I check for any repeated words.

# Work on a timestamped copy of the subtitles so the originals stay untouched.
dt=$(date +%Y_%m_%d_%H_%M_%S)
mkdir -p output/"$dt"
cp *.vtt output/"$dt"
cd output/"$dt"
# gsed is GNU sed (needed for \b and \w). Legitimate doubles are first protected
# with a "333" placeholder that cannot occur inside a normal word, then any other
# immediately repeated word is collapsed (three passes catch longer runs), and
# finally the protected doubles are restored.
for f in *.vtt
do gsed -E \
-e "s|\bno no\b|no333no|g" \
-e "s|\bha ha\b|ha333ha|g" \
-e "s|\bher her\b|her333her|g" \
-e "s|\bbye bye\b|bye333bye|g" \
-e "s|\breally really\b|really333really|g" \
-e "s|\bstuff stuff\b|stuff333stuff|g" \
-e "s|\bmany many\b|many333many|g" \
-e "s|\bvery very\b|very333very|g" \
-e "s|\bhad had\b|had333had|g" \
-e "s|\bblah blah\b|blah333blah|g" \
-e "s|\bilai ilai\b|ilai333ilai|g" \
-e "s|\bre re\b|re333re|g" \
-e "s|\bthis this\b|this333this|g" \
-e "s|\bthat that\b|that333that|g" \
-e "s|\bever ever\b|ever333ever|g" \
-e "s|\byou you\b|you333you|g" \
-e "s|\bso so\b|so333so|g" \
-e "s|\bbeing being\b|being333being|g" \
-e "s|\b(\w+)\ \1\b|\1|g" \
-e "s|\b(\w+)\ \1\b|\1|g" \
-e "s|\b(\w+)\ \1\b|\1|g" \
-e "s|no333no|no no|g" \
-e "s|ha333ha|ha ha|g" \
-e "s|her333her|her her|g" \
-e "s|bye333bye|bye bye|g" \
-e "s|really333really|really really|g" \
-e "s|stuff333stuff|stuff stuff|g" \
-e "s|many333many|many many|g" \
-e "s|very333very|very very|g" \
-e "s|had333had|had had|g" \
-e "s|blah333blah|blah blah|g" \
-e "s|ilai333ilai|ilai ilai|g" \
-e "s|re333re|re re|g" \
-e "s|this333this|this this|g" \
-e "s|that333that|that that|g" \
-e "s|ever333ever|ever ever|g" \
-e "s|you333you|you you|g" \
-e "s|so333so|so so|g" \
-e "s|being333being|being being|g" \
 "$f" > "repeats_$f" ; \
git diff --no-index --word-diff=color --unified=2 --word-diff-regex=. "$f" "repeats_$f" \
| aha --black -t 'Repeating Words' -w -y 'font-size:1.3em' > "repeats_${f%.*}.html" ; \
rm "$f" ; done
