update the model for the android demo app #1673

snyderra · 2024-12-02T20:01:31Z

Trying to add domain-specific words for use with the Android demo. I am using the attached Dockerfile as the build machine.

After a docker build and docker run, it successfully runs ./compile-graph.sh followed by decode.sh.

After copying the files into the model in the Android project and running the app, it crashes. Maybe I am not copying all the required files. The guides seem to suggest that only Gr.fst and HCLr.fst need to be updated?

Looking through the other issues, I know that you recommend using the repos from alphacep; however, when I use those in the Dockerfile, it fails to compile.

I have also attached the file list from the compile and decode.
files.txt

Appreciate any help.

Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0xb400007de7f00dfc in tid 21126 (Thread-4), pid 20501 (org.vosk.demo)
2024-12-02 14:01:57.033 21133-21133 DEBUG                   pid-21133                            A  Cmdline: org.vosk.demo
2024-12-02 14:01:57.033 21133-21133 DEBUG                   pid-21133                            A  pid: 20501, tid: 21126, name: Thread-4  >>> org.vosk.demo <<<
2024-12-02 14:01:57.033 21133-21133 DEBUG                   pid-21133                            A        #00 pc 0000000000673f9c  /data/app/~~ljabya80z0lgHJIqGjO3LQ==/org.vosk.demo-91ru3yhb4MjUvi2sk2dinQ==/lib/arm64/libvosk.so (kaldi::nnet3::DecodableAmNnetLoopedOnline::LogLikelihood(int, int)+128)
2024-12-02 14:01:57.033 21133-21133 DEBUG                   pid-21133                            A        #01 pc 00000000004a4928  /data/app/~~ljabya80z0lgHJIqGjO3LQ==/org.vosk.demo-91ru3yhb4MjUvi2sk2dinQ==/lib/arm64/libvosk.so (kaldi::LatticeIncrementalDecoderTpl<fst::Fst<fst::ArcTpl<fst::TropicalWeightTpl<float> > >, kaldi::decoder::BackpointerToken>::ProcessEmitting(kaldi::DecodableInterface*)+748)
2024-12-02 14:01:57.033 21133-21133 DEBUG                   pid-21133                            A        #02 pc 00000000004a68d8  /data/app/~~ljabya80z0lgHJIqGjO3LQ==/org.vosk.demo-91ru3yhb4MjUvi2sk2dinQ==/lib/arm64/libvosk.so (kaldi::LatticeIncrementalDecoderTpl<fst::Fst<fst::ArcTpl<fst::TropicalWeightTpl<float> > >, kaldi::decoder::BackpointerToken>::AdvanceDecoding(kaldi::DecodableInterface*, int)+264)
2024-12-02 14:01:57.033 21133-21133 DEBUG                   pid-21133                            A        #03 pc 0000000000363850  /data/app/~~ljabya80z0lgHJIqGjO3LQ==/org.vosk.demo-91ru3yhb4MjUvi2sk2dinQ==/lib/arm64/libvosk.so (Recognizer::AcceptWaveform(kaldi::Vector<float>&)+196)
2024-12-02 14:01:57.033 21133-21133 DEBUG                   pid-21133                            A        #04 pc 00000000003639a0  /data/app/~~ljabya80z0lgHJIqGjO3LQ==/org.vosk.demo-91ru3yhb4MjUvi2sk2dinQ==/lib/arm64/libvosk.so (Recognizer::AcceptWaveform(short const*, int)+192)
2024-12-02 14:01:57.033 21133-21133 DEBUG                   pid-21133                            A        #05 pc 0000000000426be4  /data/app/~~ljabya80z0lgHJIqGjO3LQ==/org.vosk.demo-91ru3yhb4MjUvi2sk2dinQ==/lib/arm64/libvosk.so (vosk_recognizer_accept_waveform_s+8)
2024-12-02 14:01:57.033 21133-21133 DEBUG                   pid-21133                            A        #06 pc 000000000001404c  /data/app/~~ljabya80z0lgHJIqGjO3LQ==/org.vosk.demo-91ru3yhb4MjUvi2sk2dinQ==/lib/arm64/libjnidispatch.so
2024-12-02 14:01:57.033 21133-21133 DEBUG                   pid-21133                            A        #07 pc 0000000000010a18  /data/app/~~ljabya80z0lgHJIqGjO3LQ==/org.vosk.demo-91ru3yhb4MjUvi2sk2dinQ==/lib/arm64/libjnidispatch.so
2024-12-02 14:01:57.033 21133-21133 DEBUG                   pid-21133                            A        #08 pc 0000000000007564  /data/app/~~ljabya80z0lgHJIqGjO3LQ==/org.vosk.demo-91ru3yhb4MjUvi2sk2dinQ==/lib/arm64/libjnidispatch.so
2024-12-02 14:01:57.033 21133-21133 DEBUG                   pid-21133                            A        #09 pc 0000000000011030  /data/app/~~ljabya80z0lgHJIqGjO3LQ==/org.vosk.demo-91ru3yhb4MjUvi2sk2dinQ==/lib/arm64/libjnidispatch.so
2024-12-02 14:01:57.033 21133-21133 DEBUG                   pid-21133                            A        #10 pc 00000000000141e0  /data/app/~~ljabya80z0lgHJIqGjO3LQ==/org.vosk.demo-91ru3yhb4MjUvi2sk2dinQ==/lib/arm64/libjnidispatch.so
2024-12-02 14:01:57.033 21133-21133 DEBUG                   pid-21133                            A        #16 pc 000000000012f4d4  [anon:dalvik-classes.dex extracted in memory from /data/app/~~ljabya80z0lgHJIqGjO3LQ==/org.vosk.demo-91ru3yhb4MjUvi2sk2dinQ==/base.apk] (org.vosk.Recognizer.acceptWaveForm+0)
2024-12-02 14:01:57.033 21133-21133 DEBUG                   pid-21133                            A        #21 pc 000000000012f9d4  [anon:dalvik-classes.dex extracted in memory from /data/app/~~ljabya80z0lgHJIqGjO3LQ==/org.vosk.demo-91ru3yhb4MjUvi2sk2dinQ==/base.apk] (org.vosk.android.SpeechService$RecognizerThread.run+0)
---------------------------- PROCESS ENDED (20501) for package org.vosk.demo ----------------------------


nshmyrev · 2024-12-02T20:42:54Z

You seem to be compiling the big model. Are you trying to update the small model with it? Those are not compatible. The small-model compilation package is in the Colab notebook (see the vosk-adaptation.ipynb file).

snyderra · 2024-12-02T20:51:13Z

I'll give the notebook a try. How could I take the output above and make it work with the Android demo?

nshmyrev · 2024-12-02T20:54:51Z

You put it inside vosk-model-en-us-0.22-lgraph

https://alphacephei.com/vosk/models/vosk-model-en-us-0.22-lgraph.zip
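A minimal sketch of that copy step, assuming the compiled lgraph output directory contains HCLr.fst and Gr.fst and that the unpacked model already holds exactly one file with each of those names. The function name and both directory arguments are hypothetical; the script looks up where the existing files live instead of assuming a directory layout, and it overwrites them, so keep the original archive around.

```python
import shutil
from pathlib import Path

# The two lookahead graph files discussed in this thread.
GRAPH_FILES = ("HCLr.fst", "Gr.fst")


def replace_graph_files(compiled_dir, model_dir, names=GRAPH_FILES):
    """Overwrite each graph file found in model_dir with the copy from compiled_dir.

    Returns the list of model paths that were replaced.
    """
    compiled_dir, model_dir = Path(compiled_dir), Path(model_dir)
    replaced = []
    for name in names:
        source = compiled_dir / name
        if not source.is_file():
            raise FileNotFoundError(f"compiled graph file missing: {source}")
        # Locate the existing file in the model tree rather than guessing its folder.
        targets = sorted(p for p in model_dir.rglob(name) if p.is_file())
        if len(targets) != 1:
            raise RuntimeError(
                f"expected exactly one {name} under {model_dir}, found {len(targets)}"
            )
        shutil.copy2(source, targets[0])
        replaced.append(targets[0])
    return replaced
```

Calling `replace_graph_files("exp/tdnn/lgraph", "vosk-model-en-us-0.22-lgraph")` would be one way to use it; both paths are placeholders for wherever the compiled graph and the unpacked model actually are.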

nshmyrev · 2024-12-02T20:55:10Z

It is also recommended to test on the desktop with Python first, and only then try it on the device.
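A minimal sketch of such a desktop check, assuming the `vosk` Python package is installed and a mono 16-bit PCM WAV recording is available. The function names and arguments are illustrative, not part of Vosk; only `Model`, `KaldiRecognizer`, `AcceptWaveform` and `FinalResult` come from the Vosk Python API.

```python
import json
import wave


def check_wav_format(wf):
    """Raise ValueError unless the open wave file is mono, 16-bit, uncompressed PCM."""
    if wf.getnchannels() != 1 or wf.getsampwidth() != 2 or wf.getcomptype() != "NONE":
        raise ValueError("audio must be mono 16-bit PCM WAV")


def transcribe_wav(model_dir, wav_path, chunk_frames=4000):
    """Decode wav_path with the model in model_dir and return the recognized text."""
    # Imported here so the format check above is usable without vosk installed.
    from vosk import Model, KaldiRecognizer

    with wave.open(str(wav_path), "rb") as wf:
        check_wav_format(wf)
        rec = KaldiRecognizer(Model(str(model_dir)), wf.getframerate())
        while True:
            data = wf.readframes(chunk_frames)
            if not data:
                break
            rec.AcceptWaveform(data)
        # FinalResult() returns a JSON string; the transcript is under "text".
        return json.loads(rec.FinalResult()).get("text", "")
```

If the adapted model fails to load or decode here, the problem is in the model files rather than in the Android or browser integration.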

snyderra · 2024-12-02T21:14:16Z

Tried the notebook. It seems that the decode is failing. I also get the same ivectors warning with the Dockerfile version.
vosk-adaptation.ipynb.zip

snyderra · 2024-12-02T21:22:27Z

Ah, I see, it needs a GPU Colab runtime:

# ivector-extract-online2 --config=exp/ivectors_test/conf/ivector_extractor.conf ark:data_test/test_small/split4/1/spk2utt scp:data_test/test_small/split4/1/feats.scp ark:- | copy-feats --compress=true ark:- ark,scp:/content/kaldi/egs/ac/vosk-model-small-en-us-0.15-compile-colab/exp/ivectors_test/ivector_online.1.ark,/content/kaldi/egs/ac/vosk-model-small-en-us-0.15-compile-colab/exp/ivectors_test/ivector_online.1.scp 
# Started at Mon Dec  2 21:06:31 UTC 2024
#
copy-feats --compress=true ark:- ark,scp:/content/kaldi/egs/ac/vosk-model-small-en-us-0.15-compile-colab/exp/ivectors_test/ivector_online.1.ark,/content/kaldi/egs/ac/vosk-model-small-en-us-0.15-compile-colab/exp/ivectors_test/ivector_online.1.scp 
ivector-extract-online2: error while loading shared libraries: libcudart.so.11.0: cannot open shared object file: No such file or directory
LOG (copy-feats[5.5.1092~1-341d0]:main():copy-feats.cc:143) Copied 0 feature matrices.
# Accounting: time=0 threads=1
# Ended (code 1) at Mon Dec  2 21:06:31 UTC 2024, elapsed time 0 seconds

snyderra · 2024-12-02T21:42:24Z

Nope, I was already using a T4 GPU. It seems that Colab now ships CUDA 12:

!lsb_release -a
!nvcc --version

No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.3 LTS
Release: 22.04
Codename: jammy
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Aug_15_22:02:13_PDT_2023
Cuda compilation tools, release 12.2, V12.2.140
Build cuda_12.2.r12.2/compiler.33191640_0

snyderra · 2024-12-02T22:04:31Z

Add this to the top of the notebook to fix the CUDA issue: !apt-get -y install cuda-11-7
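A small sketch for confirming whether the fix took effect, assuming the only problem is that the dynamic loader cannot find the library named in the error above. The helper name is hypothetical; it just asks the loader to open the library through `ctypes`.

```python
import ctypes


def shared_library_loads(name):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(name)
    except OSError:
        # Raised when the library is missing or cannot be loaded.
        return False
    return True
```

Running `shared_library_loads("libcudart.so.11.0")` in a notebook cell before and after the install shows whether the Kaldi binaries should now be able to start. A newly installed library may still be invisible if its directory is not on the loader's search path.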

nshmyrev · 2024-12-02T22:10:17Z

Thanks, we probably need to recompile Kaldi to use the CPU.

homerjonathan · 2024-12-09T20:27:05Z

I am having exactly the same problem. We have built using vosk-model-small-en-us-0.15-compile-colab, successfully added words to the dictionary, and tested in Python, where it works perfectly.

However, we now also want it to work inside a browser, so we ran npm install vosk-browser. The default vosk-model-small-en-us-0.15 works there, but our newly created model, updated with a few words, does not.

We have enabled the commented-out code in compile-graph.sh to see if we can produce the lookahead graph files. As mentioned earlier, the resulting model does work with the new words in Python.

# Lookahead part goes OOM
utils/mkgraph_lookahead.sh \
         --self-loop-scale 1.0 data/lang \
         exp/tdnn data/en-us-mix.lm.gz exp/tdnn/lgraph

It does create the HCLr.fst and Gr.fst files. We copied them over, and the JavaScript library fails with this message:

LOG (VoskAPI:ReadDataFiles():src/model.cc:213) Decoding params beam=13 max-active=7000 lattice-beam=6
LOG (VoskAPI:ReadDataFiles():src/model.cc:216) Silence phones 1:2:3:4:5:6:7:8:9:10
LOG (VoskAPI:RemoveOrphanNodes():nnet-nnet.cc:948) Removed 0 orphan nodes.
LOG (VoskAPI:RemoveOrphanComponents():nnet-nnet.cc:847) Removing 0 orphan components.
LOG (VoskAPI:ReadDataFiles():src/model.cc:266) Reading CMVN stats from /vosk/model_en_attempt6_zip/global_cmvn.stats
LOG (VoskAPI:ReadDataFiles():src/model.cc:282) Loading HCL and G from /vosk/model_en_attempt6_zip/HCLr.fst /vosk/model_en_attempt6_zip/Gr.fst
ERROR (VoskAPI:DecodableNnetLoopedOnlineBase():decodable-online-looped.cc:50) Ivector feature dimension mismatch: got -1 but network expects 30
Recognizer (id: bfade988-a276-4b9f-abe4-62ec3a4ed268): Could not be created due to: 32685504
Recognizer not ready, ignoring

I have attempted to copy in the ivectors but noted that they have not changed. I have also noted that the HCLr.fst and Gr.fst files are smaller in size.

I do have a CUDA machine, so we can easily run it there, but I haven't seen any message saying one was required.

Hope you can help us! We love the project.

nshmyrev · 2024-12-09T22:25:23Z

LOG (VoskAPI:ReadDataFiles():src/model.cc:266) Reading CMVN stats from /vosk/model_en_attempt6_zip/global_cmvn.stats

This file should not be there. Please check the files of the original model, and delete old files from the filesystem; they can break things.
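One way to do that comparison is sketched below, assuming both the original model and the adapted one are unpacked on disk. The function names are hypothetical; the script only compares relative file paths and makes no assumption about which files a Vosk model needs.

```python
from pathlib import Path


def list_files(root):
    """Return the set of file paths under root, relative to root, with '/' separators."""
    root = Path(root)
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def compare_model_dirs(original_dir, new_dir):
    """Return (missing, extra): files only in the original, and files only in the new model."""
    original, new = list_files(original_dir), list_files(new_dir)
    return sorted(original - new), sorted(new - original)
```

Anything in `extra` is a candidate stale file to delete, and anything in `missing` is a file the adapted model still needs copied from the original.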

homerjonathan · 2024-12-10T20:57:30Z

Thanks for your suggestion. I have removed global_cmvn.stats, and "Reading CMVN stats" no longer appears in the logs, but it still does not seem to work. Using vosk-model-small-en-us-0.15.zip as a reference, I have tried to rebuild the same layout. We must be close, as the model works in Python. This is based on vosk-model-small-en-us-0.15-compile-colab.tar.gz. This file is dated 17th August 2022; is there an updated version?

nshmyrev · 2024-12-11T23:17:00Z

> This file is dated 17th August 2022 is there an updated version?

It is the latest version and it should work fine. You need to provide an updated log and show the model files on your filesystem.

homerjonathan · 2024-12-13T08:47:09Z

Here are the logs as requested.

compile_graph.log
decode.log

File structure is this:

What is confusing is that Gr.fst and HCLr.fst are smaller than the one compiled. Even if we change nothing it seems to build smaller.

nshmyrev · 2024-12-13T08:54:32Z

> What is confusing is that Gr.fst and HCLr.fst are smaller than the one compiled. Even if we change nothing it seems to build smaller.

That is OK.

> File structure is this:

The ivector folder is missing some files (final.ie, for example).
