This repository is a customized version of DeepSpeech2 models for speech commands recognition implemented with PyTorch Lightning

chnk58hoang/Smart-Room-Simulator

Customized Speech Recognition model for Command Recognition

Motivation

This project was developed for the Speech Processing course at my university. The model architecture is inspired by DeepSpeech2.

Requirements

I highly recommend using a conda virtual environment. The model is implemented with PyTorch and PyTorch Lightning. Install the dependencies with:

pip install -r requirements.txt

Dataset

The dataset used for training and evaluating this model was recorded and cleaned by me and my teammates. It contains 3800 wav files covering 18 commands.

Training

python train.py --epoch [num of epochs] --batch_size [batch size] --data [path to data directory] --vocab [path to vocab model file] --mode [decode mode: 'greedy' or 'beam']
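
For readers who want to see how flags like the ones in the command above are typically handled, here is a minimal sketch using Python's standard `argparse` module. It is not taken from `train.py`: the function name `build_parser`, the default values, and the help strings are all assumptions made for illustration, and the actual script may parse its arguments differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in the training command."""
    parser = argparse.ArgumentParser(description="Train the command recognition model")
    # Defaults below are illustrative assumptions, not values from the repository.
    parser.add_argument("--epoch", type=int, default=10, help="number of epochs")
    parser.add_argument("--batch_size", type=int, default=16, help="batch size")
    parser.add_argument("--data", type=str, required=True, help="path to the data directory")
    parser.add_argument("--vocab", type=str, required=True, help="path to the vocab model file")
    parser.add_argument(
        "--mode",
        choices=["greedy", "beam"],
        default="greedy",
        help="decode mode",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Restricting `--mode` with `choices` makes the parser reject anything other than `greedy` or `beam` before training starts.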

Decoding

I used CTC as the loss function. Two decoding strategies are available: greedy decoding and beam search decoding.
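
To illustrate the greedy strategy, here is a minimal sketch of standard CTC greedy decoding: take the highest-scoring symbol at each frame, collapse consecutive repeats, then drop the blank symbol. This is the textbook procedure, not the repository's decoder. The function name `ctc_greedy_decode`, the toy vocabulary, and the choice of index 0 for the blank are assumptions made for illustration.

```python
from typing import List, Sequence


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]],
    vocab: Sequence[str],
    blank: int = 0,
) -> str:
    """Greedy (best-path) CTC decoding over per-frame scores.

    frame_scores[t][k] is the score of symbol k at frame t.
    The blank index is assumed to be 0 here; the real model may use another.
    """
    # Step 1: pick the best symbol index for every frame.
    best_path: List[int] = [
        max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores
    ]

    # Step 2: collapse consecutive repeats, then drop blanks.
    decoded: List[str] = []
    previous = None
    for index in best_path:
        if index != previous and index != blank:
            decoded.append(vocab[index])
        previous = index
    return "".join(decoded)


# Toy example: vocabulary with a blank symbol at index 0.
toy_vocab = ["_", "o", "n"]
toy_scores = [
    [0.1, 0.8, 0.1],    # "o"
    [0.1, 0.8, 0.1],    # "o" again, collapsed with the previous frame
    [0.9, 0.05, 0.05],  # blank
    [0.1, 0.1, 0.8],    # "n"
]
print(ctc_greedy_decode(toy_scores, toy_vocab))
```

Beam search differs in that it keeps several candidate prefixes per frame instead of a single best path, which can recover transcriptions that the greedy path misses at a higher computational cost.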

Inference
