This project is inspired by this repository, and the majority of the code is taken from it.
For the circuit, I used these components:
1. ESP32
2. I2C OLED display
3. SPH0645 (I2S) MEMS microphone
4. 1000uF capacitor
The ESP32 firmware is built using PlatformIO. It runs the neural network, which tries to detect the following words:
zero, one, two, three, four, five, six, seven, eight, nine, tree, bird, cat, dog, happy, house and wow.
The code takes audio input from the SPH0645 microphone and tries to detect whether the spoken word is in the list. It then prints the detected word along with a percentage indicating how confident it is in that detection.
The Jupyter notebooks are used to create a TensorFlow Lite model for "speech word" recognition.
A pre-trained converted_model.tflite model and a compressed model.cc have been generated and added to the ESP32 firmware folder.
For training, this dataset is used (16 kHz mono).