This project aims to develop a low-power, real-time embedded audio analysis system that detects specific absolutist keywords, which can serve as linguistic markers in mental health language. The system is built on an Arduino Nano BLE Sense board equipped with a digital microphone, supporting audio data collection, model training, and real-time keyword detection.
- Gather Audio Samples: Record audio samples for a set of given absolutist keywords.
- Expand Dataset: Integrate the gathered audio samples with the existing Speech Command dataset to create a comprehensive keyword spotting dataset.
- Feature Extraction: Extract relevant features from the audio data using techniques discussed in class (e.g., MFCCs, spectrograms).
- Model Training: Train a machine learning model to detect the absolutist keywords using the extracted features.
- Model Validation: Validate the model to ensure it accurately identifies the keywords.
- Deploy Model: Implement the trained model on the Arduino Nano BLE Sense board.
- Real-Time Testing: Test the system for real-time keyword spotting and evaluate its performance.
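The feature-extraction step above can be sketched as a plain NumPy spectrogram. The frame sizes (30 ms window, 20 ms hop at 16 kHz) and all function and variable names are illustrative assumptions, not taken from the project code:

```python
# Sketch of spectrogram feature extraction for 1-second, 16 kHz clips.
# Frame/hop sizes and names are illustrative, not from the project code.
import numpy as np

def spectrogram(samples, frame_len=480, hop=320, n_fft=512):
    """Magnitude spectrogram: 30 ms frames with a 20 ms hop at 16 kHz."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(samples) - frame_len) // hop
    frames = np.stack([samples[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # rfft yields n_fft // 2 + 1 frequency bins per frame
    return np.abs(np.fft.rfft(frames, n=n_fft, axis=1))

audio = np.random.randn(16000)   # stand-in for a 1-second recording
feats = spectrogram(audio)
print(feats.shape)               # (49, 257): 49 frames x 257 frequency bins
```

MFCCs would add a mel filterbank and a DCT on top of this, but the time-frequency framing is the same.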
The dataset will consist of:

- Audio samples of absolutist keywords recorded specifically for this project.
- The Speech Command dataset.

- Record Audio Samples:
  - Use a recording device to collect audio samples of the following absolutist keywords: "all," "must," "never," "none," "only," and "silence."
  - Save these samples in the `data/keywords` directory.
  - The Open Speech Recording tool can be used to record the audio signals.
- Integrate Dataset:
  - Combine the recorded samples with the Speech Command dataset in the `data` directory.
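Before integrating, it can help to sanity-check the recorded clips against the Speech Command format (16 kHz, 16-bit mono WAV) and copy them into per-keyword folders. This is a minimal stdlib sketch; the `clip_is_valid` and `merge_keyword_clips` names and the `keyword_index.wav` filename convention are assumptions, not part of the project code:

```python
# Sketch: validate recorded clips and merge them into the dataset layout.
# Assumes Speech Command conventions (16-bit mono WAV at 16 kHz) and a
# hypothetical "keyword_index.wav" naming scheme for the recordings.
import os
import shutil
import wave

def clip_is_valid(path, rate=16000):
    """True if the WAV file is mono, 16-bit, at the expected sample rate."""
    with wave.open(path, "rb") as w:
        return (w.getnchannels() == 1
                and w.getsampwidth() == 2
                and w.getframerate() == rate)

def merge_keyword_clips(src_dir, data_dir):
    """Copy valid clips from src_dir into data_dir/<keyword>/,
    inferring the keyword from the filename prefix before '_'."""
    for name in os.listdir(src_dir):
        if not name.endswith(".wav"):
            continue
        src = os.path.join(src_dir, name)
        if not clip_is_valid(src):
            continue
        keyword = name.split("_")[0]
        dest = os.path.join(data_dir, keyword)
        os.makedirs(dest, exist_ok=True)
        shutil.copy2(src, dest)
```

Filtering out malformed clips here avoids silent failures later in training, where mixed sample rates can corrupt the extracted features.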
- Train Model:
  - Train the model in the cloud using Google Colaboratory or locally using a Jupyter Notebook; use `model.ipynb`.
- Convert Model:
  - Convert the trained model to a format compatible with the Arduino board (e.g., TensorFlow Lite).
- Deploy Model:
  - Upload the model and the necessary code to the Arduino Nano BLE Sense board.
  - Refer to the `micro_speech` folder.
- Real-Time Testing:
  - Fetch test audio files from the `testing_audio` folder.
  - Test the system for real-time keyword spotting and evaluate its performance.
- The model achieved an accuracy of 96% after training.
- In addition to the five keywords, "unknown" and "silence" classes were also included during training.
- The analysis of the confusion matrix revealed that the model exhibited high accuracy in recognizing certain keywords such as "all," "only," and "silence."
- However, its performance was comparatively weaker when identifying keywords like "must," "none," and "never."
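The per-keyword analysis above comes down to computing per-class recall from the confusion matrix. The sketch below shows the calculation; the matrix values and the three labels are illustrative only, not the project's actual results:

```python
# Sketch: per-class recall from a confusion matrix, to see which keywords
# the model recognizes reliably. The counts below are made-up examples.
import numpy as np

def per_class_recall(cm):
    """Row i = true class i, column j = predicted class j."""
    return np.diag(cm) / cm.sum(axis=1)

labels = ["all", "must", "silence"]          # illustrative subset
cm = np.array([[48,  1,  1],
               [ 5, 42,  3],
               [ 0,  1, 49]])
for label, recall in zip(labels, per_class_recall(cm)):
    print(f"{label}: {recall:.2f}")          # e.g. "must" lags the others
```

Low recall rows like "must" here point to classes that need more (or cleaner) training samples.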
For a visual demonstration of this project, please refer to the video linked below: