Voice Pathology Detection

Graduation project for the Ain Shams University Scientific Computing general program

Introduction

Our graduation project, Voice Pathology Detection, is a program designed to assess the health of a person's voice and classify voice disorders using both Machine Learning (ML) and Deep Learning (DL) techniques.

The human voice is a powerful tool for communication, but voice disorders can significantly impact a person's ability to communicate effectively. Early detection of voice pathologies is crucial for timely intervention and appropriate treatment. Our project aims to provide a valuable tool for identifying voice disorders by analyzing audio recordings of voice samples.

We leveraged the power of ML and DL algorithms to develop a model capable of distinguishing between healthy and pathological voice characteristics. By utilizing these advanced techniques, we sought to create a reliable and accurate system that could aid in the early diagnosis of voice disorders, potentially improving the quality of life for individuals affected by such conditions.

Throughout the development process, we focused on creating an intuitive and user-friendly interface that allows users to upload audio samples for analysis. The system then performs a comprehensive assessment, providing valuable insights into the health of the voice and identifying any potential voice pathologies.

By combining ML and DL, our graduation project aims to contribute to voice pathology detection and to support both healthcare professionals and individuals in understanding and addressing voice disorders.

Team Members

Team Members who contributed to this project:

Ahmed Osama
Abdalla Osama
Manal Gerges
Sara Mohammed
Adham Ahmed
Ahmed Khaled

System Architecture

Technologies Used

Python
Librosa
Keras
TensorFlow
PyQt5
SQLite3
NumPy
Pandas

Dataset

We used the Saarbrücken Voice Database (SVD) to train and evaluate the model. SVD is a German database of voice recordings from individuals with and without voice pathologies. It contains audio samples of various durations covering different types of voice disorders. The recordings are organized into sessions.

Session Details:

Vowels /i, a, u/ produced at normal, high, and low pitch;
Vowels /i, a, u/ with rising–falling pitch; and
Sentence “Guten Morgen, wie geht es Ihnen?” (“Good morning, how are you?”).

Preprocessing

Data Augmentation

Time Domain
- Time shift
- Add Noise
Frequency Domain
- Frequency mask
- Time mask
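The four augmentations listed above could be sketched as follows. This is a minimal NumPy illustration, not the project's actual code: the function names, the default parameters (`max_fraction`, `snr_db`, `max_width`), and the choice to fill masked regions with the spectrogram minimum are all assumptions. The time-domain functions operate on a 1-D waveform, and the masks operate on a 2-D spectrogram of shape (frequency bins, time frames).

```python
import numpy as np

def time_shift(signal, max_fraction=0.2, rng=None):
    """Circularly shift the waveform by a random number of samples."""
    if rng is None:
        rng = np.random.default_rng()
    limit = int(len(signal) * max_fraction)
    shift = int(rng.integers(-limit, limit + 1))
    return np.roll(signal, shift)

def add_noise(signal, snr_db=20.0, rng=None):
    """Add white Gaussian noise at the requested signal-to-noise ratio (dB)."""
    if rng is None:
        rng = np.random.default_rng()
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

def frequency_mask(spec, max_width=8, rng=None):
    """Overwrite a random band of frequency bins (rows) with the minimum value."""
    if rng is None:
        rng = np.random.default_rng()
    out = spec.copy()
    width = int(rng.integers(0, max_width + 1))
    start = int(rng.integers(0, max(1, spec.shape[0] - width + 1)))
    out[start:start + width, :] = spec.min()
    return out

def time_mask(spec, max_width=8, rng=None):
    """Overwrite a random span of time frames (columns) with the minimum value."""
    if rng is None:
        rng = np.random.default_rng()
    out = spec.copy()
    width = int(rng.integers(0, max_width + 1))
    start = int(rng.integers(0, max(1, spec.shape[1] - width + 1)))
    out[:, start:start + width] = spec.min()
    return out
```

Passing a seeded `np.random.default_rng(seed)` as `rng` makes the augmentations reproducible, which helps when comparing experiments.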

Feature Extraction

Mel Spectrogram

Binary Model Architecture

VGG19
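A binary classifier built on a VGG19 backbone might look like the Keras sketch below. Everything beyond the use of VGG19 is an assumption made for illustration: the input shape, the pooling and dense head, the dropout rate, the sigmoid output, and the optimizer are not taken from the project. `weights=None` is used so the example builds without downloading pre-trained weights.

```python
import tensorflow as tf

def build_binary_vgg19(input_shape=(128, 128, 3)):
    """Build a VGG19 backbone with a small binary head (healthy vs. pathological)."""
    base = tf.keras.applications.VGG19(
        include_top=False,   # drop the ImageNet classifier head
        weights=None,        # random init; assumption to keep the sketch offline
        input_shape=input_shape,
    )
    x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
    x = tf.keras.layers.Dense(256, activation="relu")(x)  # hypothetical head size
    x = tf.keras.layers.Dropout(0.5)(x)
    output = tf.keras.layers.Dense(1, activation="sigmoid")(x)

    model = tf.keras.Model(inputs=base.input, outputs=output)
    model.compile(
        optimizer="adam",
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model

model = build_binary_vgg19()
```

A mel spectrogram has a single channel, so it would need to be resized and repeated across three channels (or the input shape changed) before being fed to a model like this one.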

Results

Classification Model Architecture

We ran several experiments:
1. Binary model with a softmax output layer
2. One-vs-all classification approach
3. Classical machine learning models
4. Transfer learning with a pre-trained transformer model (highest accuracy)
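To make the one-vs-all approach concrete, the sketch below shows how the outputs of several binary detectors can be combined into a single prediction by picking the most confident one. The function name, the class labels, and the scores are hypothetical placeholders, not the project's classes or results.

```python
import numpy as np

def one_vs_all_predict(scores, class_names):
    """Pick, for each sample, the class whose binary detector is most confident.

    scores: array-like of shape (n_samples, n_classes), where column k holds
    the score of the "class k vs. rest" detector.
    """
    scores = np.asarray(scores, dtype=float)
    if scores.ndim != 2 or scores.shape[1] != len(class_names):
        raise ValueError("scores must have one column per class")
    winners = scores.argmax(axis=1)
    return [class_names[i] for i in winners]

# Hypothetical labels and detector scores for two recordings.
class_names = ["disorder_a", "disorder_b", "disorder_c"]
scores = [
    [0.9, 0.2, 0.1],
    [0.3, 0.4, 0.8],
]
predictions = one_vs_all_predict(scores, class_names)
print(predictions)
```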

AST

Results

Interface Screenshots

Register Page

Recording Page

Demo

Check out our project in action! Watch the video demonstration below:

sysVid.mov

Acknowledgements

Special thanks to the teaching staff of Ain Shams University, Prof. Dr. Hala Mosheir, T.A. Mohammed Essam, and T.A. Rezk Mohammed, for their guidance and support throughout this project. We also thank the creators of the Saarbrücken Voice Database (SVD) for making their data publicly available.
