Author ORCID Identifier
https://orcid.org/0000-0001-9520-9584
Date of Award
4-11-2024
Document Type
Thesis
School
School of Computing
Programme
Ph.D. - Doctor of Philosophy
First Advisor
Dr. B. Santhi
Keywords
Impaired Speech Recognition, Speech Assistive Tool, Neurological Disorder, Machine Learning, Deep Learning
Abstract
Speech Assistive Tools have emerged in recent years to support individuals with cognitive and neurological disorders in the field of assistive technology. People affected by neurological disorders such as autism, stroke, cerebral palsy, dysarthria, Parkinson’s disease, and brain injury often find it difficult to articulate desired sounds, resulting in impaired speech. As the population of impaired speakers continues to increase every year, there is a strong need to develop intelligent speech recognition systems for affected individuals. The primary objective of this research is to develop an Impaired Speech Recognition (ISR) system for the Tamil language. Word Recognition Accuracy (WRA) is used as the performance metric, and a new dataset called the Impaired Speech Corpus in Tamil is created using speech samples collected from individuals with varying neurological disorders and intelligibility levels.
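As a concrete illustration of the metric named above, the following is a minimal sketch of Word Recognition Accuracy for isolated-word recognition; the function name and the romanized word labels are illustrative, not taken from the thesis:

```python
def word_recognition_accuracy(reference, hypothesis):
    """WRA for isolated-word recognition: the fraction of test words
    whose recognized label matches the reference label."""
    if len(reference) != len(hypothesis):
        raise ValueError("expected one hypothesis per reference word")
    correct = sum(r == h for r, h in zip(reference, hypothesis))
    return correct / len(reference)

# Hypothetical labels for four test utterances (3 of 4 recognized correctly)
ref = ["thanneer", "saapadu", "uthavi", "vali"]
hyp = ["thanneer", "saapadu", "vali", "vali"]
print(f"WRA = {word_recognition_accuracy(ref, hyp):.2%}")  # prints "WRA = 75.00%"
```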
The proposed ISR system incorporates a Deep Neural Network–Hidden Markov Model (DNN-HMM) framework trained using the Lattice Free Maximum Mutual Information (LF-MMI) approach for effective recognition of impaired Tamil speech. Training and testing samples are collected from speakers with high, medium, low, and very low intelligibility levels. The recognition performance is evaluated and compared with baseline approaches on two datasets: a 20-word dataset of acoustically similar words and the 50-word Impaired Speech Corpus in Tamil.
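The MMI criterion that LF-MMI optimizes can be sketched numerically. The snippet below is a toy per-utterance version over a closed word list; actual LF-MMI computes the denominator over a lattice-free phone-level graph, which this sketch does not attempt to reproduce:

```python
import math

def mmi_objective(log_liks, log_priors, correct):
    """Toy MMI criterion for one utterance: log posterior of the
    correct word, i.e. the correct word's joint score minus the
    log-sum over all competing words in a closed vocabulary."""
    joint = [ll + lp for ll, lp in zip(log_liks, log_priors)]
    denom = math.log(sum(math.exp(j) for j in joint))
    return joint[correct] - denom

# Three competing words, uniform priors; the correct word (index 0)
# has the best acoustic score, so the objective is close to 0 from below.
score = mmi_objective([0.0, -2.0, -3.0], [math.log(1 / 3)] * 3, correct=0)
```

Maximizing this quantity pushes probability mass toward the correct transcription relative to all competitors, which is the discriminative intuition behind MMI-style training.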
To address noisy, incomplete, and severely degraded impaired speech samples, an Enhancement Generative Adversarial Network (EGAN) is proposed for waveform enhancement. This approach improves the quality of impaired speech utterances and leads to better recognition performance on both the Tamil impaired speech datasets and the Universal Access benchmark database. The enhanced speech signals contribute to improved robustness and accuracy in impaired speech recognition.
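The abstract reports enhancement gains through recognition accuracy; waveform enhancement itself is commonly sanity-checked with a signal-to-noise ratio against a clean reference. A minimal SNR sketch follows (this is a standard measure, not the thesis's evaluation protocol, and no GAN is reproduced here):

```python
import math

def snr_db(clean, degraded):
    """Signal-to-noise ratio in dB between a clean reference waveform
    and a degraded version; higher values indicate less distortion."""
    signal_power = sum(c * c for c in clean)
    noise_power = sum((c - d) ** 2 for c, d in zip(clean, degraded))
    return 10 * math.log10(signal_power / noise_power)

# Toy waveform with a small constant distortion added
clean = [1.0, 0.0, 1.0, 0.0]
degraded = [1.1, 0.1, 1.1, 0.1]
snr = snr_db(clean, degraded)
```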
Learning compact and efficient representations for disordered speech is challenging due to the limited availability of impaired speech data. To overcome this issue, a novel sequence-to-vector representation based on HMM state sequences (HMM-SS) is proposed. This compact representation performs effectively on small datasets and is evaluated using four datasets: 50 words from TORGO, 100 common words from UA-SPEECH, and 50 help-seeking words and 100 common words from the Tamil impaired speech corpus. The proposed approach consistently outperforms baseline HMM, DNN-HMM, and state-of-the-art methods.
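The abstract does not detail the HMM-SS mapping itself. One simple way to turn a variable-length decoded state sequence into a fixed-length vector is a normalized state-occupancy histogram, sketched here purely as an assumption about what such a sequence-to-vector mapping could look like:

```python
from collections import Counter

def state_sequence_to_vector(state_seq, num_states):
    """Map a variable-length decoded HMM state sequence to a
    fixed-length vector of normalized state-occupancy counts.
    (Illustrative reading only; the thesis's exact HMM-SS
    construction may differ.)"""
    counts = Counter(state_seq)
    total = len(state_seq)
    return [counts.get(s, 0) / total for s in range(num_states)]

# Two utterances of different lengths map to vectors of the same dimension,
# which is what makes the representation usable with small datasets.
v1 = state_sequence_to_vector([0, 0, 1, 2, 2, 2], num_states=4)
v2 = state_sequence_to_vector([0, 1, 1, 3], num_states=4)
```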
Finally, self-supervised and spectrogram-based approaches are explored to further improve impaired speech recognition. A Self-Supervised Learning (SSL) based Wav2Word framework using the wav2vec 2.0 encoder is proposed and evaluated on Tamil and English impaired speech datasets, achieving superior performance over conventional methods. In addition, a Denoising Convolutional Autoencoder (DCAE) is introduced to enhance spectrogram representations prior to CNN-based recognition. The proposed DCAE approach achieves significant performance improvements, with a maximum Word Recognition Accuracy of 96.07% on the Impaired Speech Corpus in Tamil, demonstrating its effectiveness for rehabilitation-oriented assistive technologies.
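The DCAE itself cannot be reconstructed from the abstract, but the spectrogram front-end such a model denoises can be sketched. The naive framing-plus-DFT below is an illustration only; real pipelines use windowed FFTs and typically log-mel features:

```python
import cmath

def frames(signal, frame_len, hop):
    """Slice a waveform into overlapping frames, the first step
    in computing a spectrogram."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def magnitude_spectrum(frame):
    """Naive DFT magnitude of one frame (kept to the non-negative
    frequency bins, as is conventional for real-valued signals)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]

# A spectrogram is one magnitude spectrum per frame; this toy waveform
# has period 4, so its energy lands in a single frequency bin.
signal = [0.0, 1.0, 0.0, -1.0] * 8
spec = [magnitude_spectrum(f) for f in frames(signal, frame_len=8, hop=4)]
```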
Recommended Citation
S, Vishnika Veni Ms, "Impaired Speech Recognition of Neurological Disorder Persons Using Machine Learning and Deep Learning Techniques" (2024). Theses and Dissertations. 156.
https://knowledgeconnect.sastra.edu/theses/156