Date of Award
16-4-2024
Document Type
Thesis
School
School of Computing
Programme
Ph.D. - Doctor of Philosophy
First Advisor
Dr. R. Elakkiya
Keywords
Deep Learning, Sign Language, Video Generation, Generative Adversarial Networks, Neural Machine Translation
Abstract
This dissertation presents a deep neural network based sign language video generation framework for translating multilingual sentences into sign videos. The thesis addresses the challenges that persist in sign language video generation: (i) handling longer input sentences and new words, (ii) pose estimation with higher accuracy, (iii) high-quality, photo-realistic sign gesture video generation, and (iv) improving realism in generated sign videos. Accordingly, the thesis makes four contributions to address these issues.
The first contribution of this thesis automates the translation of multilingual sentences into sign glosses without manual intervention by combining hybrid Neural Machine Translation with an attention mechanism. To handle longer sequences and new words, a deep stacked GRU approach is introduced, and the attention mechanism is incorporated to produce accurate translation results.
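As a hedged illustration of this translation component, the sketch below shows a deep stacked GRU encoder-decoder with additive attention in PyTorch. The class name GlossTranslator, the hyperparameters, and the attention formulation are illustrative assumptions, not the thesis's actual configuration.

```python
# Minimal sketch (PyTorch) of a stacked-GRU encoder-decoder with additive
# attention for text-to-gloss translation. All names and hyperparameters
# are illustrative assumptions, not the thesis's exact architecture.
import torch
import torch.nn as nn

class GlossTranslator(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, emb=256, hid=512, layers=3):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb)
        # Deep stacked GRUs: multiple recurrent layers help the model
        # cope with longer input sentences.
        self.encoder = nn.GRU(emb, hid, num_layers=layers, batch_first=True)
        self.decoder = nn.GRU(emb + hid, hid, num_layers=layers, batch_first=True)
        self.attn = nn.Linear(2 * hid, 1)   # additive attention score
        self.out = nn.Linear(hid, tgt_vocab)

    def forward(self, src, tgt):
        enc_out, h = self.encoder(self.src_emb(src))   # enc_out: (B, S, H)
        logits = []
        for t in range(tgt.size(1)):                   # teacher forcing
            # Score each source position against the current decoder state.
            query = h[-1].unsqueeze(1).expand(-1, enc_out.size(1), -1)
            score = self.attn(torch.cat([enc_out, query], dim=-1))
            ctx = (torch.softmax(score, dim=1) * enc_out).sum(1, keepdim=True)
            step_in = torch.cat([self.tgt_emb(tgt[:, t:t+1]), ctx], dim=-1)
            dec_out, h = self.decoder(step_in, h)
            logits.append(self.out(dec_out))
        return torch.cat(logits, dim=1)                # (B, T, tgt_vocab)
```

The attention context lets each decoding step attend over the whole source sentence, which is one standard way to keep translation quality from degrading on long inputs.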
The second contribution develops a Dynamic GAN framework to generate cost-effective, photo-realistic, high-quality sign videos for the hearing-impaired community. A conditional GAN approach is introduced for sign video generation, and the incorporation of pixel normalization, de-blurring, and video completion techniques further facilitates high-quality output.
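The following is a minimal sketch of a conditional GAN generator with per-pixel feature normalization, assuming a PyTorch implementation. The layer sizes, the PixelNorm placement, and the concatenation-based conditioning are assumptions for illustration and do not reproduce the Dynamic GAN architecture itself.

```python
# Minimal sketch (PyTorch) of a conditional generator for sign-frame
# synthesis. Sizes and conditioning scheme are illustrative assumptions.
import torch
import torch.nn as nn

class PixelNorm(nn.Module):
    def forward(self, x, eps=1e-8):
        # Normalize each pixel's feature vector to unit length.
        return x * torch.rsqrt(x.pow(2).mean(dim=1, keepdim=True) + eps)

class Generator(nn.Module):
    def __init__(self, z_dim=100, cond_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim + cond_dim, 256, 4, 1, 0),  # 1x1 -> 4x4
            PixelNorm(), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(256, 128, 4, 2, 1),               # 4x4 -> 8x8
            PixelNorm(), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(128, 3, 4, 2, 1),                 # 8x8 -> 16x16
            nn.Tanh(),
        )

    def forward(self, z, cond):
        # Concatenate noise with the gloss/pose condition, reshape to a
        # 1x1 spatial map, and upsample to an RGB frame.
        x = torch.cat([z, cond], dim=1).unsqueeze(-1).unsqueeze(-1)
        return self.net(x)
```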
The third contribution develops an end-to-end framework for sign gesture synthesis that attains high realism by combining basic NLP techniques for translating sentences into sign glosses. The proposed VidGenGAN model generates sign videos using deep stacked GRU approaches; a sketch of how such a pipeline could be wired together follows.
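This hedged sketch composes the illustrative components above into a sentence-to-video pipeline. The translate and embed_gloss helpers are hypothetical, and the sketch does not reproduce VidGenGAN itself.

```python
# Illustrative end-to-end pipeline: spoken sentence -> sign glosses ->
# generated frames. GlossTranslator and Generator are the sketches above;
# translate() and embed_gloss() are hypothetical helpers.
import torch

def sentence_to_sign_video(sentence, translator, generator, embed_gloss):
    glosses = translator.translate(sentence)   # hypothetical decode helper
    frames = []
    for g in glosses:
        cond = embed_gloss(g)                  # gloss -> condition vector
        z = torch.randn(1, 100)                # per-frame noise
        frames.append(generator(z, cond.unsqueeze(0)))
    return torch.cat(frames, dim=0)            # (T, 3, H, W) video tensor
```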
Finally, this thesis thoroughly assesses the proposed frameworks through both subjective and quantitative experiments on real-time signing videos drawn from diverse sign language corpora: the RWTH-PHOENIX-Weather 2014T dataset for German Sign Language, the self-created ISL-CSLTR dataset for Indian Sign Language, and the How2Sign dataset for American Sign Language. The results show that the system achieves plausible performance on video generation tasks and produces high-quality sign videos from spoken language sentences.
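As one example of a quantitative measure commonly used when comparing generated frames against reference footage, a PSNR computation might look as follows; the thesis's exact metrics and evaluation protocol are not reproduced here.

```python
# Peak signal-to-noise ratio between generated and reference frames,
# assuming pixel values scaled to [0, 1]. Illustrative only.
import torch

def psnr(fake, real, max_val=1.0):
    mse = torch.mean((fake - real) ** 2)
    return 10 * torch.log10(max_val ** 2 / mse)
```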
Recommended Citation
B, Natarajan, "Development of Deep Neural Architecture for Continuous Sign Language Video Generation" (2024). Theses and Dissertations. 63.
https://knowledgeconnect.sastra.edu/theses/63