Local-global Data Augmentation for Contrastive Learning in Static Sign Language Recognition

Research output: Contribution to a book/catalogue/report/conference proceedings › Conference paper

Abstract

Sign language (SL) is a visual language used by the Deaf community. Static sign language recognition (SLR) consists of classifying static hand configurations, i.e., signs, present in isolated images. Because manual annotation requires expertise, SLR suffers from data scarcity. Recent studies show that contrastive learning addresses this issue effectively by providing efficient unsupervised pre-training. Contrastive learning leverages data augmentation techniques applied to entire images (global-global augmentation). However, fine-tuned contrastive models often rely on irrelevant aspects of those images, such as the background, instead of focusing solely on the regions of interest. Such models are prone to bias that could lead to unreliable predictions. In response, this paper proposes a new local-global data augmentation technique that helps contrastive models focus, during the fine-tuning step, on regions of interest, i.e., the signer’s hands. This approach (i) improves the accuracy of contrastive learning by up to 15% on some SLR datasets, and (ii) helps the fine-tuned contrastive models focus better on the image regions relevant to SLR.
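
The abstract does not say how the local and global views are constructed, so the following is only an illustrative sketch of the general idea: pairing a view of the whole image with a view restricted to the hand region. The function names, the brightness jitter, the nearest-neighbour resize, and the `hand_box` input (assumed to come from some external hand detector) are all assumptions made for illustration, not the authors' method.

```python
import numpy as np

def resize_nearest(img, size):
    """Nearest-neighbour resize of an H x W x C array to size x size."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]

def local_global_pair(image, hand_box, out_size=64, rng=None):
    """Return (global_view, local_view) for one image.

    hand_box = (top, left, bottom, right) is a hypothetical input,
    assumed to be produced by a separate hand detector.
    """
    rng = np.random.default_rng() if rng is None else rng
    img = image.astype(np.float32)

    # Global view: the entire image with a mild brightness jitter (assumed augmentation).
    scale = rng.uniform(0.8, 1.2)
    global_view = resize_nearest(np.clip(img * scale, 0, 255), out_size)

    # Local view: only the region of interest (the hand), resized to the same shape.
    top, left, bottom, right = hand_box
    local_view = resize_nearest(img[top:bottom, left:right], out_size)
    return global_view, local_view

# Usage: a synthetic image with a bright "hand" patch on a dark background.
image = np.zeros((100, 120, 3), dtype=np.uint8)
image[20:60, 30:80] = 200
global_view, local_view = local_global_pair(
    image, (20, 30, 60, 80), out_size=32, rng=np.random.default_rng(0)
)
```

In a contrastive setup, the two views of the same image would be treated as a positive pair, so that the representation of the full image is pulled toward the representation of the hand region rather than the background.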
Original language: English
Title: IDA 2025
Subtitle: Intelligent Data Analysis
Publication status: Accepted/In press - 2 Feb 2025
