Description

Nowadays more and more sign languages are documented by large-scale
corpora, such as the Corpus LSFB for French Belgian Sign Language. These
corpora include a large amount of recorded videos from signers in
interaction as well as metadata about the tasks, the signers and the video
files. In addition, part of the data is annotated with glosses for the signs,
aligned with the videos, i.e. each lemma is identified in the signing
flow.
Exploiting these data remains fully dependent on prior manual
annotation, from glossing to analytic annotation (at the levels of discourse, syntax, morphology and phonology). This work is time-consuming
and expensive, which sometimes impedes the exploitation of the
data, and thus the development of sign language linguistics in general.
Linguists need new methodologies that allow them to find data of
interest in the corpora more quickly. Indeed, the
recent 7th Workshop on the Representation and Processing of Sign
Languages (held within the 10th edition of the LREC conference) focused on
corpus mining.
The research project presented here aims at exploiting the bilingual
dimension of sign language corpora (among which the Corpus LSFB) with a
threefold objective: (i) to develop a methodology to detect linguistic
specificities of sign languages via their translation into spoken language; (ii)
to provide teachers, learners, interpreters and translators with linguistic
resources that can support their teaching or acquisition of French and of LSFB
and/or strengthen their bilingual skills; and (iii) to enable the development of
automated assistance for the annotation of sign language data.
Status: Not started