Line segmentation for grayscale text images of Khmer palm leaf manuscripts

Dona Valy, Michel Verleysen, Kimheng Sok

Research output: Contribution to a book, catalogue, report, or conference proceedings; Article in conference or colloquium proceedings

Abstract

Text line segmentation is one of the most essential pre-processing steps in character recognition and document analysis. In ancient documents, deformations caused by aging introduce noise that makes binarization very challenging. Moreover, because of irregular layouts, such as skewed and fluctuating text lines, segmenting an ancient manuscript page into lines remains an open problem. In this paper, we propose a novel line segmentation scheme for grayscale images of ancient Khmer documents. First, a stroke width transform is applied to extract connected components from the document page. The number and medial positions of the text lines are then estimated using a modified piece-wise projection profile technique, and those positions are adjusted adaptively according to the curvature of the actual text lines. Finally, a path-finding approach is used to separate touching components and to mark the boundaries of the text lines. Experiments are conducted on a dataset of 110 pages of Khmer palm leaf manuscript images, comparing the robustness of the proposed approach with that of existing methods from the literature.
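To make the second step of the abstract more concrete, here is a minimal sketch of a plain piece-wise projection profile. It is not the authors' modified technique, whose details the abstract does not give. The page is cut into vertical strips, the ink in each strip is summed row by row, and peaks of the smoothed profile are taken as estimated medial line positions. The function names, the `ink` input (assumed to be a 2-D array where larger values mean more ink, for example an inverted grayscale image), and the parameters `n_strips`, `window`, and `rel_height` are all illustrative assumptions.

```python
import numpy as np


def smooth(profile, window):
    """Moving-average smoothing of a 1-D profile (centred window)."""
    kernel = np.ones(window) / window
    return np.convolve(profile, kernel, mode="same")


def profile_peaks(profile, min_height):
    """Indices of local maxima whose value is at least min_height."""
    peaks = []
    for y in range(1, len(profile) - 1):
        is_high = profile[y] >= min_height
        is_peak = profile[y] > profile[y - 1] and profile[y] >= profile[y + 1]
        if is_high and is_peak:
            peaks.append(y)
    return peaks


def piecewise_line_positions(ink, n_strips=4, window=5, rel_height=0.5):
    """Estimate text-line centre rows separately in each vertical strip.

    ink        : 2-D array, larger values = more ink (assumed input format)
    n_strips   : number of vertical strips the page is cut into
    window     : smoothing window, in rows
    rel_height : a peak must reach this fraction of the strip's maximum
    Returns one list of row indices per strip, ordered left to right.
    """
    height, width = ink.shape
    bounds = np.linspace(0, width, n_strips + 1).astype(int)
    positions = []
    for left, right in zip(bounds[:-1], bounds[1:]):
        # Horizontal projection profile of this strip only.
        profile = ink[:, left:right].sum(axis=1).astype(float)
        profile = smooth(profile, window)
        if profile.max() <= 0:
            positions.append([])  # empty strip: no lines detected
            continue
        positions.append(profile_peaks(profile, rel_height * profile.max()))
    return positions
```

Because each strip is analysed on its own, a skewed or wavy line shows up as a row position that drifts from one strip to the next, which a single full-width profile would blur. Calling `piecewise_line_positions(ink, n_strips=8)` on an inverted grayscale page would return eight lists of candidate rows; linking them across strips and refining them along the line curvature is left out of this sketch.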

Original language: English
Title of host publication: Proceedings of the 7th International Conference on Image Processing Theory, Tools and Applications, IPTA 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1-6
Number of pages: 6
Volume: 2018-January
ISBN (Electronic): 9781538618417
Publication status: Published - 8 Mar 2018
Externally published: Yes
Event: 7th International Conference on Image Processing Theory, Tools and Applications, IPTA 2017 - Montreal, Canada
Duration: 28 Nov 2017 to 1 Dec 2017

Publication series

Name: 2017 Seventh International Conference on Image Processing Theory, Tools and Applications (IPTA)

Conference

Conference: 7th International Conference on Image Processing Theory, Tools and Applications, IPTA 2017
Country/Territory: Canada
City: Montreal
Period: 28/11/17 to 1/12/17
