Deep Learning Approaches for OCR-Postprocessing

Affiliation: TU Darmstadt

Associated since: December 2016

CEDIFOR Project Partners

CEDIFOR Partner

Short Description

Even the most recent OCR methods rarely produce perfect results, so the texts generated by OCR need to be post-processed. The main challenge is to develop an algorithm that corrects recognition mistakes without introducing additional errors into the text. This project, carried out in cooperation with ULB Darmstadt, will test deep learning techniques for post-processing documents produced by optical character recognition.

Project Results

Publications:

  • Schnober, C., Eger, S., Do Dinh, E. & Gurevych, I. (2016). Still not there? Comparing Traditional Sequence-to-Sequence Models to Encoder-Decoder Neural Networks on Monotone String Translation Tasks. In Proceedings of the 26th International Conference on Computational Linguistics (COLING), 1703-1714.