Deep Learning Approaches for OCR-Postprocessing
Affiliation: TU Darmstadt
Associated since: December 2016
CEDIFOR Project Partners
- Erik-Lân Do Dinh, UKP Lab, TU Darmstadt
- Dr. Steffen Eger, UKP Lab, TU Darmstadt
- Prof. Dr. Iryna Gurevych, UKP Lab, TU Darmstadt
CEDIFOR Partner
- Dr. Wolfgang Stille, ULB Darmstadt
Short Description
Even the most recent OCR methods rarely produce perfect results, so the texts generated by OCR need to be post-processed. The biggest challenge lies in developing an intelligent algorithm that corrects mistakes without introducing additional errors into the text. In this project, carried out in cooperation with ULB Darmstadt, deep learning techniques for post-processing OCR output will be tested.
Project Results
Publications:
- Schnober, C., Eger, S., Do Dinh, E. & Gurevych, I. (2016). Still not there? Comparing Traditional Sequence-to-Sequence Models to Encoder-Decoder Neural Networks on Monotone String Translation Tasks. In Proceedings of the 26th International Conference on Computational Linguistics (COLING), 1703-1714.