A unified scheme of text localization and structured data extraction for joint OCR and data mining (big data and artificial intelligence area)
Both text detection and structured data extraction are essential in an optical character recognition (OCR) processing pipeline. Text detection, especially for indistinct, diverse, multi-language text regions, is one of the most challenging tasks in computer vision and has attracted increasing attention recently. Moreover, although there are some studies in data mining related to structured data extraction, it has not received the attention it deserves as one of the important steps in OCR. Previous methods for structured data extraction, including layout template-based, rule-based, and natural language processing (NLP)-based methods, usually lead to either inaccurate results or complex modules. In this paper, we integrate text detection and structured data extraction into a unified deep learning-based Image Text Extraction (ITE) scheme. Our ITE is an end-to-end trainable model and is able to handle multi-scale and multi-lingual text in a single process. Experiments on large-scale real-world passport and medical receipt datasets demonstrate the superiority of the proposed method in terms of both effectiveness and efficiency.
Authors: Yibin Ye; Shenggao Zhu; Jing Wang; Qi Du; Yezhang Yang; Dandan Tu; Lanjun Wang; Jiebo Luo (Cloud BU, Huawei Technologies)