ICDAR 2019 Robust Reading Challenge on Scanned Receipts OCR and Information Extraction Results
SROIE Introduction:
Scanned receipt OCR is the process of recognizing text in scanned structured and semi-structured receipts, and in invoices more generally. Extracting key text fields from receipts and invoices and saving them to structured documents serves many applications and services, such as efficient archiving, fast indexing and document analytics. Scanned receipts OCR and information extraction (SROIE) therefore plays a critical role in streamlining document-intensive processes and office automation in many financial, accounting and taxation areas. However, SROIE also faces significant challenges. Recent breakthroughs in deep learning have greatly improved the accuracy and processing speed of OCR, which is becoming mature for many practical tasks (such as name card recognition, license plate recognition and handwritten text recognition). For many commercial applications, though, receipt OCR has much higher accuracy requirements than general OCR tasks, and SROIE becomes more challenging still when the scanned receipts are of low quality. As a result, existing SROIE systems still rely heavily on manual work. There is an urgent need to research and develop fast, efficient and robust SROIE systems that reduce or even eliminate this manual work.
Task 1 – Scanned Receipt Text Localisation
Hmean Ranking Table
| Rank | Team Name | Team Members | Institute | Recall | Precision | Hmean |
|---|---|---|---|---|---|---|
| 1 | SCUT-DLVC-Lab-Refinement | Jiapeng Wang*, Yan Li*, Tianwei Wang, Jiaxin Zhang, Yichao Huang, Canjie Luo, Kai Ding, Lianwen Jin (*equal contribution) | South China University of Technology, INTSIG Information Co. Ltd | 98.64% | 98.53% | 98.59% |
| 2 | Ping An Property & Casualty Insurance Company | Xianbiao Qi, Ning Lu, Yuan Gao, Yihao Chen, Shaoqiong Chen, Wenwen Yu, Rong Xiao | Ping An Property & Casualty Insurance Company | 98.60% | 98.40% | 98.50% |
| 3 | H&H Lab | HUST_VLRGROUP(Mengde Xu, Zhen Zhu, Hui Zhang, Mingkun Yang, Jiehua Yang) & HUAWEI_CLOUD_EI(Jing Wang, Yibin Ye, Shenggao Zhu, Dandan Tu) | Huazhong University of Science and Technology & Huawei Technologies Co. Ltd Joint Laboratory | 97.93% | 97.95% | 97.94% |
| 4 | Psenet_dcn | Xiufeng Jiang | – | 96.62% | 96.21% | 96.42% |
| 5 | BOE_IOT_AIBD v5 | BOE_IOT_AIBD | – | 95.95% | 95.99% | 95.97% |
| 6 | EM_ocr | Hao Wu, Na He, Zhou Shen, Dan Meng, Qingfeng Wang | – | 95.85% | 96.08% | 95.97% |
| 7 | Clova OCR | Seung Shin, Sungrae Park, Seonghyeon Kim, Jaeheung Surh, Junyeop Lee, Hwalsuk Lee | Clova AI Research, NAVER Corp | 96.04% | 95.79% | 95.92% |
| 8 | IFLYTEK-textDet_v3 | IFLYTEK | IFLYTEK | 93.77% | 95.89% | 94.81% |
| 9 | A Single-Shot Model for Robust Text Localization | Hanqin Wang, Jie Qin, Fan Zhu, Li Liu, and Ling Shao | Inception Institute of Artificial Intelligence | 93.93% | 94.80% | 94.37% |
| 10 | SituTech_OCR | Kui Lyu, Tianhao Tang, Minghao Wang | SituTech | 93.81% | 94.18% | 94.00% |
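The Hmean column in these ranking tables is the harmonic mean of the Recall and Precision columns. As a minimal sketch of that relationship (the function name `hmean` is an illustrative choice, not part of the challenge's evaluation code), the value can be recomputed from a row as follows. Because the published recall and precision are already rounded to two decimals, a recomputed Hmean can differ from the published one in the last digit.

```python
def hmean(recall: float, precision: float) -> float:
    """Harmonic mean of recall and precision, both given in percent."""
    if recall + precision == 0:
        # Avoid division by zero when both scores are zero.
        return 0.0
    return 2 * recall * precision / (recall + precision)


# Rank 1 of Task 1: recall 98.64%, precision 98.53%.
top_task1 = hmean(98.64, 98.53)
print(f"Recomputed Hmean: {top_task1:.2f}%")
```

When recall and precision are equal, as in most Task 3 rows, the harmonic mean equals that shared value.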
Task 2 – Scanned Receipt OCR
Hmean Ranking Table
| Rank | Team Name | Team Members | Institute | Recall | Precision | Hmean |
|---|---|---|---|---|---|---|
| 1 | H&H Lab | HUST_VLRGROUP(Hui Zhang, Mingkun Yang, Mengde Xu, Zhen Zhu, Jiehua Yang) & HUAWEI_CLOUD_EI(Jing Wang, Yibin Ye, Shenggao Zhu, Dandan Tu) | Huazhong University of Science and Technology & Huawei Technologies Co. Ltd Joint Laboratory | 96.35% | 96.52% | 96.43% |
| 2 | HeReceipt-Ensemble | Yichao Huang*, Tianwei Wang*, Jiaxin Zhang*, Yan Li, Jiapeng Wang, Canjie Luo, Kai Ding, Lianwen Jin (*equal contribution) | INTSIG Information Co. Ltd, South China University of Technology | 94.56% | 95.10% | 94.82% |
| 3 | Ping An Property & Casualty Insurance Company | Xianbiao Qi, Yihao Chen, Shaoqiong Chen, Ning Lu, Yuan Gao, Wenwen Yu, Rong Xiao | Ping An Property & Casualty Insurance Company | 94.48% | 94.86% | 94.67% |
| 4 | CLOVA OCR | Sungrae Park, Seung Shin, Seonghyeon Kim, Jaeheung Surh, Junyeop Lee, Hwalsuk Lee | Clova AI Research, NAVER Corp | 94.30% | 94.88% | 94.59% |
| 5 | SCUT-DLVC-Lab-Lexicon | Tianwei Wang*, Jiaxin Zhang*, Yichao Huang*, Jiapeng Wang, Yan Li, Canjie Luo, Kai Ding, Lianwen Jin (*equal contribution) | South China University of Technology, INTSIG Information Co. Ltd | 94.18% | 94.88% | 94.53% |
| 6 | DenseNet-Attention Recognition | PINGAN Tech | PINGAN Tech | 94.29% | 94.58% | 94.44% |
| 7 | CITlab Argus Text Recognition | Tobias Grüning, Gundram Leifert, Jochen Zöllner, Tobias Strauß, Roger Labahn | CITlab | 93.55% | 93.61% | 93.58% |
| 8 | Unet followed by CRNN with CTC | Roberto Lotufo, Ramon Pires | – | 88.58% | 87.30% | 87.93% |
| 9 | BOE_IOT_AIBD T2 V5 | BOE_IOT_AIBD | – | 87.84% | 86.66% | 87.24% |
| 10 | CRNN after UNet Segmentation | Roberto Lotufo, Ramon Pires, Israel Campiotti, Rubens Machado, Luis Serrano, Giovanni Garuffi | – | 85.77% | 86.48% | 86.12% |
Task 3 – Key Information Extraction from Scanned Receipts
Hmean Ranking Table
| Rank | Team Name | Team Members | Institute | Recall | Precision | Hmean |
|---|---|---|---|---|---|---|
| 1 | Ping An Property & Casualty Insurance Company | Xianbiao Qi, Wenwen Yu, Ning Lu, Yihao Chen, Shaoqiong Chen, Yuan Gao, Rong Xiao | Ping An Property & Casualty Insurance Company | 90.49% | 90.49% | 90.49% |
| 2 | EAST det + Multi-class classification | liyulin, v_huangju, xiequnyi, qinxiameng | Baidu | 89.70% | 89.70% | 89.70% |
| 3 | H&H Lab | HUST_VLRGROUP(Hui Zhang, Mengde Xu, Mingkun Yang, Zhen Zhu, Jiehua Yang) & HUAWEI_CLOUD_EI(Jing Wang, Yibin Ye, Shenggao Zhu, Dandan Tu) | Huazhong University of Science and Technology & Huawei Technologies Co. Ltd Joint Laboratory | 89.63% | 89.63% | 89.63% |
| 4 | CLOVA OCR | Sungrae Park, Seonghyeon Kim, Seung Shin, Jaeheung Surh, Junyeop Lee, Hwalsuk Lee | Clova AI Research, NAVER Corp | 89.05% | 89.05% | 89.05% |
| 5 | NiuBiHongHong | Ge daye | – | 87.61% | 87.61% | 87.61% |
| 6 | HeReceipt-withoutRM | Hanmin Duan, Zhiqin Lu, Yang Chang, Yan Li, Yichao Huang, Kai Ding | INTSIG Information Co. Ltd, South China University of Technology | 83.00% | 83.24% | 83.12% |
| 7 | BOE_IOT_AIBD_v3 | BOE_IOT_AIBD | – | 82.71% | 82.71% | 82.71% |
| 8 | PATECH_CHENGDU_OCR | JunKun Zhou, Ming Guan, ZhengNan Luo, MingTao Wang, YuBin Xiao, MingBin Hou | – | 81.70% | 82.29% | 82.00% |
| 9 | NER with spaCy model | Roberto Lotufo, Giovani Garuffi, Israel Campiotti, Ramon Pires, Rubens Machado, Luis Serrano | – | 78.96% | 79.02% | 78.99% |
| 10 | CITlab Argus Information Extraction (positional & line features, enhanced gt) | Tobias Strauß, Tobias Grüning, Gundram Leifert, Jochen Zöllner, Roger Labahn | CITlab | 77.38% | 77.38% | 77.38% |
Result Link:
http://www.onlyou.com/sroie/index.html