Table structure understanding and its performance evaluation

Authors:

Abstract

This paper presents a table structure understanding algorithm designed using optimization methods. The algorithm is probability based: its probabilities are estimated from geometric measurements made on the various entities in a large training set. The methodology includes a global parameter optimization scheme, a novel automatic table ground truth generation system, and a table structure understanding performance evaluation protocol. On a document data set containing 518 table entities and 10,934 cell entities, the algorithm achieved an accuracy of 96.76% at the cell level and 98.32% at the table level.
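To make the idea of a cell-level accuracy rate concrete, the following is a minimal sketch of how detected cells might be scored against ground truth. It is not the paper's evaluation protocol: the bounding-box representation, the mutual area-overlap criterion, the `threshold` value of 0.9, and all function names are assumptions introduced purely for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: an axis-aligned box (x0, y0, x1, y1).
Box = Tuple[float, float, float, float]


def area(box: Box) -> float:
    """Area of a box; degenerate or inverted boxes count as zero."""
    x0, y0, x1, y1 = box
    return max(0.0, x1 - x0) * max(0.0, y1 - y0)


def overlap_area(a: Box, b: Box) -> float:
    """Area of the intersection of two boxes (zero if they do not meet)."""
    x0 = max(a[0], b[0])
    y0 = max(a[1], b[1])
    x1 = min(a[2], b[2])
    y1 = min(a[3], b[3])
    return area((x0, y0, x1, y1))


def count_correct(ground_truth: List[Box], detected: List[Box],
                  threshold: float = 0.9) -> int:
    """Count ground-truth entities matched one-to-one by a detection.

    Assumed criterion: the intersection must cover at least `threshold`
    of both the ground-truth box and the detected box.
    """
    used = set()  # indices of detections already matched
    correct = 0
    for g in ground_truth:
        for j, d in enumerate(detected):
            if j in used or area(g) == 0 or area(d) == 0:
                continue
            inter = overlap_area(g, d)
            if inter / area(g) >= threshold and inter / area(d) >= threshold:
                used.add(j)
                correct += 1
                break
    return correct


def accuracy(ground_truth: List[Box], detected: List[Box],
             threshold: float = 0.9) -> float:
    """Fraction of ground-truth entities that were correctly detected."""
    if not ground_truth:
        return 0.0
    return count_correct(ground_truth, detected, threshold) / len(ground_truth)
```

Under these assumptions, the same function could be applied once to cell boxes and once to table boxes to obtain the two levels of accuracy that the abstract reports; splits, merges, and false alarms are deliberately ignored in this sketch.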

Keywords: Pattern recognition, Document image analysis, Document layout analysis, Table structure understanding, Performance evaluation, Non-parametric statistical modeling, Optimization

Review history: Received 9 September 2003, Revised 26 January 2004, Accepted 26 January 2004, Available online 2 April 2004.

Official paper URL: https://doi.org/10.1016/j.patcog.2004.01.012