DCTable: A Dilated CNN with Optimizing Anchors for Accurate Table Detection

Takwa Kazdar; Wided Souidene Mseddi; Moulay A Akhloufi; Ala Agrebi; Marwa Jmal; Rabah Attia

doi:10.3390/jimaging9030062

J Imaging. 2023 Mar 7;9(3):62. doi: 10.3390/jimaging9030062.

Authors

Takwa Kazdar¹, Wided Souidene Mseddi¹, Moulay A Akhloufi², Ala Agrebi¹, Marwa Jmal¹, Rabah Attia¹

Affiliations

¹ Sercom Laboratory, Ecole Polytechnique de Tunisie, Université de Carthage, La Marsa 2078, Tunisia.
² Perception, Robotics, and Intelligent Machines (PRIME), Department of Computer Science, Université de Moncton, Moncton, NB E1A 3E9, Canada.

Abstract

With the widespread adoption of deep learning, it has become the mainstream approach to table detection. Some tables remain difficult to detect because their layout resembles that of a figure or because they are small. To address this problem, we propose a novel method, called DCTable, that improves Faster R-CNN for table detection. DCTable extracts more discriminative features using a backbone with dilated convolutions in order to improve the quality of region proposals. Another main contribution of this paper is anchor optimization using the Intersection over Union (IoU)-balanced loss to train the region proposal network (RPN) and reduce the false positive rate. This is followed by an RoI Align layer, used instead of RoI pooling, which improves accuracy when mapping table proposal candidates by eliminating the coarse misalignment and introducing bilinear interpolation. Training and testing on public datasets showed the effectiveness of the algorithm and a considerable improvement of the F1-score on the ICDAR 2017 POD, ICDAR 2019, Marmot and RVL-CDIP datasets.
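As an illustration only, the three building blocks named above can be sketched in plain Python: the spatial extent covered by a dilated kernel, the IoU between two boxes, and bilinear sampling of a feature map at a non-integer location (the operation RoI Align relies on instead of rounding coordinates). This is a minimal sketch and not the authors' implementation; the function names, the (x1, y1, x2, y2) box format and the list-of-lists feature map are assumptions made for the example, and the IoU-balanced loss itself is not reproduced here.

```python
import math


def dilated_kernel_extent(kernel_size, dilation):
    """Spatial extent covered by a kernel of size k with dilation d: k + (k - 1)(d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def bilinear_sample(feature, x, y):
    """Sample a 2D grid feature[row][col] at a real-valued point (x, y)."""
    height, width = len(feature), len(feature[0])
    # Clamp the sampling point to the grid so the four neighbours always exist.
    x = min(max(x, 0.0), width - 1)
    y = min(max(y, 0.0), height - 1)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    dx, dy = x - x0, y - y0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy
```

For example, `bilinear_sample([[0.0, 1.0], [2.0, 3.0]], 0.5, 0.5)` averages the four neighbouring cells, whereas a pooling scheme that rounds the coordinates first would pick a single cell and discard the sub-cell offset.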

Keywords: Faster R-CNN; anchors; bilinear interpolation; dilated convolutions; table detection.

Grants and funding

RGPIN-2018-06233/Natural Sciences and Engineering Research Council of Canada (NSERC)