DocAttentionRect: Attention-Guided Document Image Rectification


Pooja Kumari (Indian Institute of Technology Madras), Sukhendu Das (Indian Institute of Technology Madras)
The 36th British Machine Vision Conference

Abstract

Document image rectification has advanced substantially in recent years. Nonetheless, current leading algorithms are primarily effective for images with clearly defined document boundaries and some degree of distortion. They often struggle with images in which text occupies only a specific area or the boundaries are incomplete, and produce subpar rectification results. This limitation is particularly problematic when only sections of a document need processing. Existing methods that attempt to address these issues frequently encounter difficulties when intricate distortions are combined with diverse document layouts. To address this gap, we introduce an approach to document image rectification that specifically targets images with partial or missing document boundaries. Attention-based neural networks have recently proven highly effective at improving the accuracy and efficiency of document rectification: attention mechanisms let a network focus on the relevant parts of an image, which improves the rectification outcome. We present 'DocAttentionRect', an attention-based rectification network that combines attention modules with parallel convolution layers to address complex document image rectification challenges. The proposed architecture captures long-range dependencies and key textual and structural features throughout the rectification process, and it is designed to handle documents whether or not their boundaries are visible. Extensive experiments on the DocUNet, DIR300, and UVDoc datasets demonstrate the superior performance and effectiveness of the proposed architecture.
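The abstract describes attention modules used alongside parallel convolution layers but gives no architectural details. The NumPy sketch below is only a generic illustration of that idea, not the authors' network: a global self-attention branch and several convolution branches with different kernel sizes all read the same feature map, and their outputs are added back to the input. Every name here (`self_attention`, `conv2d_same`, `attention_conv_block`, the `params` layout) and the choice to fuse branches by summation are assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(feat, wq, wk, wv):
    """Global self-attention over all H*W positions of an (H, W, C) map."""
    h, w, c = feat.shape
    tokens = feat.reshape(h * w, c)
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    # Each row of `weights` says how much one position attends to every other.
    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return (weights @ v).reshape(h, w, -1), weights


def conv2d_same(feat, kernel):
    """Stride-1, zero-padded cross-correlation.

    feat: (H, W, C_in); kernel: (k, k, C_in, C_out) with odd k.
    """
    k = kernel.shape[0]
    pad = k // 2
    padded = np.pad(feat, ((pad, pad), (pad, pad), (0, 0)))
    h, w, _ = feat.shape
    out = np.zeros((h, w, kernel.shape[-1]))
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w, :] @ kernel[dy, dx]
    return out


def attention_conv_block(feat, params):
    """Hypothetical block: attention branch plus parallel conv branches.

    Fusion by summation with a residual connection is an assumption,
    not something stated in the abstract.
    """
    attn_out, weights = self_attention(
        feat, params["wq"], params["wk"], params["wv"]
    )
    conv_out = sum(conv2d_same(feat, kern) for kern in params["kernels"])
    return feat + attn_out + conv_out, weights


# Toy usage with random weights on a 6x5 feature map with 8 channels.
rng = np.random.default_rng(0)
c = 8
params = {
    "wq": rng.normal(scale=0.1, size=(c, c)),
    "wk": rng.normal(scale=0.1, size=(c, c)),
    "wv": rng.normal(scale=0.1, size=(c, c)),
    # Parallel branches with 1x1, 3x3 and 5x5 receptive fields.
    "kernels": [rng.normal(scale=0.1, size=(k, k, c, c)) for k in (1, 3, 5)],
}
feat = rng.normal(size=(6, 5, c))
out, weights = attention_conv_block(feat, params)
print(out.shape, weights.shape)
```

In this toy setup the attention branch gives every position access to the whole feature map, while the convolution branches contribute local context at several scales. That is one plausible reading of how such a design could capture both long-range dependencies and local textual structure.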

Citation

@inproceedings{Kumari_2025_BMVC,
author    = {Pooja Kumari and Sukhendu Das},
title     = {DocAttentionRect: Attention-Guided Document Image Rectification},
booktitle = {36th British Machine Vision Conference 2025, {BMVC} 2025, Sheffield, UK, November 24-27, 2025},
publisher = {BMVA},
year      = {2025},
url       = {https://bmva-archive.org.uk/bmvc/2025/assets/papers/Paper_39/paper.pdf}
}


Copyright © 2025 The British Machine Vision Association and Society for Pattern Recognition
The British Machine Vision Conference is organised by The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).
