DESCRIPTION:
This project processes scanned images of handwritten tables to automatically detect the tabular structure and recognize its content. It uses a pretrained OCR model to read the handwritten entries and fills an Excel sheet that mirrors the original layout, significantly reducing the manual data entry effort involved in handling handwritten documents.
(Note: this is a preliminary version; the accuracy can be improved further.)
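For orientation, below is a minimal sketch of the overall flow, combining the two components credited at the end of this README (img2table for table detection, TrOCR for handwritten text recognition). The checkpoint name, file names, and the cell-cropping details are illustrative assumptions, not the exact code of the notebooks.

```python
from img2table.document import Image as TableImage
from PIL import Image
from transformers import TrOCRProcessor, VisionEncoderDecoderModel
import pandas as pd

processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-handwritten")
model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-handwritten")

def ocr_cell(page_image, bbox):
    """Crop one table cell from the page image and read it with TrOCR."""
    crop = page_image.crop((bbox.x1, bbox.y1, bbox.x2, bbox.y2))
    pixel_values = processor(images=crop, return_tensors="pt").pixel_values
    generated_ids = model.generate(pixel_values)
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]

# "scanned_table.jpg" is a hypothetical input file
page = Image.open("scanned_table.jpg").convert("RGB")
doc = TableImage(src="scanned_table.jpg")
tables = doc.extract_tables()  # table structure only; no built-in OCR

rows = []
for table in tables:
    for _, cells in table.content.items():  # row index -> list of cells
        rows.append([ocr_cell(page, cell.bbox) for cell in cells])

# Writing to Excel requires openpyxl
pd.DataFrame(rows).to_excel("output.xlsx", index=False, header=False)
```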
ACCURACY, LIMITATIONS AND FUTURE SCOPE:
- The current accuracy of the program is roughly 75-80%. It can be improved further by replacing the pretrained model with one fine-tuned on the IAM handwriting dataset (see the fine-tuning sketch after this list).
- The current implementation runs efficiently on GPU runtimes in Google Colab but takes considerably longer in a local Jupyter notebook. As this is a preliminary version, a future improvement is to optimize and test the code on a PC with a capable GPU (see the device-handling sketch after this list).
- One remaining anomaly occurs when certain images are not cropped correctly: if a cell of the table cannot be identified, a NULL value is returned and the program stops with an error. (UPDATE: this error has been temporarily handled in TROCR_v2.ipynb by replacing the unrecognized text with a blank " " so that the program completes instead of stopping abruptly; see the fallback sketch after this list.)
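A possible shape for the IAM fine-tuning step, using the Hugging Face transformers API, is sketched below. The `iam_dataset` object (assumed to yield PIL image / transcription pairs), the hyperparameters, and the output directory name are illustrative only.

```python
import torch
from torch.utils.data import DataLoader
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-handwritten")
model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-handwritten")
model.config.decoder_start_token_id = processor.tokenizer.cls_token_id
model.config.pad_token_id = processor.tokenizer.pad_token_id

def collate(batch):
    # `iam_dataset` is assumed to yield (PIL image, transcription string) pairs
    images, texts = zip(*batch)
    pixel_values = processor(images=list(images), return_tensors="pt").pixel_values
    labels = processor.tokenizer(list(texts), padding=True, return_tensors="pt").input_ids
    labels[labels == processor.tokenizer.pad_token_id] = -100  # ignore padding in the loss
    return pixel_values, labels

loader = DataLoader(iam_dataset, batch_size=8, shuffle=True, collate_fn=collate)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device).train()

for epoch in range(3):
    for pixel_values, labels in loader:
        loss = model(pixel_values=pixel_values.to(device), labels=labels.to(device)).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

model.save_pretrained("trocr-finetuned-iam")
processor.save_pretrained("trocr-finetuned-iam")
```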
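On the GPU point, moving the model and its inputs onto CUDA when it is available is the main change needed to run the same notebook efficiently on a GPU-equipped PC. A minimal device-handling sketch (the checkpoint and helper names are illustrative):

```python
import torch
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-handwritten")
model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-handwritten").to(device)
model.eval()

def recognize(cell_image):
    """Run TrOCR on a single cropped cell image, on GPU if available."""
    pixel_values = processor(images=cell_image, return_tensors="pt").pixel_values.to(device)
    with torch.no_grad():
        generated_ids = model.generate(pixel_values)
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

Batching several cell crops into a single `processor(...)` call would further reduce the per-cell overhead on both CPU and GPU.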
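The temporary fix for unreadable cells amounts to a guard around the per-cell OCR call. A minimal fallback sketch (the helper names and the PIL-style crop are assumptions):

```python
def safe_cell_value(page_image, cell_bbox, recognize_fn):
    """Return OCR text for one table cell, or a blank " " if the cell cannot be read.

    `recognize_fn` is whatever function runs TrOCR on a cropped cell image.
    """
    try:
        crop = page_image.crop((cell_bbox.x1, cell_bbox.y1, cell_bbox.x2, cell_bbox.y2))
        text = recognize_fn(crop)
        # Fall back to a blank string instead of letting a None/empty result crash the run
        return text if text and text.strip() else " "
    except Exception:
        return " "
```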
UPDATES:
The final version ready for use is the TROCR_v4.ipynb file.
- This program is now capable of handling PDF files as well as image (.jpg, .png, .jpeg) files. It automatically detects the file type and runs the corresponding procedure (see the sketch after this list).
- The images extracted from the PDF file are also auto-cropped to improve clarity and accuracy.
- The images obtained are now also preprocessed to increase the accuracy even further.
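A rough sketch of what that file handling and preprocessing could look like, assuming pdf2image (which requires poppler) for PDF rendering and OpenCV for cleanup; the function names, DPI, and threshold parameters are illustrative, not the notebook's exact choices:

```python
import os
import cv2
import numpy as np
from pdf2image import convert_from_path

def load_pages(path):
    """Return a list of BGR page images for either a PDF or a single image file."""
    ext = os.path.splitext(path)[1].lower()
    if ext == ".pdf":
        # pdf2image renders each PDF page as a PIL image
        pages = convert_from_path(path, dpi=300)
        return [cv2.cvtColor(np.array(p), cv2.COLOR_RGB2BGR) for p in pages]
    if ext in (".jpg", ".jpeg", ".png"):
        return [cv2.imread(path)]
    raise ValueError(f"Unsupported file type: {ext}")

def preprocess(image):
    """Basic cleanup before table detection/OCR: grayscale, denoise, adaptive threshold."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    gray = cv2.fastNlMeansDenoising(gray, h=10)
    binary = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                   cv2.THRESH_BINARY, 31, 15)
    return binary
```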
CREDIT:
- Pretrained model used: TROCR
- Table detection model used: IMG2TABLE