A Gtk/Qt front-end to tesseract-ocr (C++, updated Dec 14, 2024).
Document layout analysis resources and repositories for development with PdfPig.
Text-to-tibble
Python package for combining .hocr files and images into searchable PDFs
Python parser for hOCR files using lxml
Quick and dirty visualization of hOCR bounding boxes on a page
Sample code that uses tesseract-ocr with .NET Core for optical character recognition. The result is formatted as HTML.