OCR Translator (Linux OS)

Keywords: OCR, Tesseract-OCR, Google Translate, Shell Script, Linux

1. Introduction: OCR Translator

Immigrants often struggle to understand letters they receive by mail in a foreign language. OCR Translator aims to lower this language barrier by combining Tesseract-OCR and Google Translate.

2. Workflow

Note: the preferred input source is a flatbed scanner. Camera-based functionality will be added in a future release.

3. Config

Install Tesseract OCR. At the time of writing, tesseract 4.0.0-beta.1 was used as the OCR engine.
Install the dependencies (using a conda virtual environment):

    # navigate to ./anaconda 
    conda env create --file environment.yml
    
    # activate OCR_Translator_env
    source activate OCR_Translator_env
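
Before running the translator, it can help to confirm that the tools mentioned above are on the PATH. The following is only a minimal sketch: the helper name `check_tool` is hypothetical and not part of this repository.

```shell
#!/bin/sh
# check_tool (hypothetical helper): succeed if the named command is on PATH.
check_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report the status of the tools this README relies on.
for tool in tesseract conda; do
    if check_tool "$tool"; then
        echo "$tool: found"
    else
        echo "$tool: missing"
    fi
done
```

If tesseract is reported as found, `tesseract --version` prints the installed engine version, which can be compared against the version noted above.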

Notes:

Currently supported file types: PDF, PNG
One page only (multi-page PDFs won't work)
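
The file type restriction above could be checked before a file is handed to the OCR step. The sketch below assumes that the type is judged by the file extension alone; the helper name `is_supported_input` is hypothetical, and the one-page limit is not checked here.

```shell
#!/bin/sh
# is_supported_input (hypothetical helper): succeed for .pdf or .png files,
# matching the extension case-insensitively.
is_supported_input() {
    # Lower-case the file name so that SCAN.PNG is accepted too.
    name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case "$name" in
        *.pdf|*.png) return 0 ;;
        *)           return 1 ;;
    esac
}

# Example: classify a few candidate file names.
for f in letter.pdf scan.PNG photo.jpg; do
    if is_supported_input "$f"; then
        echo "$f: supported"
    else
        echo "$f: not supported"
    fi
done
```

Checking the extension is a cheap first filter only; it does not verify that the file content really is a PDF or PNG.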
