PDF Translator is a Python-based tool designed to extract text from PDF documents, translate it into Sinhala using Google Translator, and save the translated content in a well-structured text file format. This tool is ideal for users who need to convert large volumes of PDF content into another language while preserving the structure of tables and pages.
- Text Extraction: Extracts text from PDF files while preserving layout information.
- Translation: Utilizes Google Translator for translating extracted text into Sinhala.
- Table Identification: Detects and formats tables from the PDF content.
- File Management: Saves translated content into text files, maintaining the structure of the original PDF.
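The translation and table-formatting features above can be illustrated with a minimal sketch. This is not the project's actual code: the helper names (`split_into_chunks`, `translate_text`, `format_table`), the 4500-character limit, and the pipe-separated table layout are all assumptions made for illustration. The translator is passed in as a plain function so the sketch runs without network access.

```python
from typing import Callable, List, Sequence

# Assumed per-request size limit; check the real limit of the client you use.
MAX_CHARS = 4500


def split_into_chunks(text: str, max_chars: int = MAX_CHARS) -> List[str]:
    """Split text on line boundaries so every chunk fits within max_chars."""
    chunks: List[str] = []
    current = ""
    for line in text.splitlines(keepends=True):
        # Hard-split a single line that is itself longer than the limit.
        while len(line) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(line[:max_chars])
            line = line[max_chars:]
        # Start a new chunk when adding this line would exceed the limit.
        if len(current) + len(line) > max_chars:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks


def translate_text(text: str, translate: Callable[[str], str],
                   max_chars: int = MAX_CHARS) -> str:
    """Translate text chunk by chunk, keeping trailing newlines intact."""
    parts: List[str] = []
    for chunk in split_into_chunks(text, max_chars):
        body = chunk.rstrip("\n")
        tail = chunk[len(body):]  # newlines a translator might otherwise drop
        parts.append((translate(body) if body.strip() else body) + tail)
    return "".join(parts)


def format_table(rows: Sequence[Sequence[str]]) -> str:
    """Render rows as plain text with padded, pipe-separated columns."""
    if not rows:
        return ""
    n_cols = max(len(row) for row in rows)
    padded = [list(row) + [""] * (n_cols - len(row)) for row in rows]
    # len() counts code points, so alignment is approximate for Sinhala text.
    widths = [max(len(row[i]) for row in padded) for i in range(n_cols)]
    return "\n".join(
        " | ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in padded
    )


# Stand-in translator for demonstration. With the deep-translator package
# (an assumption; this README only says "Google Translator") one could pass
# GoogleTranslator(source="auto", target="si").translate instead.
print(translate_text("first line\nsecond line\n", str.upper, max_chars=12))
print(format_table([["Name", "Qty"], ["Apple", "3"]]))
```

Passing the translator as a parameter keeps the chunking logic testable offline and makes it easy to swap the translation backend.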
git clone https://github.com/sithulaka/pdf-translator.git
cd pdf-translator
It's recommended to use a virtual environment to manage dependencies.
python -m venv venv
source venv/bin/activate # On Windows, use `venv\Scripts\activate`
Install the required Python packages using the requirements.txt file:
pip install -r requirements.txt
- Place your PDF files in the input_pdfs/ folder.
- Run the script to process the PDFs and generate translated text files:
python main.py
- Check the output in the output_texts/ folder. Each PDF will have a corresponding .txt file with the translated content.
To translate a PDF named example.pdf, place it in the input_pdfs/ folder and run:
python main.py
The translated text will be saved as example.txt in the output_texts/ folder.
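The input-to-output mapping described above can be sketched as follows. This is a hypothetical outline, not the contents of main.py: `process_folder` and `output_path_for` are illustrative names, and the extraction and translation steps are passed in as functions because this README does not specify which libraries implement them.

```python
from pathlib import Path
from typing import Callable, List


def output_path_for(pdf_path: Path, output_dir: Path) -> Path:
    """Map input_pdfs/example.pdf to output_texts/example.txt."""
    return output_dir / (pdf_path.stem + ".txt")


def process_folder(input_dir: Path, output_dir: Path,
                   extract: Callable[[Path], str],
                   translate: Callable[[str], str]) -> List[Path]:
    """Translate every .pdf in input_dir and write one .txt file per PDF."""
    output_dir.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    # The "*.pdf" pattern is case-sensitive on most non-Windows systems.
    for pdf_path in sorted(input_dir.glob("*.pdf")):
        translated = translate(extract(pdf_path))
        out_path = output_path_for(pdf_path, output_dir)
        # Write UTF-8 explicitly so Sinhala text is stored correctly everywhere.
        out_path.write_text(translated, encoding="utf-8")
        written.append(out_path)
    return written


if __name__ == "__main__":
    # Placeholder steps for demonstration; a real run would plug in a PDF
    # text extractor and a translation client here.
    results = process_folder(
        Path("input_pdfs"),
        Path("output_texts"),
        extract=lambda path: f"(text of {path.name})",
        translate=lambda text: text,
    )
    for path in results:
        print(f"wrote {path}")
```

Writing with an explicit UTF-8 encoding matters here because the platform default encoding on some systems cannot represent Sinhala characters.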
pdf_translator_project/
├── pdf_translator/
│   ├── __init__.py
│   └── translator.py
├── input_pdfs/
├── output_texts/
├── requirements.txt
├── README.md
└── main.py
- pdf_translator/: Contains the core translation and extraction logic.
- input_pdfs/: Folder for input PDF files.
- output_texts/: Folder where the translated text files will be saved.
- requirements.txt: Lists the dependencies required for the project.
- README.md: Provides an overview and instructions for the project.
- main.py: The entry point script for processing PDFs.
We welcome contributions to improve this project. To contribute:
- Fork the repository and create a new branch.
- Make your changes and test them thoroughly.
- Submit a pull request with a description of the changes.
Please ensure that your contributions adhere to the project's coding standards and include tests where applicable.
This project is licensed under the MIT License. See the LICENSE file for details.