This project extracts financial data from Maybank PDF statements and writes it to JSON or CSV.
- Output the data in either JSON or CSV format.
- Read a folder of statements and extract the data into a single merged file or into individual files.
- The extracted data includes the date, description, transaction amount, and balance, exactly as they appear on the statement.
Example of the JSON output:
[
{
"date": "01/01/2024",
"desc": "Deposit from client",
"trans": 50.0,
"bal": 1050.0
},
{
"date": "02/01/2024",
"desc": "Purchase - Office Supplies",
"trans": -20.0,
"bal": 1030.0
}
]
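The JSON output can be sanity-checked after extraction. The sketch below is not part of the project; it assumes, based only on the sample above, that `trans` is signed (negative for debits) and that `bal` is the balance after each transaction. The function name `check_running_balance` and the `tolerance` parameter are hypothetical.

```python
import json


def check_running_balance(records, tolerance=0.005):
    """Return the indexes of records whose balance does not follow from the previous record.

    Assumes each record has numeric "trans" and "bal" fields, with "bal"
    being the balance after the transaction (an assumption drawn from the sample).
    """
    mismatches = []
    for i in range(1, len(records)):
        expected = records[i - 1]["bal"] + records[i]["trans"]
        # Compare with a small tolerance to absorb floating-point rounding.
        if abs(expected - records[i]["bal"]) > tolerance:
            mismatches.append(i)
    return mismatches


# The same two transactions shown in the JSON example.
sample = json.loads("""
[
  {"date": "01/01/2024", "desc": "Deposit from client", "trans": 50.0, "bal": 1050.0},
  {"date": "02/01/2024", "desc": "Purchase - Office Supplies", "trans": -20.0, "bal": 1030.0}
]
""")

print(check_running_balance(sample))  # prints []
```

An empty list means every balance is consistent with the one before it; any index in the list points at a record worth re-checking against the PDF.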
Example of the CSV output:
date,desc,trans,bal
01/01/2024,Deposit from client,50.00,1050.00
02/01/2024,Purchase - Office Supplies,-20.00,1030.00
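For readers who want to see how the two formats relate, here is a minimal sketch that serializes the same records both ways using only the standard library. It is an illustration, not the project's actual writer: the helper names `to_json` and `to_csv` are hypothetical, and the two-decimal formatting in the CSV is inferred from the sample output above.

```python
import csv
import io
import json

# Column order taken from the CSV example.
FIELDS = ["date", "desc", "trans", "bal"]


def to_json(records):
    """Serialize the records as a JSON array."""
    return json.dumps(records, indent=2)


def to_csv(records):
    """Serialize the records as CSV text with amounts shown to two decimals."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for rec in records:
        row = dict(rec)
        # Format amounts with two decimals, as in the sample CSV (an assumption).
        row["trans"] = f"{rec['trans']:.2f}"
        row["bal"] = f"{rec['bal']:.2f}"
        writer.writerow(row)
    return buffer.getvalue()


records = [
    {"date": "01/01/2024", "desc": "Deposit from client", "trans": 50.0, "bal": 1050.0},
    {"date": "02/01/2024", "desc": "Purchase - Office Supplies", "trans": -20.0, "bal": 1030.0},
]

print(to_csv(records))
```

Using `csv.DictWriter` rather than joining strings by hand means a description that contains a comma is quoted correctly.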
Follow these steps to set up and run the project:
Go to the project folder:
cd <project-folder>
Create a virtual environment:
python3 -m venv venv
Activate the virtual environment:
- On Linux/macOS:
source venv/bin/activate
- On Windows:
venv\Scripts\activate
Install the required packages:
pip install -r requirements.txt
To execute the program, use:
python3 main.py --path=example.pdf --pwd=01Mar2000
For more details on usage:
python3 main.py --help
main.py [OPTIONS]
Option | Description |
---|---|
--path TEXT | Path to the file or folder containing PDF statements. |
--pwd TEXT | Password for PDF statements, assuming the same password for every file. |
--format TEXT | Output file format, either csv or json. |
--print-summary BOOLEAN | Print a summary of the account statement. |
--merge BOOLEAN | Output only a single merged file. |
--help | Show help information and exit. |
To process a PDF file with the password 01Mar2000:
python3 main.py --path=example.pdf --pwd=01Mar2000
To extract data from every PDF in a folder:
python3 main.py --path=statements-folder --pwd=01Mar2000
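Because `--path` accepts either a single file or a folder, the input has to be resolved to a list of PDFs before extraction. The sketch below shows one way this could be done with `pathlib`; it is not taken from the project. The function name `collect_pdfs` is hypothetical, and the non-recursive, case-insensitive `.pdf` match is an assumption.

```python
from pathlib import Path


def collect_pdfs(path):
    """Resolve a file or folder path to a sorted list of PDF paths.

    Assumption: folders are scanned one level deep and files are matched
    by a case-insensitive ".pdf" extension.
    """
    target = Path(path)
    if target.is_dir():
        return sorted(
            p for p in target.iterdir()
            if p.is_file() and p.suffix.lower() == ".pdf"
        )
    if target.is_file():
        # A single file is returned as a one-element list.
        return [target]
    raise FileNotFoundError(f"No such file or folder: {path}")
```

Calling `collect_pdfs("statements-folder")` would return every PDF directly inside that folder, while `collect_pdfs("example.pdf")` would return a one-element list, so the rest of a pipeline can treat both cases the same way.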