sashalex007/Textract-Wave-Invoicing-Server

AUTOMATIC INVOICE GENERATION FROM PDF (AWS Textract/Wave)

I got tired of using Docparser and Zapier to create my invoices automatically. Docparser is overpriced, sometimes fails for no apparent reason, and is limited in functionality. So I dedicated a weekend to building my own automatic invoicing server with Node.js. The application uses AWS Textract to convert PDFs (in my case, purchase orders) to text, then creates invoices automatically through the Wave GraphQL API. The solution is effectively free (Textract charges less than 1 cent per page). Completed invoices are then downloaded automatically to a folder of your choosing.
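The Wave half of that pipeline comes down to one GraphQL POST per invoice. The following is a minimal sketch of what such a request could look like, not the code in server.js: `buildInvoiceRequest` and `createInvoice` are hypothetical helper names, the selected response fields are an illustrative choice, and the endpoint and `invoiceCreate` mutation are taken from Wave's public API documentation, so check them against the current docs before relying on them.

```javascript
// Public Wave GraphQL endpoint (per Wave's developer documentation).
const WAVE_ENDPOINT = 'https://gql.waveapps.com/graphql/public';

// Mutation text. The selected fields are an illustrative choice.
const INVOICE_CREATE = `
  mutation ($input: InvoiceCreateInput!) {
    invoiceCreate(input: $input) {
      didSucceed
      inputErrors { message path }
      invoice { id }
    }
  }`;

// Hypothetical helper: builds the fetch() options for one invoice.
// `input` is expected to carry businessId, customerId and items.
function buildInvoiceRequest(accessToken, input) {
  return {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${accessToken}`,
    },
    body: JSON.stringify({ query: INVOICE_CREATE, variables: { input } }),
  };
}

// Hypothetical helper: sends the request and returns the parsed response.
// Assumes Node 18+ for the global fetch().
async function createInvoice(accessToken, input) {
  const res = await fetch(WAVE_ENDPOINT, buildInvoiceRequest(accessToken, input));
  return res.json();
}
```

Keeping the request builder separate from the network call makes it possible to inspect the payload without touching a real Wave account.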

This application is built on Node.js, so it can run locally or on a remote server.

I would have liked this guide to be sleeker, but for the application to work you first need to pull a bunch of data from the Wave API, so be ready to uncomment and/or delete certain lines of code (I tell you which ones). You will also need to write your own parsing logic, as your PDFs will be different from mine. I will refactor this at some point.

Don't code? No problem. Hire me to build this for you. alexpokho@gmail.com

Guide:

More detailed steps are in server.js.

  1. Set up your AWS CLI and credentials (and set up access to S3 and Textract): https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html#getting-started-install-instructions

  2. Set up a Wave account and create your products and customers.

  3. Create a new Wave app from the developer portal: https://developer.waveapps.com/hc/en-us/sections/360003012132-Create-an-App

  4. Run npm install

  5. Open server.js and enter your Wave access token:

    const waveAccessToken = 'XXXXX' //access token goes here

  6. Run node server.js, then copy your business ID from the console and enter it:

    const waveBusinessID = 'XXXXX' //business id goes here

  7. Delete getBusiness() and uncomment //getData(), then run node server.js again.

  8. Check your customer.js and products.js files; they are generated automatically. Each product and customer has an ID, which is needed later for creating invoices. Delete getData() and uncomment //processInvoice(fileArray[processedFileCount]).

  9. Place a sample purchase order in the purchase order folder, run node server.js, and wait for a successful Textract result. If the S3 upload (and deletion) and the Textract response were successful, you can now build a custom parser for the Textract response.

  10. Uncomment //let invoiceData = parseTextractData(textractResult) and start building your parsing logic. You must provide the data in the correct format for invoice creation. Follow the code in the parseTextractData function to build out your logic. You will need customers.js and products.js to get the right IDs.
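
The parsing logic in step 10 depends entirely on your own PDFs, so the following is only a sketch of one possible shape, not the parser shipped in server.js. It assumes a Textract response with a `Blocks` array of `LINE` blocks, a purchase order where the customer name appears on its own line, and item lines of the form `SKU quantity unitPrice`. The `customers` and `products` maps are hypothetical stand-ins for the generated customers.js and products.js, and the returned object shape is an assumption about what invoice creation needs.

```javascript
// Hypothetical lookup tables standing in for customers.js / products.js.
const customers = new Map([['Acme Corp', 'CUSTOMER_ID_1']]);
const products = new Map([['WIDGET-01', 'PRODUCT_ID_1']]);

// Assumed item line layout: "<SKU> <quantity> <unit price>".
const ITEM_LINE = /^(\S+)\s+(\d+)\s+(\d+(?:\.\d+)?)$/;

// Collect the text of every LINE block in the Textract response.
function extractLines(textractResult) {
  return (textractResult.Blocks || [])
    .filter((block) => block.BlockType === 'LINE')
    .map((block) => block.Text.trim());
}

// Sketch of a parser: maps recognized lines to Wave customer/product IDs.
function parseTextractData(textractResult) {
  const lines = extractLines(textractResult);

  const customerName = lines.find((line) => customers.has(line));
  if (!customerName) {
    throw new Error('No known customer found in purchase order');
  }

  const items = [];
  for (const line of lines) {
    const match = ITEM_LINE.exec(line);
    // Skip lines that are not item lines or reference an unknown SKU.
    if (!match || !products.has(match[1])) continue;
    items.push({
      productId: products.get(match[1]),
      quantity: Number(match[2]),
      unitPrice: match[3], // kept as a string to avoid float rounding
    });
  }

  return { customerId: customers.get(customerName), items };
}
```

Throwing on an unknown customer stops a malformed purchase order before any invoice is created; unknown SKUs are skipped here, but you may prefer to fail on those as well.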
