Convert UK Tier 2 & Tier 5 Work Sponsor list in PDF to XML structured file

somdipdey/Convert_UK_Tier2_Tier5_SponsorPDF_To_XML

Convert_UK_Tier2_Tier5_SponsorPDF_To_XML

Convert UK Tier 2 & Tier 5 Work Sponsor list in PDF to XML structured file

Download and run the file: Create UK-Tier2-Tier5-SponsorList-PDF-To-XML.py. It downloads, formats, and converts the sponsor list from PDF to XML.
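The internals of the script are not reproduced in this README, so the following is only a minimal sketch of how the format and convert steps could be chained from Python. It assumes that the "format" step is PDFtk's `uncompress` operation and that the conversion uses pdftohtml's `-xml` option; the function names `build_commands` and `convert` and the `_uncompressed` file suffix are hypothetical and chosen for illustration. The download step is left out because the source URL of the sponsor list is not stated here.

```python
import subprocess
from pathlib import Path


def build_commands(pdf_path, work_dir="."):
    """Build the two external commands for an already downloaded PDF.

    Assumption: formatting means `pdftk ... uncompress`, and conversion
    means `pdftohtml -xml`. The real script may use different options.
    """
    pdf_path = Path(pdf_path)
    work_dir = Path(work_dir)
    # Hypothetical intermediate file name for the formatted PDF.
    uncompressed = work_dir / (pdf_path.stem + "_uncompressed.pdf")
    # pdftohtml takes an output base name; with -xml it writes an XML file.
    xml_base = work_dir / pdf_path.stem
    return [
        ["pdftk", str(pdf_path), "output", str(uncompressed), "uncompress"],
        ["pdftohtml", "-xml", str(uncompressed), str(xml_base)],
    ]


def convert(pdf_path, work_dir="."):
    """Run the commands in order and stop at the first failure."""
    for command in build_commands(pdf_path, work_dir):
        subprocess.run(command, check=True)
```

Calling `convert("sponsors.pdf")` would run both tools in sequence, provided they are installed as described under Dependencies Required. Keeping command construction separate from execution makes it possible to inspect the commands without running the external tools.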

There are a few important dependencies that need to be installed on the system, or else the program won't execute properly.

Dependencies Required

PDFTOHTML

You need to install pdftohtml on the system.

It can be installed with the following command:

>> brew install pdftohtml 

This adds pdftohtml to your path.

Website of PDFTOHTML: http://pdftohtml.sourceforge.net

PDFtk

You also need to install PDFtk Server to format the PDF file properly so that it can be converted to structured XML format.

It can be installed from the following web link:

https://www.pdflabs.com/tools/pdftk-server/

Simply choose the version for your operating system and install it. The path for 'pdftk' will be added automatically.

Website of PDFtk: https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/
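
Since the program won't execute properly without both tools, it can help to confirm that they are on your path before running it. The following is a minimal sketch using Python's standard `shutil.which`; the helper name `find_missing_tools` is hypothetical and is not part of the original script.

```python
import shutil

# Executable names as given in the sections above.
REQUIRED_TOOLS = ("pdftohtml", "pdftk")


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names of the tools that cannot be found on the path."""
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing dependencies: " + ", ".join(missing))
    else:
        print("All required tools were found on the path.")
```

The `which` parameter exists only so the lookup can be swapped out when testing; in normal use the default is enough.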
