ardian/ASPxtraktor
Dependencies that must be installed to run the software

==== Ubuntu =====

    sudo apt-get install libtry-tiny-perl
    sudo apt-get install libwww-mechanize-perl
    sudo apt-get install libyaml-perl
    sudo apt-get install libhtml-treebuilder-xpath-perl
    sudo apt-get install libdbix-class-schema-loader-perl
    sudo apt-get install libcompress-bzip2-perl

Test it like this:

    perl -I lib/ bin/aspxtraktor.pl --term "software"

That only processes the index pages and saves them; it does not download the detail pages.

Detail pages are processed like this:

    perl -I lib/ bin/aspxtraktor.pl --term "softwa" --recurse

Read a file into the database like this:

    perl -I lib/ bin/aspxtraktor.pl --file=output_test/DataExtractor_IPKO_P1_Data4.htm.bz2

If you want to load the business activity types, add the --loadtype argument:

    perl -I lib/ bin/aspxtraktor.pl --loadtype --file=output_test/DataExtractor_IPKO_P1_Data4.htm.bz2
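Before running the scraper, it can help to confirm that the Perl modules behind the Ubuntu packages listed above actually load. The following is a minimal sketch, not part of the repository: the function name missing_modules is hypothetical, and the module names are assumed to be the ones those Debian/Ubuntu packages provide (for example, libtry-tiny-perl is assumed to provide Try::Tiny).

```shell
#!/bin/sh
# Print each Perl module from the argument list that fails to load.
# (Hypothetical helper; it is not part of ASPxtraktor.)
missing_modules() {
    for module in "$@"; do
        # "perl -MModule -e 1" exits non-zero when the module cannot be
        # loaded, or when perl itself is not installed.
        if ! perl -M"$module" -e 1 >/dev/null 2>&1; then
            printf '%s\n' "$module"
        fi
    done
}

# Assumed mapping from the Ubuntu packages in the README to module names.
missing=$(missing_modules \
    Try::Tiny \
    WWW::Mechanize \
    YAML \
    HTML::TreeBuilder::XPath \
    DBIx::Class::Schema::Loader \
    Compress::Bzip2)

if [ -n "$missing" ]; then
    echo "Missing Perl modules:"
    echo "$missing"
else
    echo "All required Perl modules are installed."
fi
```

If any module is reported as missing, installing the corresponding package with apt-get as shown above should resolve it.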
About
Scraper in Perl