survey-report-scrapr

📄 ⛏ Break data out of PDF prison

This walkthrough demonstrates how to:

  • Scrape data from PDF tables using tabulizer
  • Manage unwieldy header types and tidy scraped data output using dplyr, tidyr, and stringr
  • Abstract steps into a scraper function
  • Iterate across multiple tables and PDFs with purrr
  • Reshape and bind output into a master tidy dataframe
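
The steps above can be sketched roughly as follows. This is a minimal illustration, not the walkthrough's actual code: `tidy_table()` is a hypothetical helper, `report.pdf` is a placeholder path, and the survey values are made up. To keep the sketch runnable without a PDF or Java, the `tabulizer::extract_tables()` call is commented out and replaced by a stand-in list of character matrices, which is the shape that function returns by default.

```r
library(dplyr)
library(tidyr)
library(stringr)
library(purrr)

# With a real PDF, extract_tables() returns a list of character matrices,
# one per detected table. "report.pdf" is a hypothetical path.
# raw_tables <- tabulizer::extract_tables("report.pdf")

# Stand-in for that output (made-up values) so the sketch runs on its own.
raw_tables <- list(
  matrix(c("Question", "Yes %", "No %",
           "Q1",       "60",    "40",
           "Q2",       "25",    "75"),
         ncol = 3, byrow = TRUE),
  matrix(c("Question", "Yes %", "No %",
           "Q3",       "50",    "50"),
         ncol = 3, byrow = TRUE)
)

# Hypothetical scraper helper: promote the first row to clean column names,
# then reshape the remaining rows into long (tidy) format.
tidy_table <- function(m) {
  header <- m[1, ] %>%
    str_remove("%") %>%
    str_trim() %>%
    str_to_lower()
  body <- m[-1, , drop = FALSE]
  colnames(body) <- header
  as_tibble(body) %>%
    pivot_longer(-question, names_to = "response", values_to = "pct") %>%
    mutate(pct = as.numeric(pct))
}

# Iterate over every table and bind the results into one data frame;
# the "table" column records which list element each row came from.
survey <- map(raw_tables, tidy_table) %>%
  bind_rows(.id = "table")
```

To cover several PDFs, the same pattern could be wrapped in an outer `map()` over a vector of file paths, binding once more at the end.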