This project aims to create a public dataset of the COVID-19 vaccine batches sent to each Brazilian state by the Federal Government between January and May 2021.
Vaccine batch information is made available by the Brazilian government agency SAGE in PDF files, one file for each vaccine distribution phase. The PDF files are transport (freight) invoices, and they contain information about each vaccine batch (including its number and expiration date). The picture below presents an example of such a document.
The data extraction was originally done using Python, ImageMagick and pytesseract (see Jupyter Notebook). You can see the raw result here.
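As a rough illustration of that kind of pipeline, the sketch below rasterizes a PDF with ImageMagick and runs OCR on each page with pytesseract. It is not the code from the notebook: the function names, the 300 DPI setting, the output file pattern and the use of the Portuguese (`por`) Tesseract language data are all assumptions made for the example.

```python
import subprocess
from pathlib import Path


def build_convert_command(pdf_path, out_dir, dpi=300):
    """Build the ImageMagick command that rasterizes each PDF page to a PNG."""
    # Hypothetical naming scheme: <pdf name>-000.png, <pdf name>-001.png, ...
    pattern = Path(out_dir) / (Path(pdf_path).stem + "-%03d.png")
    # "convert" is the ImageMagick 6 entry point; ImageMagick 7 uses "magick".
    return ["convert", "-density", str(dpi), str(pdf_path), str(pattern)]


def ocr_pdf(pdf_path, out_dir, lang="por"):
    """Rasterize a PDF and return the OCR text of each page, in page order."""
    # Imported here so the module still loads where the OCR stack is missing.
    import pytesseract
    from PIL import Image

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_convert_command(pdf_path, out_dir), check=True)
    pages = sorted(Path(out_dir).glob(Path(pdf_path).stem + "-*.png"))
    # Assumes the "por" language data is installed for Tesseract.
    return [pytesseract.image_to_string(Image.open(p), lang=lang) for p in pages]
```

Calling `ocr_pdf("some_phase.pdf", "pages")` (a placeholder file name) would return one string per page. OCR output from scanned invoices is typically noisy, which is why the raw result needs the manual cleaning described in the rest of this README.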
The following partially cleaned datasets are available:
- For the states of Acre-AC, Alagoas-AL and Amazonas-AM only
- For the South region states only: Paraná-PR, Rio Grande do Sul-RS, Santa Catarina-SC
- For the Central-West region states and the Federal District only: Distrito Federal-DF, Goiás-GO, Mato Grosso do Sul-MS, Mato Grosso-MT
Current status (as of Aug 6th, 2021): cleaning the dataset to make it public. Feel free to contact me if you have any thoughts, suggestions or questions.
Original data source: https://sage.saude.gov.br/sistemas/vacina/vacina_fases.php