carlosug/COVID-19-Analytics

Coronavirus disease (COVID-19) data analysis Worldwide

(last update: 3 April 2020)

Ongoing data science pipeline to process, analyse and visualise COVID-19 pandemic data. The intended goal is to illustrate, through data cleaning, processing and visualisation pipelines, the most up-to-date packages and libraries for doing Data Science with R.

corona_artwork.jpg

The current outbreak of coronavirus disease (COVID-19) was first reported from Wuhan, China, on 31 December 2019.

Since mid-February, Johns Hopkins CSSE has reported the number of people diagnosed with the coronavirus and their place of residence on a daily basis. The data contains the total number of positively tested (confirmed) cases, deaths and recovered patients. The raw dataset can be found on the GitHub repository. This dataset is updated daily.
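To make the shape of the data concrete, here is a minimal sketch of turning running totals into per-day counts. It assumes the reported totals are cumulative, and it is written in Python with the standard library only for illustration; it is not the R code this pipeline uses. The function name `daily_new` and the toy numbers are hypothetical.

```python
from typing import List


def daily_new(cumulative: List[int]) -> List[int]:
    """Convert cumulative totals into per-day increments."""
    if not cumulative:
        return []
    # The first day has no predecessor, so its increment is its own total.
    increments = [cumulative[0]]
    for previous, current in zip(cumulative, cumulative[1:]):
        increments.append(current - previous)
    return increments


# Made-up cumulative confirmed counts for four consecutive days (not real data).
toy_confirmed = [10, 15, 27, 40]
print(daily_new(toy_confirmed))  # [10, 5, 12, 13]
```

A negative increment in real data would usually point to a retrospective correction in the source, so it is worth checking for before plotting.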

This is a developing story ❗ Daily updates ❗

Datasets:

Tips for improvements

Suggestions for this project are appreciated. I am looking for new features for the data pipelines.

See data science pipeline for technical details regarding data collection and cleaning.

📈 Graphs globally

The following graphs show how the effects of the coronavirus develop day by day. The outputs are generated automatically and updated daily.

outputs/output_8_1.png

outputs/output_10_0.png

outputs/output_10_1.png

outputs/output_11_2.png

outputs/output_13_2.png

outputs/output_13_1.png

More visualisation outputs

TOP 20 Countries

outputs/output_14_0.png
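A ranking like this can be sketched as a sort over per-country totals. The example below assumes countries are ranked by confirmed cases, which the text does not state explicitly. It is an illustrative Python sketch, not the project's R code, and `top_countries` and the toy values are hypothetical.

```python
from typing import Dict, List, Tuple


def top_countries(confirmed_by_country: Dict[str, int], n: int = 20) -> List[Tuple[str, int]]:
    """Return the n countries with the highest confirmed totals, largest first."""
    ranked = sorted(confirmed_by_country.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


# Made-up totals for three placeholder countries (not real data).
toy_totals = {"A": 120, "B": 450, "C": 80}
print(top_countries(toy_totals, n=2))  # [('B', 450), ('A', 120)]
```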

outputs/output_14_1.png

outputs/output_15_2.png

Death rate!

outputs/output_15_1.png
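The README does not define the death rate, so the sketch below assumes the common convention of deaths as a percentage of confirmed cases. It is written in Python for illustration only (the project itself targets R), and `death_rate` is a hypothetical helper name.

```python
def death_rate(deaths: int, confirmed: int) -> float:
    """Deaths as a percentage of confirmed cases; 0.0 when there are no cases."""
    if confirmed <= 0:
        # Avoid dividing by zero for countries with no confirmed cases yet.
        return 0.0
    return 100.0 * deaths / confirmed


# Made-up counts (not real data): 5 deaths among 200 confirmed cases.
print(death_rate(5, 200))  # 2.5
```

Under this definition the rate depends on how widely a country tests, so values are not directly comparable across countries.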

outputs/output_16_1.png

Ranking Tables [Deprecated]

outputs/table1.png

Case by country

outputs/table2.png

Top 15

Reference and reproducible output

The pipeline is inspired by Yanchang Zhao. Citation:

COVID-19 Data Analysis with Tidyverse and Ggplot2 – Worldwide. RDataMining.com, 2020.

Output for Jupyter COVID-Descriptives

PDF Format COVID-Descriptives

Contact

Please get in touch with me at c.utrilla.guerrero@gmail.com