yashagg2001/COVIDAnalysis

How to run this analysis on your local machine

1. Clone this project: git clone https://github.com/yashagg2001/COVIDAnalysis.git
2. Go to the project directory: cd COVIDAnalysis
3. Create a virtual environment: python -m venv covanalysis (for Windows)
4. Activate the virtual environment: covanalysis\Scripts\activate.bat (for Windows)
5. Install the required packages: pip install -r requirements.txt
6. Run Analysis 1: python Analysis1.py
7. Run Analysis 2: python Analysis2.py
To deactivate the virtual environment afterwards: deactivate or covanalysis\Scripts\deactivate
To delete the virtual environment, simply delete the covanalysis folder: rmdir covanalysis /s

COVIDAnalysis

In this project I analysed Coronavirus disease 2019 (COVID-19) data from various sources, including data from India and around the world, using the Python libraries pandas, Matplotlib and NumPy.

Analysis 1 (in analysis1.py): The data file Covid19IndiaData_30032020.xlsx contains Indian patient-level data up to 30th March 2020. Source for the latest Indian COVID-19 data: https://api.rootnet.in/.
(i) I calculated and plotted the probability mass function (pmf) of the age of infected patients, which includes Hospitalized, Recovered and Dead patients. I then evaluated the expected age of an infected patient and the variance from this pmf, and drew a conclusion.
(ii) I calculated and plotted the pmfs of the age of Recovered and Dead patients, then calculated the expectation and variance of each pmf. I then drew a conclusion about COVID-19 by comparing the expectation values.
(iii) I calculated the pmf of the age of all infected patients conditional on the gender of the patient. I then compared the expectations and commented on possible reasons for any difference.
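
The calculations in (i) to (iii) can be sketched as follows. This is only an illustrative sketch, not the code in analysis1.py: the column names age and gender and the toy records are assumptions, since the real column names in the Excel file are not listed here.

```python
import pandas as pd


def pmf(values):
    """Empirical pmf: relative frequency of each distinct value, sorted by value."""
    s = pd.Series(values).dropna()
    return s.value_counts(normalize=True).sort_index()


def mean_and_variance(p):
    """Expectation and variance of a pmf stored as a Series (index = value, data = probability)."""
    x = p.index.to_numpy(dtype=float)
    w = p.to_numpy()
    mean = float((x * w).sum())
    var = float(((x - mean) ** 2 * w).sum())
    return mean, var


# Hypothetical toy records; the real data comes from Covid19IndiaData_30032020.xlsx.
patients = pd.DataFrame({
    "age": [20, 30, 30, 40, 60, 60, 60, 80],
    "gender": ["F", "M", "M", "F", "M", "F", "M", "F"],
})

# (i) pmf of age over all infected patients, with its expectation and variance.
age_pmf = pmf(patients["age"])
mean_age, var_age = mean_and_variance(age_pmf)

# (iii) pmf of age conditional on gender: one pmf per gender group.
conditional_pmfs = {g: pmf(group["age"]) for g, group in patients.groupby("gender")}
conditional_means = {g: mean_and_variance(p)[0] for g, p in conditional_pmfs.items()}
```

For part (ii), the same pmf helper could be applied after filtering rows by patient status. A pmf can be drawn with Matplotlib through pandas, for example age_pmf.plot.bar().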

Analysis 2 (in analysis2.py): The data file linton_supp_tableS1_S2_8Feb2020.xlsx contains patient-level case data from China and other parts of the world. It includes the following information: Exposure date (E), Symptoms onset date (O), Hospitalisation date (H), and, for deceased patients, Date of death (X). The data also records whether the patient is or was a resident of Wuhan (China). Please note that details of surviving and deceased patients (until 31st January 2020) are stored in separate sheets. The data can be downloaded from http://www.mdpi.com/2077-0383/9/2/538/s1
(i) I calculated and plotted the pmf of the incubation period (the duration between the date of infection exposure (E) and the date of onset of symptoms (O)). I then calculated the mean incubation period and the variance of the distribution.
Note: Use the left exposure date as the date of infection.
(ii) I calculated the expected incubation period excluding Wuhan residents, compared the value with part (i), and commented on the comparison.
(iii) I calculated the pmfs of onset to hospitalization (H − O) for dead patients, onset to death (X − O) and hospitalization to death (X − H), then commented on the similarity of the distributions. I also compared the H − O pmf for surviving and dead patients and commented on the difference.
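
The duration pmfs above all follow the same pattern: subtract two date columns, count whole days, and normalise the counts. The sketch below illustrates that pattern for the incubation period (O − E). It is not the code in analysis2.py: the column names exposure_left, onset and wuhan_resident and the toy rows are assumptions made for illustration.

```python
import pandas as pd


def duration_pmf(start, end):
    """pmf of the whole-day gap (end - start); rows missing either date are dropped."""
    days = (pd.to_datetime(end) - pd.to_datetime(start)).dt.days.dropna()
    return days.value_counts(normalize=True).sort_index()


def pmf_mean(p):
    """Expectation of a pmf stored as a Series (index = value, data = probability)."""
    return float((p.index.to_numpy(dtype=float) * p.to_numpy()).sum())


# Hypothetical toy rows; the real data comes from linton_supp_tableS1_S2_8Feb2020.xlsx.
cases = pd.DataFrame({
    "exposure_left": ["2020-01-01", "2020-01-03", "2020-01-05", None],
    "onset": ["2020-01-06", "2020-01-08", "2020-01-08", "2020-01-10"],
    "wuhan_resident": [True, False, False, True],
})

# (i) Incubation period over all cases with a known exposure date.
incubation = duration_pmf(cases["exposure_left"], cases["onset"])
mean_all = pmf_mean(incubation)

# (ii) Same calculation with Wuhan residents excluded.
non_wuhan = cases.loc[~cases["wuhan_resident"]]
mean_non_wuhan = pmf_mean(duration_pmf(non_wuhan["exposure_left"], non_wuhan["onset"]))
```

The same duration_pmf helper would cover H − O, X − O and X − H in part (iii) by passing the corresponding pair of date columns.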
