  • coursevita

DATA SCIENCE - The Unsung Hero of COVID-19 pandemic times

How Data Science transformed healthcare: A Comprehensive case study on AstraZeneca


In December 2019, the world first heard of a virus that would shape daily life for the next several years. The two years that followed were devastating: flights were grounded, hospitals were overwhelmed, and death tolls climbed. Through it all, one sector stood firm: healthcare. And one discipline served as its backbone in those fragile times: Data Science. From understanding and identifying the virus to combating it, Data Science was the unsung hero of the pandemic. In this article by CourseVita, we discuss the rapid spread of COVID-19 and how healthcare and Data Science joined forces to fight a fast-mutating virus.

Data collection and Analysis

Data collection was vital in determining the course of the pandemic, because it showed how, and among whom, the virus was spreading. Three types of data were collected during the pandemic:

  1. Epidemiological Data: This data helps public health authorities understand how the disease spreads across different populations. It covers who has symptoms, who is falling sick, where they are located, and how the disease is being transmitted.

Epidemiological data is generally collected by healthcare facilities. It includes the number of cases registered each day, symptomatic and asymptomatic cases, the severity of each patient's condition, the number of households affected, and the number of deaths during treatment.

  2. Clinical Data: This data shows how the disease actually affects people's health. It includes symptoms, existing medical conditions, complications, and how people respond to treatment.

Clinical data is generally collected by medical workers through patient interviews, patient forms, and similar records.

  3. Genomic Data: This data helps scientists work out the genetic structure and makeup of the virus. By analysing it, scientists can track how the virus mutates over time and how it spreads.

Genomic data supports the development of COVID-19 vaccines by identifying target regions on the virus's genome.

Understanding the genetic makeup of the virus allows scientists to develop vaccines that effectively target and neutralise it.

Genomic data is collected by sequencing the genetic material of the virus (RNA, in the case of SARS-CoV-2) and cataloguing its mutations.

How Data Science is used to predict the spread of the virus:

  • Collecting Data: Gathering data from various sources.

  • Cleaning Data: Making sure the data is accurate and organised.

  • Building Models: Creating mathematical models to analyse the data.

  • Making Predictions: Forecasting how the virus will spread so that precautions can be taken.

  • Informing Decisions: Helping healthcare organisations and governments make better decisions.
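The steps above can be illustrated with a minimal Python sketch. It is not taken from any real pandemic pipeline: the daily case counts are made-up numbers, and the model is a simple exponential growth fit chosen only for illustration. The names `clean`, `fit_exponential`, and `forecast` are hypothetical helpers.

```python
import math

# Collecting: hypothetical daily case counts (illustrative numbers, not real data).
# None marks a missing report; a negative value marks a data-entry error.
raw_counts = [100, 120, None, 175, 210, -5, 300, 365]


def clean(counts):
    """Cleaning: keep (day, count) pairs that have a valid positive count."""
    return [(day, c) for day, c in enumerate(counts) if c is not None and c > 0]


def fit_exponential(points):
    """Modelling: least-squares fit of log(count) = a + r * day. Returns (a, r)."""
    n = len(points)
    xs = [day for day, _ in points]
    ys = [math.log(count) for _, count in points]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / sxx
    a = mean_y - r * mean_x
    return a, r


def forecast(a, r, day):
    """Predicting: expected case count on a given day under the fitted model."""
    return math.exp(a + r * day)


points = clean(raw_counts)
a, r = fit_exponential(points)
print(f"estimated daily growth rate: {r:.3f}")
print(f"estimated doubling time in days: {math.log(2) / r:.1f}")
print(f"forecast for day 10: {forecast(a, r, 10):.0f}")
```

A real forecast would use far richer models and data, but the same collect, clean, model, and predict sequence applies.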

Predictive Modelling

Predictive modelling is a Data Science technique that predicts future outcomes from collected data. In the context of the pandemic, it uses data on factors such as infection rates, demographics, healthcare capacity, and public health interventions to anticipate how the virus will spread and how the outbreak might evolve over time.

By analysing the collected data, predictive modelling can project how the disease will spread across different demographics, identify potential high-risk areas, estimate the number of cases over a given period, and evaluate which healthcare interventions work best.

These methods help authorities, policy makers, and healthcare personnel take timely and efficient decisions to control the spread of the virus and mitigate its effects.

Predictive Modelling in Vaccine distribution

Predictive modelling in vaccine distribution uses the following Data Science steps to forecast and organise the allocation and distribution of vaccines.

  • Data Collection:

Collecting data on factors such as population demographics, healthcare infrastructure, vaccine supply, and distribution logistics.

  • Model Building:

Creating mathematical models that account for the factors affecting vaccine distribution, such as population demographics and density.

  • One example of such a model is the SIR model, which divides the population into three compartments: Susceptible (S), Infected (I), and Recovered (R). The model assumes that the total population is constant and that people move from Susceptible to Infected to Recovered.

  • The model's parameters include the transmission rate (beta), the recovery rate (gamma), and the basic reproduction number (R0), which equals beta divided by gamma.

In the context of COVID-19, the SIR model was used to predict the spread of the virus and assess the impact of public health interventions such as social distancing, mask-wearing, and vaccination.

  • Allocation Optimization:

Using those models to predict where vaccines will be needed most and to calculate how many doses are required.

Epidemiological data was monitored continuously in real time, both to track the spread of the virus and to check the effectiveness of vaccination campaigns.

By comparing predicted vaccination figures with actual vaccination data, public health authorities were able to adjust their plans to ensure equitable vaccination.

  • Logistics Planning:

Predicting the demand for vaccines in various areas and forecasting the time, cost, and management of delivery to ensure a smooth and efficient vaccination process.

  • Monitoring and Adjustment:

Monitoring the data continuously and updating the models so that allocation plans keep pace with the evolving situation.
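To make the SIR model from the Model Building step concrete, here is a minimal Python sketch that steps the model forward with a simple Euler scheme. The population size, beta, and gamma values are illustrative assumptions, not estimates for COVID-19, and `simulate_sir` is a hypothetical helper name.

```python
def simulate_sir(population, initial_infected, beta, gamma, days, steps_per_day=10):
    """Simulate the SIR model with Euler steps.

    Returns a list of (day, susceptible, infected, recovered) tuples,
    one per day, starting at day 0.
    """
    dt = 1.0 / steps_per_day
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(0, s, i, r)]
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            # People leave S at rate beta * S * I / N and leave I at rate gamma * I.
            new_infections = beta * s * i / population * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((day, s, i, r))
    return history


# Illustrative parameters only (assumed values, not fitted to real data).
beta, gamma = 0.3, 0.1
history = simulate_sir(population=1_000_000, initial_infected=10,
                       beta=beta, gamma=gamma, days=160)
peak_day, _, peak_infected, _ = max(history, key=lambda row: row[2])
print(f"basic reproduction number R0 = {beta / gamma:.1f}")
print(f"infections peak around day {peak_day} with about {peak_infected:,.0f} people infected")
```

Lowering beta in this sketch, for example to mimic social distancing or mask-wearing, flattens and delays the peak, which is the kind of comparison such models were used for.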

By using predictive modelling, authorities can deliver vaccines on time, serve the neediest regions first, allocate appropriate numbers of doses, and reduce the burden on officials and healthcare personnel in the fight to curb the spread of the virus.
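As a rough illustration of allocating doses by predicted need, the sketch below splits a fixed supply across regions in proportion to a simple need score (population multiplied by predicted cases per 100,000 people). The region names, figures, and the scoring rule are all assumptions made for this example; a real allocation would weigh many more factors, such as logistics and equity.

```python
def allocate_doses(total_doses, regions):
    """Split total_doses across regions in proportion to a need score.

    regions maps a region name to (population, predicted_cases_per_100k).
    Leftover doses from rounding go to the largest fractional remainders,
    so the allocation always sums to total_doses.
    """
    weights = {name: pop * rate for name, (pop, rate) in regions.items()}
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one region must have a positive need score")
    exact = {name: total_doses * w / total_weight for name, w in weights.items()}
    allocation = {name: int(share) for name, share in exact.items()}
    leftover = total_doses - sum(allocation.values())
    by_remainder = sorted(exact, key=lambda name: exact[name] - allocation[name],
                          reverse=True)
    for name in by_remainder[:leftover]:
        allocation[name] += 1
    return allocation


# Hypothetical regions: (population, predicted cases per 100,000 people).
regions = {
    "North": (500_000, 120),
    "South": (300_000, 300),
    "East": (200_000, 80),
}
allocation = allocate_doses(100_000, regions)
for name, doses in allocation.items():
    print(f"{name}: {doses} doses")
```

Here the smaller but harder-hit "South" region receives the largest share, which mirrors the idea of serving the neediest regions first.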

A case study on AstraZeneca’s Covid-19 Vaccine

AstraZeneca delivered up to 3 billion doses of its COVID-19 vaccine across the globe by the end of 2021, just 18 months after the company first partnered with the Department of Medicine at the University of Oxford to develop and manufacture the vaccine.

In the wake of concerns about potential rare side effects of the AstraZeneca-Oxford COVID-19 vaccine, the UK-headquartered pharmaceutical giant has reaffirmed its commitment to patient safety. While the after-effects of vaccines have always been debated, the rest of this section looks at how AstraZeneca used Data Science in distributing vaccines worldwide.

Data Science was integrated into the various stages of development and distribution of the COVID-19 vaccine as follows:

  1. Clinical Trials Design:

[Figure: Increase in clinical approval trials in India]

  • Data Analysis for Trial Design:

Data scientists examine epidemiological and clinical data to identify target populations, determine sample sizes, and plan effective clinical trials.

  • A/B Testing: A/B testing is a method used in clinical trials to compare two or more versions of a treatment or intervention. Participants are randomly assigned to different groups, with one group receiving the treatment (A) and the other group receiving a control or placebo (B). This allows researchers to measure the effectiveness of the treatment and compare it to standard care or no treatment.

  • Randomised Controlled Trials (RCTs): RCTs are a type of clinical trial in which participants are randomly assigned to receive either the treatment being tested or a control treatment. Random assignment helps minimise bias, so that differences in outcomes between the two groups can be attributed to the treatment itself rather than to other factors. RCTs are considered the gold standard for evaluating the effectiveness of new treatments and interventions.

  • Statistical Analysis:

Statistical modelling is used to estimate vaccine efficacy, evaluate safety profiles, and analyse trial results.

  • Machine Learning for Predictive Modeling:

Machine learning algorithms are used to identify potential adverse events and predict vaccine responses.
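The random assignment and statistical analysis steps above can be sketched in a few lines of Python. This is a simplified illustration, not AstraZeneca's trial procedure: the participant IDs and case counts are invented, and `randomise` and `vaccine_efficacy` are hypothetical helper names. The efficacy formula is the standard one minus the ratio of attack rates in the two arms.

```python
import random


def randomise(participant_ids, seed=0):
    """Randomly split participants into a vaccine arm and a placebo arm."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"vaccine": shuffled[:half], "placebo": shuffled[half:]}


def vaccine_efficacy(cases_vaccine, n_vaccine, cases_placebo, n_placebo):
    """Efficacy = 1 - (attack rate in vaccine arm / attack rate in placebo arm)."""
    risk_vaccine = cases_vaccine / n_vaccine
    risk_placebo = cases_placebo / n_placebo
    return 1 - risk_vaccine / risk_placebo


arms = randomise(range(2000))
print(f"vaccine arm: {len(arms['vaccine'])} participants")
print(f"placebo arm: {len(arms['placebo'])} participants")

# Invented case counts for illustration only (not real trial results).
efficacy = vaccine_efficacy(cases_vaccine=10, n_vaccine=1000,
                            cases_placebo=50, n_placebo=1000)
print(f"estimated efficacy: {efficacy:.0%}")
```

A real analysis would also report a confidence interval around the estimate and examine safety outcomes in each arm.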

In predictive modelling for vaccine responses and adverse event identification, commonly used machine learning algorithms include:

  • Random Forest:

Builds multiple decision trees to make predictions, suitable for handling large and complex datasets.

  • Support Vector Machines (SVM):

Effective for both linear and nonlinear data, commonly used for classification tasks.

  • Logistic Regression:

Simple yet powerful algorithm for binary classification tasks, widely used in medical research.

  • Gradient Boosting Machines (GBM):

Builds multiple weak learners sequentially, known for its high predictive accuracy.

  • Neural Networks:

Deep learning models that can capture complex patterns in large datasets, effective for predicting vaccine responses and adverse events.
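As an illustration of the simplest algorithm in the list, the sketch below trains a logistic regression classifier from scratch with gradient descent. The data is synthetic and the two features are unnamed placeholders; nothing here reflects real patient records or the models AstraZeneca used. In practice, a library implementation would normally be preferred over hand-written training code.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Predicted probability of the positive class for one feature vector."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train_logistic(features, labels, learning_rate=0.5, epochs=1000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n = len(features)
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            error = predict_proba(weights, bias, x) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


# Synthetic data: two scaled features and a noisy binary label (assumed rule).
rng = random.Random(42)
features = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(200)]
labels = [1 if x[0] + 0.5 * x[1] + rng.gauss(0, 0.2) > 0 else 0 for x in features]

weights, bias = train_logistic(features, labels)
correct = sum((predict_proba(weights, bias, x) >= 0.5) == bool(y)
              for x, y in zip(features, labels))
accuracy = correct / len(labels)
print(f"training accuracy: {accuracy:.2f}")
```

The same train-then-predict pattern carries over to the other algorithms listed; they differ mainly in how flexible the fitted decision boundary is.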

  2. Vaccine Development:

  • Computational abilities, prediction, and simulation have transformed drug discovery.

  • Scientists can now simulate how potential drugs interact with targets in the body.

  • This helps identify promising drug candidates more quickly and accurately.

  3. Supply Chain Management:

  4. Distribution Strategy:

[Figure: Different types of vaccine doses administered in India]

Using the Data Science methods described above, AstraZeneca, in collaboration with the University of Oxford, produced and supplied COVID-19 vaccines worldwide during the most challenging period of the decade so far.

Future Directions

Even in the post-pandemic era, Data Science has proven potential as a pioneering and critical resource for the healthcare industry. It is well placed to remain one by drawing on in-demand technologies such as machine learning and big data analytics.

In the coming years, Data Science is expected to drive advances such as precision medicine, in which treatments are tailored to the individual based on factors such as medical history, suitability of a given medicine, and drug tolerance. Wearables and IoT devices will help detect early symptoms, so that a disease can be identified accurately and in good time, before it becomes serious.

In the 21st century, Data Science has already staked its claim by playing a crucial role in vaccination and medication when it was needed most. Its position is being cemented further as data makes aspects such as operations and surgical procedures easier. In the long run, the technological advances Data Science promises could make healthcare and quality medical amenities accessible to everyone.

Learning and mastering Data Science is as fascinating and fun as reading about it, and you could become a Data Science professional too. People often ask how to become a Data Scientist without already knowing the fundamentals. CourseVita offers Data Science courses that help learners master the subject from the basics to advanced concepts, paving the way for a promising career in Data Science. Check out our Data Science courses, tailor-made to help you land your dream job in the healthcare sector.


The discussion above shows how significant a player Data Science could be in the medical and healthcare industry in the long run. The case study looked at how AstraZeneca used Data Science to distribute COVID-19 vaccines during the pandemic, the darkest hour of the century so far. Understanding how data is collected, stored, and managed helps us see how data and its governance will affect people's healthcare in the long run. These are only glimpses of how Data Science could transform the way the healthcare industry is run, and a bigger picture of Data Science working wonders lies ahead.
