Reflections on SIR models and some examples




Feel free to send me your thoughts, either in the comments or here.
This only reflects my personal opinion, not that of any of my employers at any time.

Summary


  1. SIR models are trendy, but they also have some caveats.
  2. In Europe and the United States the peak could be reached between May and July, with a different share of the population affected in each country depending on the policies implemented (so the model needs time to incorporate that effect).
  3. Potential limitations:
    1. Summer could reduce prevalence.
    2. Data limitations, briefly discussed below. We are still learning about Covid19, we have not had enough tests, and so on.

Intro

After the massive disruption caused by Covid19, everyone is concerned about what is going on and how it could evolve in the future. All of us are noticing the consequences (such as compulsory stay-at-home measures, a terrible number of deaths and so on).

All of us want to help a bit and to understand the situation better. Because I work as a Data Scientist, I tend to focus on the data side and on how our community can help. That community is very active, by the way, with hackathons on this topic and so on (see, for example, the video of our project, done jointly with George Joseph and Rui Saraiva, although that was more about ARIMA models).

SIR models (and their variations) are becoming very trendy as a way to understand when the Covid19 peak is going to happen, how long it will last and so on. An SIR model splits the population into Susceptible, Infected and Recovered groups and tracks how people move between them. The key advantage is that these models can help you predict the timing and size of the peak of the main wave and how many people could be affected.
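
To make the idea concrete, here is a minimal sketch of a basic SIR model in plain Python. It is not the code behind the pictures in this post. The function names, the fixed-step Runge-Kutta integrator and the parameter values (`beta=0.3`, `gamma=0.1`, one infected person per 10,000 at the start) are all assumptions chosen for illustration, not estimates for any country.

```python
def simulate_sir(beta, gamma, i0=1e-4, days=300, dt=0.1):
    """Integrate the basic SIR equations with fixed-step RK4.

    s, i and r are fractions of the population, so s + i + r = 1.
    beta is the transmission rate and gamma the recovery rate (per day).
    Returns a list of (day, s, i, r) tuples.
    """
    def deriv(s, i):
        new_infections = beta * s * i   # flow from S to I
        recoveries = gamma * i          # flow from I to R
        return -new_infections, new_infections - recoveries

    s, i = 1.0 - i0, i0
    steps = int(round(days / dt))
    path = [(0.0, s, i, 0.0)]
    for k in range(1, steps + 1):
        k1s, k1i = deriv(s, i)
        k2s, k2i = deriv(s + 0.5 * dt * k1s, i + 0.5 * dt * k1i)
        k3s, k3i = deriv(s + 0.5 * dt * k2s, i + 0.5 * dt * k2i)
        k4s, k4i = deriv(s + dt * k3s, i + dt * k3i)
        s += dt * (k1s + 2 * k2s + 2 * k3s + k4s) / 6
        i += dt * (k1i + 2 * k2i + 2 * k3i + k4i) / 6
        path.append((k * dt, s, i, 1.0 - s - i))
    return path


def peak(path):
    """Return (day, infected fraction) at the moment infections are highest."""
    day, _, infected, _ = max(path, key=lambda row: row[2])
    return day, infected


# Illustrative values only: beta / gamma = 3 plays the role of R0.
path = simulate_sir(beta=0.3, gamma=0.1)
peak_day, peak_fraction = peak(path)
print(f"peak on day {peak_day:.0f} with {peak_fraction:.1%} infected at once")
```

The two numbers printed at the end are exactly the "when" and "how many" that make these models attractive. Both depend heavily on the assumed `beta` and `gamma`.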

What do SIR models look like?

Remember the key advantage: they tell you when the peak happens and how many people it affects.
Here are some examples estimated with the data available today for several countries. You can see how the picture changes from country to country.
All pictures are home cooked, and the code is in the links below. We are currently working on improving them.


June-July is the peak; a massive share of the population could be sick at the same time.


France, like Cyprus, could peak over a similar period, but the proportion of the population affected is much lower, so some of the French government's measures are probably already in place.

India may see its main impact later in the year. That makes sense, since its current number of cases is much lower than in other countries at the moment.






Italy, in line with France, Spain, Cyprus, the US and the United Kingdom, may have its peak in May-July.
Most of them show around 30-40% of the population sick at the same time. However, the US, India and Cyprus could have a much higher share of people sick at the same time. As in the previous countries, that could probably be reduced if they start introducing measures.
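
As a rough cross-check on those percentages, the basic SIR model has a closed-form expression for the largest fraction infected at once: i_max = i0 + s0 - (1 + ln(R0 * s0)) / R0, where R0 = beta / gamma. The sketch below just evaluates that formula. The function name and the R0 values are mine and purely illustrative, not estimates for any of the countries above.

```python
import math


def peak_infected_fraction(r0, s0=1.0, i0=0.0):
    """Largest fraction infected at the same time in a basic SIR model.

    r0 is the basic reproduction number (beta / gamma); s0 and i0 are the
    initial susceptible and infected fractions.
    """
    if r0 * s0 <= 1.0:
        return i0  # no outbreak: infections only decline from the start
    return i0 + s0 - (1.0 + math.log(r0 * s0)) / r0


# Hypothetical R0 values, chosen only to show the trend.
for r0 in (1.5, 2.0, 3.0, 4.0):
    share = peak_infected_fraction(r0)
    print(f"R0 = {r0}: peak of about {share:.0%} infected at once")
```

In this simplified setting, a peak of 30-40% corresponds to an R0 of roughly 3 to 4, and measures that push R0 down lower the peak quickly.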

Potential limitations

  1. As you can see, those peaks may happen in late spring or summer. However, we do not know whether the prevalence of the virus is going to be as high as in winter.
  2. Policies need time to show up in the data, so those graphs are continuously changing.
  3. Even though SIR models are simple (there are more sophisticated techniques and models), their parameters can be difficult to estimate. For example, there is extensive work on how to do that properly with the Stan package.
  4. SIR models give us long-term predictions, but the further ahead we go, the less accurate they may be.
  5. Data: is the current data the best? It depends on what it is used for. (I recommend watching Dr John Ioannidis's video.)
    1. There is no clarity. For instance:
      1. In Flaxman, S., Mishra, S., & Gandy, A. (2020) the gap between the lower and upper bounds is massive. Take Spain as an example: we are somewhere between 4% and 41% of the population infected.
    2. Current hospital data is biased, since it comes from people who actually needed treatment. I did some survival analysis work on South Korean data (see reference below), and you can find cases that were updated after death.
    3. The number of tests is very small in many countries. The initial success of South Korea and Germany could be because they tested the population massively, while in Spain or Italy testing was very limited.
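
To illustrate why parameter estimation and long-horizon forecasts are delicate, here is a small synthetic experiment (no real data, and not the fitting method used in the linked repositories). Three hypothetical parameter pairs share the same early growth rate `beta - gamma = 0.2`, so their first 20 days look almost identical, yet they imply very different peaks. The function name, the forward Euler stepping and all values are assumptions for illustration.

```python
def infected_curve(beta, gamma, days, i0=1e-4, dt=0.05):
    """Daily infected fraction from a basic SIR model (forward Euler steps)."""
    s, i = 1.0 - i0, i0
    steps_per_day = int(round(1 / dt))
    curve = [i]
    for _day in range(days):
        for _step in range(steps_per_day):
            new_infections = beta * s * i * dt
            recoveries = gamma * i * dt
            s, i = s - new_infections, i + new_infections - recoveries
        curve.append(i)
    return curve


# Hypothetical (beta, gamma) pairs with the same early growth rate of 0.2/day.
candidates = {"R0=2": (0.4, 0.2), "R0=3": (0.3, 0.1), "R0=5": (0.25, 0.05)}

early = {name: infected_curve(b, g, days=20) for name, (b, g) in candidates.items()}
full = {name: infected_curve(b, g, days=300) for name, (b, g) in candidates.items()}

reference = early["R0=3"]
gaps = {}
for name in candidates:
    # Largest relative difference from the R0=3 curve over the first 20 days.
    gaps[name] = max(abs(x - y) / y for x, y in zip(early[name], reference))
    print(f"{name}: early curve differs from R0=3 by at most {gaps[name]:.1%}, "
          f"peak infected {max(full[name]):.0%}")
```

With only early and noisy case counts, a fit can hardly tell these scenarios apart, which is one reason the graphs keep changing as new data arrives.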

Joint work

With George Joseph, and based on earlier material such as the work by Lewuathe.

References

  1. All pictures are reproducible and were obtained using this: https://github.com/rafaelvalero/covid_forecast
  2. SIR models:  https://en.wikipedia.org/wiki/Compartmental_models_in_epidemiology#The_SIR_model
  3. Some interesting Kaggle notebooks: https://www.kaggle.com/lisphilar/covid-19-data-with-sir-model
  4. The reference work for the UK from Imperial College
    1. Flaxman, S., Mishra, S., & Gandy, A. (2020). Estimating the number of infections and the impact of non-pharmaceutical interventions on COVID-19 in 11 European countries. Imperial College preprint.
  5. Github (some cool repos)
    1. https://github.com/rafaelvalero/covid_forecast
    2. https://github.com/gasilva/dataAndModelsCovid19
    3. https://github.com/Lewuathe/COVID19-SIR
  6. Videos
    1. Perspectives on the Pandemic. Dr John Ioannidis. Stanford University: https://youtu.be/d6MZy-2fcBw
    2. Gerardo Chowell-Puente video about different forecasting techniques: https://youtu.be/e5CLt60Agro
