Peak Predictions
What I learned forecasting the fourth wave of COVID-19 cases in the country of Georgia
Over the course of the last few months I have made several predictions about the timing and severity of Georgia’s fourth wave. (Note: I am referring here to Georgia the country, not the US state.) Given that we appear to have hit the peak of the wave, I’d like to try to evaluate those predictions, figure out what worked and what didn’t, and assess what I’ve learned about forecasting and about coronavirus waves.
My main takeaway is that local factors (local politics, medical system capacity, size of previous waves) are more likely to improve a prediction, whereas factors estimated from an aggregate of international data (statistical distribution of peak height, estimates of non-pharmaceutical intervention (NPI) efficacy, estimates of the epidemiological characteristics of the virus itself) are more likely to mislead you, or at least to divert your attention from the local factors you should be watching. This isn’t to say that there are no general lessons to be learned from looking at aggregated data - rather, that it’s very tricky to figure out when and how to apply those lessons to a specific case.
I’m going to discuss a couple of the strategies that I have used to generate predictions about Georgia’s fourth wave, how they went, and what I learned. Then I’ll talk a little bit about the covid- and prediction-related topics I intend to explore in future posts here.
Prediction via politics
On July 11th, I wrote,
I think we could be looking at the worst wave yet – the second wave was what happened when we tried to power through a surge in the original strain with no restrictions; now we’re going to try to do the same with Delta.
It now seems that this concern was warranted: by some indicators, the 4th wave has indeed been the worst, although we won’t exactly know until it’s over (and maybe not even then - see the discussion below about how we judge wave severity).
This assessment was based on several pieces of information - one, low vaccination rates; two, high transmission rate of the Delta strain, which was already in Georgia at the time; and three, most importantly, the government’s repeated, credible statements that no restrictions were planned. In Georgia’s 2nd wave, the government stuck by its promise not to introduce lockdown measures until and unless hospitals became overwhelmed, and indeed it was not until pictures started flooding social media of ambulances lined up outside overcrowded hospitals that the government took steps sufficient to reverse the tide. Based on the politics involved, my assessment was that the government would behave the same way in this case - which, indeed, they did. In the 4th wave, anti-covid measures came just as the government began setting up a field hospital in Tbilisi to deal with patient overflow from hospitals.
However, I didn’t assign any particular confidence to this prediction. I was reluctant to predict disaster too strongly because I’d had similar concerns about the third wave (Georgia’s Alpha wave) that didn’t really pan out. In other words, I knew the policy factors were in place to allow Delta to spin out of control, but I didn’t know if it actually would. In the future, I think I could be a bit more confident in making predictions which center on Georgian public health policy and the political system in general.
Prediction via comparison to Georgia’s second wave
On July 18th, I wrote,
I think we'll top out, as before, at around 30,000 cases/week. At current growth rates we'd get there in 4-5 weeks. Gov't might lower that peak with a lockdown but I doubt they will. So, get used to bad covid news for about a month.
This forecast was based on a cursory look at peaks in other countries and comparing them to Georgia’s 2nd wave, on a per capita basis. For a cursory forecast, it seems to have performed remarkably well: it appears the peak in cases/week occurred on August 18th - four and a half weeks after this tweet - at 35178 cases per week, which is 17% higher than 30,000. I usually give myself a 22% margin of error on weekly forecasts, so 17% for a forecast more than four weeks out is pretty remarkable. I’d love to take full credit, but I have to be honest and say there was more than a little luck involved in hitting that level of precision.
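The arithmetic behind that comparison is easy to check. The short Python sketch below recomputes the overshoot and the lead time from the figures quoted above; the year (2021) is inferred from the rest of this post, and the function name is purely illustrative.

```python
from datetime import date

forecast_cases = 30_000             # weekly cases predicted on July 18th
actual_cases = 35_178               # weekly cases at the apparent peak
forecast_date = date(2021, 7, 18)   # year assumed from context
peak_date = date(2021, 8, 18)


def percent_over(actual, forecast):
    """Return how far `actual` exceeded `forecast`, as a percentage of the forecast."""
    return (actual - forecast) / forecast * 100


error_pct = percent_over(actual_cases, forecast_cases)
weeks_out = (peak_date - forecast_date).days / 7

print(f"Peak came in {error_pct:.1f}% above the forecast, {weeks_out:.1f} weeks out")
```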
One of the assumptions underlying this forecast was that the growth rate would remain consistent (I spelled it out here - “At current growth rates”). However, there was no real reason to believe the growth rate would stay the same - in fact it spiked in the next week, which threw off my July 25th projection, before settling back down to around the July 18th level. It turns out I just got lucky that July 18th happened to be the week where the growth rate, as of Sunday, was closest to the average for the ascending side of the wave.
My takeaway is that I need to factor “at current growth rates” out of my predictions in some way - perhaps by running a few scenarios under different growth rates, and then assigning probabilities to each scenario.
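One way this scenario approach might look in Python is sketched below. Everything numeric here is a placeholder: the starting case count, the three growth factors, and the probabilities attached to them are hypothetical values chosen for illustration, not my actual data or forecasts.

```python
def weeks_to_reach(current, target, weekly_growth):
    """Count whole weeks of compounding growth until `current` reaches `target`."""
    if current >= target:
        return 0
    if weekly_growth <= 1:
        return None  # flat or shrinking case counts never reach the target
    weeks = 0
    while current < target:
        current *= weekly_growth
        weeks += 1
    return weeks


# Hypothetical inputs, for illustration only
current_weekly_cases = 8_000
target_weekly_cases = 30_000
scenarios = {  # name: (weekly growth factor, subjective probability)
    "slower": (1.2, 0.25),
    "current": (1.35, 0.50),
    "faster": (1.6, 0.25),
}

results = {
    name: weeks_to_reach(current_weekly_cases, target_weekly_cases, growth)
    for name, (growth, _) in scenarios.items()
}
# Probability-weighted average of the scenario outcomes
expected_weeks = sum(prob * results[name] for name, (_, prob) in scenarios.items())

for name, weeks in results.items():
    print(f"{name}: {weeks} weeks to reach {target_weekly_cases}")
print(f"probability-weighted estimate: {expected_weeks:.2f} weeks")
```

The point of the exercise is less the weighted average than the spread: if the scenarios disagree by several weeks, a single “at current growth rates” number is hiding most of the uncertainty.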
In terms of the peak height, I anchored on the 2nd wave because of the aforementioned political factors - the apparent determination of the regime to let the wave run its course until hospitals became overcrowded. This both worked and didn’t work. I didn’t factor in the fact that Georgia had increased its medical capacity since the second wave, meaning it would take many more cases to overwhelm the hospital system. In fact, while we maxed out at 31021 active infections in December, we’ve already hit 61236 active infections as of August 21st. I also didn’t factor in that different infection rates might change the threshold height for overwhelming hospitals - in other words, if we had a steeper curve this wave, we might overwhelm hospitals sooner; if we had a flatter curve, we might overwhelm hospitals later. It looks like what actually happened was that these effects cancelled each other out somewhat: we had more hospital capacity, which would have moved the peak later; but also a steeper curve, which moved the peak earlier.
Prediction via comparison to waves in other countries
An interesting thing about covid - and other pandemics that have come in waves - is that no wave infects the full population. You run an SIR model and it runs until basically the whole population gets infected and then recovers and becomes immune. That’s not what happens in the real world - there’s some factor, or set of factors, that seems to end each pandemic wave long before the whole population gets infected and recovers. After Georgia’s huge second wave, estimates were that about 30% of the population had been infected. In some cases (e.g. Manaus) local experts had speculated that after the first big wave of the pandemic, most people in a particular location had been infected, only to find a second big wave which disproved that speculation - so it seems that each wave tends to infect less than half of a population.
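For readers who haven’t run one, here is a minimal sketch of the kind of basic SIR model I mean, stepped forward one day at a time. The population size, seed infections, infectious period, and R0 values are all assumptions picked for illustration; none of them are fitted to Georgian data.

```python
def run_sir(r0, infectious_days=7.0, population=1_000_000, seed=10, days=730):
    """Step a basic SIR model one day at a time; return the fraction ever infected."""
    gamma = 1.0 / infectious_days   # daily recovery rate
    beta = r0 * gamma               # daily transmission rate
    s, i, r = float(population - seed), float(seed), 0.0
    for _ in range(days):
        new_infections = beta * s * i / population
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
    return 1 - s / population


# Assumed R0 values, for illustration only
for r0 in (1.5, 3.0, 6.0):
    print(f"R0 = {r0}: {run_sir(r0):.0%} of the population ever infected")
```

In this sketch the share ever infected rises with the assumed R0, and at the higher values it covers the large majority of the population, far more than the roughly 30% estimated after Georgia’s second wave.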
What actually determines the height and duration of pandemic waves? This is a question which seems to be poorly understood. I’ve been looking into network models - based on the social graph structure of society - and it seems promising. More on that in a separate post, if it pans out. I’ve also seen speculation that members of a population may be divided into groups based on different levels of susceptibility (whether due to internal factors, like overall health/immunity, or external factors, like how many contacts a person has or how riskily they behave) and that each wave of the pandemic rips through the most susceptible people but then runs out of steam, until a new, more contagious variant comes along and hits the next-most-susceptible people, etc. In other words, waves correspond roughly to strains - which seems to hold true in Georgia, at least.
Obviously public policy can also impact wave height, but absent full vaccination this only seems to delay the cases. So Georgia’s “first” wave wasn’t a wave at all - suppression was world-class and incredibly effective - but then as soon as we let our guard down, the original strain caused the second wave, which, as I’ve mentioned, was one of the world’s highest. You could therefore classify Georgia’s waves thus:
First: original strain, suppressed, very low
Second: original strain, unsuppressed, very high
Third: Alpha strain, partially suppressed, medium height
Fourth: Delta strain, unsuppressed, very high
All that said, despite the obvious correlations between waves and strains, and between public policy suppression efforts and wave height, we simply don’t know enough about what determines wave height to create useful predictive models of waves - although many people have tried. That’s why I turned to a crude statistical analysis of world waves.
My first attempt wasn’t great:
Of all waves in every country, only 15 have peaked higher than Georgia's 2nd. Taking a conservative denominator of 600 waves (3 waves/country) a covid wave is 97.5% likely to peak at or below Georgia's 2nd.
There are advantages and disadvantages to this strategy. The main advantage is that looking at every peak in every country gives you a sizeable enough dataset that some of the noise will start to cancel out and patterns will start to emerge. One pattern that became apparent very quickly was that the highest peaks per capita were in smaller countries - particularly island countries like Maldives and Seychelles or microstates like Vatican and Andorra. Meanwhile India’s famously awful Delta wave was apparently smaller per capita than Georgia’s surprisingly mild Alpha wave (although there are indications that underreporting was worse in India than Georgia, so take this with a grain of salt).
It’s clear here that what’s going on is that the denominator in the “per capita” measurement matters a lot - if you have a billion people spread out across an entire subcontinent you’re going to get surges in one geographical location averaged out with non-surges in other geographical locations, meaning that your per capita numbers are going to be compressed. India is huge and geographically diverse, and obviously more comparable to the US or Europe than to Maldives or Andorra. The takeaway is that we should factor in country size, geography, and population density when making predictions of peak height - or when making basically any cross-country comparison. So if you want to compare, e.g., Belgium to France, the fact that Belgium is a lot smaller and more densely populated might turn out to matter a lot.
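The compression effect can be illustrated with a toy example. The sketch below invents a country made of four equally sized regions that each have an identical wave, just staggered in time; the bell-shaped curve, the peak height of 1000 cases per million, and the timing are all hypothetical.

```python
import math


def wave(day, peak_day, peak_height, width=14.0):
    """Bell-shaped daily cases per million for one region (illustrative shape only)."""
    return peak_height * math.exp(-((day - peak_day) ** 2) / (2 * width ** 2))


days = range(0, 200)
# Hypothetical country: four equal-population regions, identical waves, staggered peaks
peak_days = [40, 80, 120, 160]
regional = [[wave(d, p, 1000.0) for d in days] for p in peak_days]

# With equal populations, the national per capita curve is the mean of the regional curves
national = [sum(region[d] for region in regional) / len(regional) for d in days]

regional_peak = max(max(region) for region in regional)
national_peak = max(national)
print(f"Highest regional peak: {regional_peak:.0f} per million")
print(f"National peak:         {national_peak:.0f} per million")
```

Every region in this toy country experiences the same wave, yet the national per capita peak comes out at a fraction of the regional one, simply because the regions don’t peak at the same time.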
However, because countries are not necessarily comparable, doing a crude analysis of peaks without trying to control for anything might get you weird results. Like, it looks like Seychelles had a peak at 4083 cases per million, which absolutely dwarfs every other peak in the world, but also they’re a tiny island nation and they seem to report case numbers irregularly once or twice a week. So aside from differences in geography and demographics, countries also collect and record data differently.
So while basic statistics on “all peaks in all countries” might tell you a bit about what country-level factors might influence peak height, they might not be great for making individual peak predictions about individual countries. Or, given that Seychelles is the most extreme outlier, you wouldn’t assume that any future Seychelles wave would be predictable based on how things were going in India, France, and Belgium.
Or, as happened in Georgia, the same factors that make a country an outlier once could make it an outlier twice (or even three times, which happened in Czechia).
I'm standing by the statistics. Even building in some uncertainty, I'd say 95% chance we peak within 2 weeks at or below the level of the 2nd wave. It *feels* crazy to predict that but it's what the numbers are telling me.
I was right to say it felt crazy to predict that - this is a case where I should have listened to my gut, and dug in deeper until I understood exactly why it felt wrong. Lesson learned.
I made a calculation that told me that statistically there was a 2.5% chance that Georgia’s fourth wave would exceed the second. I doubled that chance based on the fact that these events were not independent, but that wasn’t quite enough. Also, the calculation itself was wrong.
Recall that I said I was using 600 as my denominator - an estimate of “all peaks in all countries”. Assuming you had absolutely no information about which peak you were in, if I asked “what is the chance a randomly selected peak will exceed Georgia’s 2nd” then “2.5%” would have been a reasonable guess. But by July 25th, we did have information about which peak we were in - we knew we were in a peak of at least 585 cases per million. There were only 107 peaks of that height or higher, which means my denominator was about six times too high. Statistically I should have said there was about a 14% chance (15 out of 107) of exceeding the 2nd peak. Factoring in some uncertainty, I should have been no more than 80% confident, as of July 25th, that we wouldn’t exceed Georgia’s second peak. And even that may have been too high - it may not have sufficiently accounted for the fact that Georgia’s second and fourth waves were not independent events, but rather correlated due to having similar social, cultural, political, and environmental factors. (Also, I’ve already addressed this, but I got the timing wrong by assuming a fixed growth rate - the 25th was definitely not my best week for forecasting).
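The general mistake, using all peaks as the denominator when you already know the current wave has passed some height, can be sketched in a few lines of Python. The list of peak heights below is invented for illustration and is not the ourworldindata dataset; the threshold is a stand-in for a previous record wave.

```python
def exceed_probability(peaks, threshold, at_least=0.0):
    """Share of peaks above `threshold`, among peaks that reached `at_least`."""
    eligible = [p for p in peaks if p >= at_least]
    if not eligible:
        raise ValueError("no peaks reach the conditioning level")
    return sum(p > threshold for p in eligible) / len(eligible)


# Hypothetical peak heights (cases per million), invented for illustration
peaks = [50, 80, 120, 200, 300, 450, 600, 700, 900, 1100, 1300, 1500]
threshold = 1200  # stand-in for a previous record wave

unconditional = exceed_probability(peaks, threshold)
given_585 = exceed_probability(peaks, threshold, at_least=585)
given_1000 = exceed_probability(peaks, threshold, at_least=1000)

print(f"Any peak:               {unconditional:.0%}")
print(f"Peaks already at 585+:  {given_585:.0%}")
print(f"Peaks already at 1000+: {given_1000:.0%}")
```

The numerator never changes; only the denominator shrinks as low peaks are ruled out, so the chance of exceeding the threshold climbs as the current wave gets higher.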
As I said earlier, the outcome was that Georgia’s fourth wave seems to have peaked at 35178 cases/week - about 12% higher than the second wave, which peaked at 31317 cases/week. The fourth wave exceeded 31317 for the first time on August 13th and peaked on August 18th, only five days later. 12% and five days are very small errors. So while the forecast ended up being wrong, it wasn’t very wrong: I said it probably wouldn’t be worse than the second wave, and it ended up only being a little bit worse.
I think as a forecasting model, this method of prediction is definitely better than no model at all, and perhaps a good start for choosing an “anchoring point” for a prediction, but I think I can probably improve upon my result by using judgment to adjust for specific local factors. I might also be able to improve this model by adjusting wave height based on factors like the positive test rate (assuming a higher rate means more missed cases, so adjusting upwards) so that waves in countries with vastly different testing regimes might be more directly comparable to each other. That’s something I’m going to be looking at in future explorations of the data.
Another benefit of this method was that it prompted me to write a Python script to help analyze ourworldindata’s covid dataset - that was how I counted peaks - and having that script as a starting point opens up new avenues of analysis (as well as new problems, like how to identify a peak algorithmically) which I may discuss in a future post.
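To give a flavor of the peak-identification problem, here is one simple approach, sketched under stated assumptions: call a point a peak if it is the highest value within a fixed window on either side. This is not the script I actually used; the function name, the window size, and the sample series are all hypothetical.

```python
def find_peaks(series, window=14, min_height=0.0):
    """Return indices whose value is the maximum of the surrounding window."""
    peaks = []
    for i, value in enumerate(series):
        if value < min_height:
            continue
        lo = max(0, i - window)
        hi = min(len(series), i + window + 1)
        neighborhood = series[lo:hi]
        # On a flat top, keep only the first index so the peak is counted once
        if value == max(neighborhood) and series.index(value, lo, hi) == i:
            peaks.append(i)
    return peaks


# Hypothetical weekly-average series with two waves, the second with a flat top
series = [1, 2, 5, 3, 2, 1, 2, 3, 8, 8, 4, 2, 1]
print(find_peaks(series, window=3))                # indices of both waves
print(find_peaks(series, window=3, min_height=6))  # only the larger wave
```

Even this toy version shows where the judgment calls are: the window size decides whether two nearby bumps count as one wave or two, and a series that is still rising at its last data point will be reported as a peak, so the most recent wave needs to be handled separately.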
A final takeaway from this method: once I remembered to incorporate information about current height into the analysis, it turned out that the higher we got, the more likely we were to exceed the height of Georgia’s second wave. In other words, the distribution of peak heights wasn’t linear. I may try to do some graphing and analysis of the distribution of peak heights to see if any other interesting patterns emerge.
Prediction via estimation of non-pharmaceutical intervention (NPI) efficacy
This, it turns out, is incredibly difficult, and I’d advise against doing it.
On August 15th, I wrote:
I don't think the announced restrictions will be enough to drive a peak in cases. I'd guestimate they'll lower the growth rate to about 1.1 - that's about what we were looking at before the December lockdown. If that happens we'll see ~37K cases next week.
As it turns out, there was a peak in cases.
Specifically, the announced restrictions were:
Outdoor mask mandate
Public transportation closed for three weeks
Bars and restaurants close at 23:00
If I had to guess I’d say #2 made the biggest impact. I’ve been back and forth arguing over curfews and even at their most stringent I think they only get you maybe a 10% reduction in transmission, and 23:00 for bars and restaurants only is far from the most stringent. The mask mandate in Georgia suffers from uneven enforcement, compounded by a fine amnesty from the last round of mask mandates, when the government thought the pandemic was over and was worried about its slipping popularity. I’d say it’s probably having some effect, but anecdotally having visited a few neighborhood shops since the mandate, and ordered a ton of delivery, basically no stores are enforcing mask mandates and almost no delivery people are wearing masks inside my building. Some other local folks on twitter are seeing a bit more mask compliance than I am so it may be different based on neighborhood - I live on the outskirts of town where there are never police patrols.
So I think the loss of mobility from the public transit shutdown - especially for people who can’t afford regular taxis (i.e. almost everyone) - is probably the most effective of the restrictions. It’s also generating a lot of complaints, which tells me that it is in fact restricting people’s activities.
I’ve found throughout the pandemic that making predictions about any particular set of restrictions is very difficult. It often seems - both here and abroad - that restrictions that have seemed underpowered have ended up causing peaks. It’s almost as if the system is in one equilibrium state - say, a growth rate of 1.3 - and even a small perturbation - an NPI which reduces R by .1, say - is enough to jolt the system into a new equilibrium state - not at 1.2, which you’d predict, but at 0.9.
I’ve seen this described as something like a network effect or signaling effect of government intervention - in other words, by issuing any sufficiently serious-sounding policy directives, government solves the “coordination game” where everyone knows we’d benefit by changing our collective behavior, but nobody wants to be the one to start. Because of peer pressure, nobody wants to be the first person to opt out of the weekly game night because of covid risk, or the only kid in class wearing a mask, or whatever - but if the government (or some other legitimate authority) says “okay, it’s time to take precautions now” then many people will gladly switch over into that mode and the sum total of precautions taken in society ends up being much higher than the minimum floor which the government has mandated.
The bad news for prediction is that we don’t really know when this effect kicks in, and also we don’t know whether it’s at all proportional to the effect size of the government intervention. Can a policy which reduces R by .1 drive a reduction of .5? Will a -.2R policy be more effective than a -.1R policy, or will the signaling effect completely obscure the difference in efficacy between the -.1R and -.2R interventions? How do social, cultural, and economic factors mediate this effect? My intuition tells me it’s stronger in collectivist societies and weaker in individualistic societies, but I have little more than intuition to back that up.
There’s also the question of “the control system”, which is how some covid analysts describe the process by which people react to news about covid by adjusting their risk tolerance. When the news is good, people take more risks; when the news is bad, people take fewer risks. So just as the government reacted to hospitals overflowing by issuing public policy interventions, individuals might react to hospitals overflowing by individually mitigating their own risks. In fact they might have done that regardless of government intervention. The fact that the NPIs correlate with a drop in cases doesn’t mean the NPIs caused the drop in cases - instead, the NPIs and the drop in cases could both have been caused by news of hospitals overflowing. Or perhaps NPIs and the “control system” work together - maybe NPIs accounted for a drop in R of .2 and the control system also accounted for a drop in R of .2, giving us a swing from 1.3 to 0.9.
I worry that people’s answer to this question is more ideological than scientific. People who are already skeptical of NPIs and government in general are more likely to say it was individual behavior that caused the drop, and people who are already skeptical of individual responsibility as a solution to social problems are more likely to say it was the NPI that caused the drop. But effects can have multiple causes. If the drop in cases was caused one third by the NPI itself, one third by the signaling effect of the NPI, and one third by the control system, then we still got most of the effect from government action. Even if it was one quarter from the NPI, zero from the signaling effect, and three quarters from the control system, it still might be the case that the NPI is what pushed us over the edge into a decline rather than a plateau.
Personally, my experiences during this pandemic lead me to believe that just relying on individuals to take precautions of their own accord will not be enough to stave off very bad outcomes, such as shortages in medical equipment/personnel/facility space. I think there is clearly a case for government action here, even if I am very uncertain about its relative or absolute effect size.
All in all, it’s clear to me that predicting the impact of an NPI is not just complicated, but much, much more complicated than it seems.
Areas for future inquiry
A common narrative I’ve been fighting against during this pandemic is that no one can really know anything. The experts were all wrong, nobody predicted anything, we didn’t and couldn’t have seen this coming, and so we might as well all get on with our lives as if nothing has changed and nothing can ever be learned. I call this “intellectual nihilism”. The consequence of this set of beliefs is a reflexive distrust of anyone who does something like try to warn you that there’s a big wave coming.
I wrote in December 2020 that schools would not be able to open and operate safely this year. As one might expect I didn’t get all of the details exactly right, but here’s what I did predict:
Georgians are going to demand a tourist season this summer – a bigger one than last summer, if they get their way – and this is going to lead to another wave, and probably an earlier wave
and
trials in children probably won’t be done in time for the next school year – so there’s a strong chance that even if we get vaccines, we won’t be able to vaccinate children yet
and
I’d give it about a 30% chance that Georgia successfully obtains and deploys enough vaccines to stop coronavirus through herd immunity by September
None of these things were hard to see coming, at all. The December 2020 wave had started in late August - this wave started in early July, just about two months earlier. Trials in children are not done. Georgia has not obtained and deployed enough vaccines to stop coronavirus through herd immunity. Put them together and the conclusion is obvious: it is not likely that schools will operate normally.
Now, Georgian health officials are scrambling to come up with a policy which protects children but also satisfies the public’s desire to get their children back to some semblance of normal life. We’ve heard various statements - schools can open if the nationwide positive test rate drops below 4%, or if things are better in two weeks in some vague undefined sense, or if 80% of the school staff is vaccinated. Believe me, I’m all for a transparent metric which will tell us under what circumstances schools are to be opened, but ideally I’d like them to choose one soon rather than floating a variety of ideas in the media to see which ones get shot down by public outcry (which is actually not terrible as a democratic process - just provided it has a definitive end point after which we get a coherent and predictable policy).
But we could have decided all this months ago. Last year, even. Schools could have prepared for the need to operate flexibly this year starting last year, rather than waiting until the first day of school to find out the pandemic is too bad and kids have to go online, again. I can totally understand being caught off-guard in March 2020. Being caught off-guard in September 2020 was less understandable. Caught off-guard again in September 2021? Come on now.
The idea that we wouldn’t be in this situation with respect to schools - in Georgia, in the US, and presumably elsewhere - was always wishful thinking run amok. Instead of figuring out how to have education during a pandemic, we’ve wasted 18 months pretending things will just go back to normal in a few weeks and/or pretending that things have already gone back to normal and then lashing out at people who point out that actually they haven’t.
All this is to say, I’m interested in the question of how we can make people take predictions seriously. How we can fight against intellectual nihilism among the general public, or at least among policymakers who are in charge of our most important institutions. Because if I’m getting 80% of my predictions right (which I am, based on my analysis of my weekly projections this year) then there’s enough information out there to make better - not necessarily perfect, but definitely better - decisions. It’s just a matter of getting the right information to the right people.
Relatedly - as a teacher - I’m interested in how schooling could have been conducted during the pandemic, what we could still do to salvage this upcoming year, and what we can do to improve education in general.
Another question which arose during my frequent comparisons between Georgia’s second and fourth waves is how to compare waves accurately. I’ve been using “new reported cases over a period of seven days, per capita” a lot, but one thing to note is that Georgia more than doubled its testing capacity between the second and fourth waves, meaning that while the fourth wave had more reported cases per capita, it seems likely that it had fewer actual cases per capita. Mortality may end up being a better indicator, but that may be confounded by differences in treatment, differences in age/risk level of the susceptible population, steepness of the wave, and other factors. Analyzing indicators for indicativeness will be difficult, and then there’s also the judgment call of asking whether a wave with more cases but fewer deaths is better or worse than a wave with fewer cases but more deaths (although I’m all for fewer deaths, it’s not obvious how to compare e.g. one person’s death to 100 people’s non-fatal illness or 10 people’s long covid).
Related to the question of whether school will be normal this year: now that we’ve apparently peaked, we need to know how quickly cases will decline, and what the end of the wave looks like. Will we drop to 5000 cases/week and then start going up again, within a month, like after the third wave? Or drop to 2000 cases/week and have a two to three month respite, like after the second wave? I’ve already begun working on some prediction scenarios for this, which I believe will be the subject of my next post here.
Finally, I’m going to explore modeling and datamining. I’m considering an introductory post on how you can build a simple exponential growth model at home using spreadsheets. For myself, I’m wondering about questions like how to algorithmically detect peaks in a covid dataset, how to look for, find, and interpret patterns in covid data, and how to implement more complex pandemic models - like social graph models or hybrid compartmental/graph models - to try to gain insights about the nature of pandemics in general. I’ve been having a lot of fun using Excel and Python to understand the pandemic but I always feel like I could be doing more.
Get the Shot
I’ll just end with a public service announcement: I recommend you get vaccinated. If you’re in Georgia, the vaccine portal is provax.ge. I took the first vaccine which was available to me - Sinopharm - and have no regrets; as I’ve pointed out before Sinopharm seems to be holding out fine, even against Delta, despite the constant unending wave of bad press and handwringing from Western outlets. I’d only advise you to wait if your only option were Sinovac and if you anticipate being able to get a better vaccine soon (within a month or less). In any case, you can always get a booster later.
My data on covid numbers in Georgia comes directly from stopcov.ge and ncdc.ge, both of which are official sites operating under the authority of the Georgian government. Data for other countries comes from ourworldindata and their downloadable dataset on GitHub.