A very busy week, both on the response side – with commitments from the US, Europe, Africa, China and the UN amongst others – and on the science side. On the latter front, of particular note was the release of a New England Journal of Medicine article that represents a collaboration between national governments, the WHO and academics in London. Key to this work’s authoritativeness is the use of individual-level data to look in detail at the epidemic. Additionally, the CDC published its first take on the scientific front with a model based on a mix of historic and current outbreak inputs. The plan for this post is first to outline some of the methodological approaches of these two papers, and then to summarize how their predictions/estimates fit together, and fit with other literature. And then I’ll just keep adding more topics until I run out of time/room/internet bandwidth.
- WHO NEJM paper. The WHO NEJM paper has the huge strength of having real data from individual cases in this outbreak, courtesy of working with the governments of each of the three countries. With these data they are able to provide classic outbreak epi figures like the weekly rates of new infections, geographic locations and symptomatology. The benefit of all this real data is also the downside – there is very little modelling or projection here. It’s a well thought-through, very helpful discussion of the data, but it doesn’t dig into ‘what if’ questions. For that, we’ll need to turn to:
- CDC Meltzer paper. The CDC MMWR paper feels like a mixed bag. On the one hand, it carefully models the reported case data from Sierra Leone and Liberia (not Guinea, for data quality reasons?), fitting the model to data up to the end of August and validating it against data from the first two weeks of September. On the other hand, it attempts to estimate the under-reporting rate, based on how many people would be expected to be hospitalized given case numbers on August 28th (the last day of data used), comparing that to how many were actually hospitalized. It’s good that the authors have tried to account for missing cases – I haven’t seen anyone else try this yet and it’s an inventive approach. But this isn’t the most convincing analysis I’ve ever seen – as the authors note, there are several possible biases that would invalidate their assumptions. I’d love to see this assumption triangulated using other methods. Their model is an SIIR (susceptible-incubating-infectious-recovered) model with an incubation period of 6 days and an infectious period of 6 days – based on past outbreaks in Uganda and DRC – and there are three categories of patient: those in hospital, those at home but with safe care, and those at home without safe care. The last group is far more likely to infect others than the first two. The paper then considers various scenarios and how they might reduce future spread, as I will discuss below. Oh, and the authors provide the model online in Excel format for you to adjust based on your favourite parameter estimates. Which I can recommend.
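For those who prefer code to Excel, the CDC model’s compartment structure can be sketched in a few lines. The 6-day incubation and 6-day infectious periods come from the paper; the per-setting transmission rates and the split of patients across the three care settings below are purely illustrative assumptions of mine, not the paper’s fitted values:

```python
# Discrete-time compartmental sketch in the spirit of the CDC model:
# susceptible -> incubating -> infectious -> recovered, with 6-day
# incubation and infectious periods (per the paper). The betas and the
# care-setting fractions are illustrative assumptions, NOT the paper's.

def simulate(days=120, pop=1_000_000, seed_cases=10,
             frac_hospital=0.3, frac_home_safe=0.3, frac_home_unsafe=0.4,
             beta_hospital=0.05, beta_home_safe=0.10, beta_home_unsafe=0.40):
    incubation, infectious = 6.0, 6.0  # days
    S, E, I, R = pop - seed_cases, 0.0, float(seed_cases), 0.0
    for _ in range(days):
        # Overall transmission rate is a weighted mix of care settings;
        # patients at home without safe care transmit far more.
        beta = (frac_hospital * beta_hospital
                + frac_home_safe * beta_home_safe
                + frac_home_unsafe * beta_home_unsafe)
        new_exposed = beta * I * S / pop
        new_infectious = E / incubation
        new_recovered = I / infectious
        S -= new_exposed
        E += new_exposed - new_infectious
        I += new_infectious - new_recovered
        R += new_recovered
    return pop - S  # cumulative cases to date
```

In this toy version, moving patients out of unsafe home care pulls the blended transmission rate down and the epidemic fizzles – which is essentially the scenario logic the paper explores.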
A. Modelling epidemic parameters
1. Data quality. As the Meltzer paper suggests, under-reporting may mean that as few as 40% of all true cases are coming to light. This seems supported by the large number of cases and deaths found during the “Ose to Ose Ebola Tok” (house-to-house Ebola talk) sweep conducted by Sierra Leone during the three-day lockdown last weekend. How much longer case reporting and contact tracing continue to be comprehensive remains to be seen; a recent Twitter conversation highlighted that eventually the cost-benefit balance may shift from case-finding to screening:
At which point what the data can, and cannot, tell us will change.
2. Reproductive rate.1 The NEJM paper this week provides a great deal of solid evidence on how many new cases are arising from each infection. The headline figure reported from this work is that the R0 is highest in Sierra Leone at 2.02, lower in Liberia at 1.83 and lowest in Guinea at 1.71. But the more meaningful2 figures right now are those for Rt over the last month of data: 1.38 in SL, 1.81 in Guinea and 1.51 in Liberia. This is more in line with the observation that Liberia’s epidemic is growing fastest, and that Guinea has seen a recent resurgence in cases after a few months where growth was almost flat.
3. Epidemic trajectory. This topic is the one open to the most speculation, since any epidemic curve that hits exponential growth will look very similar; the big questions are: (i) when will it peak out and fall off, and (ii) exactly how fast is it growing? The former question is very hard to estimate since peaks are only hit when (a) the proportion of susceptible contacts begins to seriously decline (“natural” decline) or (b) when control measures kick in (“intervention-led” decline; see section below). For now, no-one thinks the “natural” limit is going to be reached any time soon, and to understand the impact of interventions, a baseline model without interventions is needed. Which is what most of the news stories have been covering. The variation in predictions can be frustrating, but as this very clear article describes, it depends on the date to which numbers are being projected, and the assumptions being made about the serial interval – the length of time between someone getting infected and their contacts getting infected3. To add a third layer of nuance, it can matter how long one is infectious for (longer time, more chance to infect, so this feeds into Rt); and estimates of this have varied by model:
In this case, the CDC estimates used historical data, while the NEJM team used data from the current outbreak. We should therefore probably trust the latter, except that the NEJM team only has data for a subset of those infected. So, uncertainty remains.
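To get a feel for how much the serial-interval assumption matters, note that under exponential growth each serial interval multiplies case counts by Rt, so the daily growth rate is roughly ln(Rt) divided by the serial interval. A toy projection (the case count, Rt and serial-interval values here are round illustrative numbers, not any paper’s estimates):

```python
import math

def projected_cases(current_cases, rt, serial_interval_days, horizon_days):
    """Naive exponential projection: each serial interval multiplies
    cases by Rt, so the daily growth rate r = ln(Rt) / serial interval."""
    r = math.log(rt) / serial_interval_days
    return current_cases * math.exp(r * horizon_days)

# Same Rt, different serial-interval assumptions, 90-day horizon:
fast = projected_cases(5000, rt=1.5, serial_interval_days=12, horizon_days=90)
slow = projected_cases(5000, rt=1.5, serial_interval_days=18, horizon_days=90)
```

With identical Rt, the shorter serial interval projects nearly three times as many cases over the same 90 days – which is why models with similar reproductive numbers can still disagree wildly on headline figures.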
In order to try and sort out what has been done, I have built a small table that outlines estimates/assumptions and predicted epidemic sizes in the absence of intervention. I have simplified considerably – most models look at each country separately but I have generally collapsed them together. But this should give you a flavour:
Footnote: this table is limited, imperfect, and subject to revision. If I have misrepresented any study, please let me know and I’ll change the numbers. Refs: EbolaTeam; Majumder; Meltzer; Nishiura; Rivers et al.
My read is that most of these estimates are in the same ballpark – even numbers that differ by a factor of two reflect only a week or two’s delay in an exponential epidemic – with the exception of the under-report adjusted figures.
4. Case fatality rate. The good news is that estimates of the CFR are coming together; the bad news is that they are coming together at a higher level than the previously publicized figure of ~50%. The original figure was arrived at by calculating the proportion of Ebola cases (suspected, probable or confirmed) up to today who have died by today [M1]. Unfortunately, as many people noted, this biases the results downwards, since anyone infected within the recent past will not yet have had time to die or recover. One way around this is to include only people with a confirmed outcome (death or recovery, in the case of Ebola) [M2], but this can also introduce a smaller bias in the other direction, since those who die tend to do so sooner than those who recover. In a perfect world we would look only at those who have had long enough to die or recover – i.e. build a cohort that stops with those infected (or symptomatic – which is easier to measure) by the date X days before the last day of outcome data, where X is the maximum time from onset to a definitive outcome [M3].
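The three measures are easy to make concrete with a toy line list (all dates and outcomes below are invented for illustration):

```python
from datetime import date, timedelta

# Toy line list: (symptom_onset, outcome, outcome_date); outcome is
# "died", "recovered", or None if still unresolved. All data invented.
cases = [
    (date(2014, 8, 1),  "died",      date(2014, 8, 10)),
    (date(2014, 8, 5),  "recovered", date(2014, 8, 25)),
    (date(2014, 8, 20), "died",      date(2014, 8, 29)),
    (date(2014, 9, 1),  None,        None),  # outcome not yet known
    (date(2014, 9, 10), None,        None),
]
today = date(2014, 9, 14)
max_resolution_days = 25  # X: longest plausible time from onset to outcome

# M1: deaths so far / all cases so far (biased down while cases grow)
m1 = sum(o == "died" for _, o, _ in cases) / len(cases)

# M2: deaths / cases with a known outcome (slight upward bias:
# deaths resolve faster than recoveries)
resolved = [c for c in cases if c[1] is not None]
m2 = sum(o == "died" for _, o, _ in resolved) / len(resolved)

# M3: restrict to cases old enough that the outcome must be known by now
cohort = [c for c in cases
          if c[0] <= today - timedelta(days=max_resolution_days)]
m3 = sum(o == "died" for _, o, _ in cohort) / len(cohort)

print(m1, m2, m3)  # 0.4, then ~0.667 for both M2 and M3 on this toy data
```

On this invented data M1 (40%) sits well below M2 and M3 (both ~67%), purely because the two most recent cases haven’t resolved yet – the same mechanism that made the real-time ~50% figure misleadingly low.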
The WHO NEJM paper helpfully provides many measures of CFR (see Table 2, or the extensive explanation in the eAppendix if you are following along at home). Their M1 is 38% for cases reported up to September 14, which is even lower than the previously accepted figure: but this isn’t surprising once the epidemic has taken off and each week brings more cases than were seen in the previous month. Their M2 is 71% – the headline figure in the press – based on all cases to September 14 with a definitive outcome. They don’t provide an M3, but if they had looked at the final outcomes for those symptomatic before, say, August 18 (so allowing for incubation and symptomatic periods to have almost certainly passed) that is the number they would have got. Anyway, it’s worth noting that the 71% is almost exactly in line with the most recent calibration of Maia Majumder’s model (see also a presentation on this work) and slightly lower than the 75-85% I noted last week. And people working in the field seem to feel that this number is credible.
One side note: Sierra Leone appears to have a far lower CFR than Guinea or Liberia, based on raw WHO figures. The closest thing to an explanation I have heard to date comes from Ian Mackay, who noted in a series of tweets earlier this week that SL uses a different definition of an Ebola-related death than the other nations:
I’m not clear if this definition is being passed on to WHO (who take their data exclusively from government sources), but if so it might explain something. Or there may be another explanation: behavioural, data quality or something else…
B. Mapping the epidemic
Our guest topic for the week is maps (because who doesn’t like maps – they convey so much information so quickly). Specifically, I wanted to list out some of the sources I’ve seen that map the epidemic, often in real time. Also, don’t forget that there are many sites doing real-time epidemic curves and similar (see several blogs I mentioned last week for starters). But this short list is about geography, and the display of data from different sources.
- WHO Ebola maps: Datasource: national government reports. These are the maps you may well have seen online, and in the NEJM. As a bonus, the grey/red overlays give you a temporal sense of where the epidemic has cooled off, and where it is hot right now. UNICEF had a slightly different approach which was also very readable, but I haven’t seen any recent material from them; given their focus on prevention, including social mobilization and water/sanitation, this may represent the UN understandably divvying tasks to the most appropriate agency.
- Healthmap maps: Datasource: various public media sources, scraped from the web. Healthmap has been using this approach to track/predict flu for some time, and has expanded into dengue, vaccination and now haemorrhagic fever. While there is concern that it may not pick things up all the time (e.g. this piece on non-English language news), this automated approach avoids the need for active data requests.
- Crowdsourced maps for action: The opposite of Healthmap, in some senses; people actively contributing to a central dataset. The first of these I saw was Cedric Moro’s e-tracking via OpenStreetMaps, based on a WhatsApp group that reports suspected cases or other activity. More recently I found the Humanitarian OSM team’s work using the same infrastructure. The latter project is aimed squarely at providing real-time data for people moving around – I can only imagine this links to contact tracing or other control measures.
C. Stopping the epidemic
I would love to write lots about developments on treatments and vaccines, and on the health systems shortfalls and improvements. But others have this covered better than I (for example, CIDRAP produces a well-curated feed of Ebola-related news), and this post is getting unwieldy. However, I will comment a bit on:
1. Behaviour change.
There have been several calls this week for commitment to behaviour change. Given my interest in networks I particularly noted this article from NECSI highlighting the role of barriers and quarantine for preventing spread between communities. One on-the-ground programme of which I am aware is the effort by Irish NGO Goal to train policemen to provide effective and compassionate quarantining of affected households. I sense that there has been a shift in mood in recent days, from a rather negative view of travel restrictions/quarantines that was in play earlier in the epidemic to more acceptance of their role. I’m not clear whether this change is due to fear that bottom-up behaviour change won’t stop the burgeoning epidemic, or to the increasingly militaristic, control-based tone of those “fighting” the disease. I’m also not sure how I feel about the change: I felt the earlier anti-restriction language was a little too dismissive, but I’m always wary of movements to restrict individual rights in the name of public health. And while I’m vaguely aware that there have been a lot of education efforts aimed at reducing transmission risk at funerals and within homes, this doesn’t tend to hog the headlines, and so I don’t have a good feel for the relative weight of such work, or how the balance is changing. In conclusion, I look forward to seeing more ideas and approaches in the weeks to come, and hopefully to their discussion in the popular and scientific press.
As I noted last week, Rivers et al. have modelled several interventions already. Majumder et al. also show on Healthmap that very small changes wrought by generic interventions can have considerable impacts on their models – and thus potentially on cases/deaths. However, my impression is that the scramble to put in place interventions – particularly top-down distancing (e.g. quarantines) and improved basic healthcare (e.g. this call for simple acute-care efforts) – is not leaving time or political space for evaluation of the relative benefits of different interventions. The good news, for me, is that we now have several good baseline models which can be filled with realistic assumptions of intervention impact, and thus can provide this evidence very quickly.
2. Historical comparisons. This seems to be becoming a regular slot. This week, two papers on case fatality rates (CFR). First, a brief report by Adam Kucharski and John Edmunds in the Lancet shows a similar CFR pattern in the 1976 Yambuku outbreak to that seen this year: real-time values rose throughout the epidemic unless one allowed for the time between symptoms and outcome (in that case a mean of 7.5 days). And second, a meta-analysis of CFRs for all 20 past Ebola epidemics (full text behind a paywall, I’m afraid). The study finds:
- variation in CFR by strain (Zaire – the current one seen in West Africa – being the highest). I would caution that this may relate to the geographic distribution of outbreaks;
- reduction in CFR within Zaire strain over time. I would caution that this may relate to outbreak management learning curves;
- a mean CFR across outbreaks of 65%, with a 95% confidence range of 55-75%. So that would make the current outbreak’s CFR high, but not abnormally high.
3. Journalism worth reading. For all my pretensions in writing this blog, there are some people who bring everything they write to life so much better than I ever will. So here are some pieces that caught my eye this week.
- Tara Smith (associate professor of Epidemiology at Kent State) has possibly the strongest pedigree out there for science blogging on infectious diseases (cf her personal blog Aetiology). This week Smith wrote an update for Slate on Ebola: required reading.
- A great piece in the Atlantic on the modelling teams at Virginia Tech, MIT and the CDC. Proof that people do sometimes notice serious epidemiology, and even write gripping articles about it.
- There are quite a few first-hand accounts of working in the field out there. This NPR interview with Daniel Bausch was particularly strong, his humanism shining through.
- And a brief overview of the whole situation.
1 It occurred to me after last week that not everyone has their head buried quite so deep in math modelling as I do. So for those of you who don’t read these terms all the time, a quick overview. The basic reproductive rate, R0, is the number of individuals who will get infected (on average) by a single infectious person dropped into a population where everyone else is susceptible to infection (and typically not making any specific efforts to avoid infection). If the number is greater than one, then the epidemic expands; if less than one, it dies out. R0 can change with epidemic setting (it’s the product of the number of contacts you have, how likely each contact is to be susceptible and how likely a single contact is to cause infection; so it might be different in rural Guinea vs Monrovia) but should be invariant for a given epidemic.
Things get more complicated once people have had the infection and recovered, or are vaccinated/otherwise protected, or start taking evasive manoeuvres (e.g. for Ebola, not touching bodies at funerals, not touching other people generally). Now the number of infections generated by each infected person is likely to drop, so now we have an “effective reproductive number” or Rt, which can and will change over time. The threshold of one remains the key to stopping the epidemic. There are bells and whistles, but that’s the basics.
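The threshold at one is easy to see in a toy branching calculation (the numbers are purely illustrative):

```python
def generations(initial_cases, rt, n_generations):
    """Expected cases per generation when each case infects Rt others
    on average. Purely illustrative; ignores depletion of susceptibles."""
    cases = [float(initial_cases)]
    for _ in range(n_generations):
        cases.append(cases[-1] * rt)
    return cases

growing = generations(100, rt=1.5, n_generations=5)    # expands each generation
shrinking = generations(100, rt=0.7, n_generations=5)  # fades toward zero
```

Push Rt below one – by any mix of behaviour change, isolation or immunity – and each generation of cases is smaller than the last, which is the whole game.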
2 When I say meaningful, I mean in terms of how the epidemic is expanding and how much work needs to be done to get it under control – i.e. get Rt < 1 on a consistent basis.
3 In fact, the serial interval can be defined many ways, but the idea is that you measure from a set point in an individual’s infection timeline to the same point in the timeline of those infected by them. So you can measure from infection to infection, or symptomatic to symptomatic, etc. These choices can affect estimates, but not by a lot, and in practice the decision is usually driven pragmatically by data availability.