The Covid-19 pandemic: Failure to predict, failure to report. Part Two -- Attack of the killer preprints (II)
This is the second segment of a post about new studies that are challenging the natural origins hypothesis for Covid-19.
Did the toilets have it?
Please see the first segment of this post before reading this one!
Before discussing the preprints challenging the Worobey and Pekar papers, I want to return briefly to Alina Chan’s preprint, which she just posted on October 28: “Evidence for a proximal origin of SARS-CoV-2 in the wildlife trade is lacking.” Chan prepared this paper long before the Worobey/Pekar papers were preprinted, and so she does not address them directly. However, her preprint is very relevant to the discussion of the market origins hypothesis. Chan’s bottom line is that, in contrast to the SARS outbreak of 20 years ago and the MERS outbreak of 10 years ago, no “intermediate hosts” for Covid-19 (animals that caught the virus from bats and then passed it on to humans) have been identified despite a very concerted search.
As Chan discusses in great detail, this has not been for lack of trying: “tens of thousands of animals have been sampled and tested by independent groups for SARS-CoV-2 or closely related viruses, and none have been found in the Wuhan markets or its wildlife trade supply…Chinese investigators reported that they had tested 80,000 animal samples across 31 Chinese provinces, including from the Wuhan Huanan seafood market, Wuhan city, and elsewhere in Hubei province” but came up empty.
One of the most interesting parts of Chan’s study is her detailed comparison of the efforts to track the origins of SARS-CoV and SARS-CoV-2. Public health officials and other scientists were able to trace not just one, but several zoonotic transfers of the SARS virus in Guangdong province from its intermediate host—most likely palm civet cats—to humans. Moreover, researchers were able to find evidence that a significant number of wildlife traders had antibodies to the SARS virus. No such evidence has surfaced in the case of SARS-CoV-2. It’s as if the virus suddenly emerged in the Wuhan market in late 2019, but has left no trace before or since—not in animals, not in wildlife traders, nowhere and no one.
Now for the preprints directly challenging Worobey and Pekar.
“Statistical challenges for inferring multiple SARS-CoV-2 spillovers with early outbreak phylodynamics” — Washburne et al.
Alex Washburne is also lead author on this study. The Lay Summary:
“It is not known if SARS-CoV-2 spilled over from animals into humans at the Huanan Seafood Market, or arose as a result of research activities studying bat coronaviruses. Two recent papers had claimed to answer this question, but here we show those papers are both inconclusive as they fail to account for biases in how medical managers became alerted to SARS-CoV-2 and how public health authorities sampled early cases. Additionally, key data points conflicting with the authors’ conclusions were improperly excluded from the analysis. The papers’ methods do not justify their conclusions, and the origin of SARS-CoV-2 remains an urgent, open question for science.”
This paper focuses on the “ascertainment bias” that these and other authors believe compromises the conclusions of the Worobey/Pekar papers. In short, Chinese public health officials, early in the Wuhan outbreak, actually used exposure to the Huanan market as part of the case definition, thus possibly overlooking many early cases that were not linked to the market (such cases have nevertheless been characterized).
My own comment on this and similar critiques is that the Worobey/Pekar analyses are based on cases no earlier than December 2019, whereas there is no firm evidence about when the earliest cases actually took place. In fact, the Pekar paper estimates that the first cases could have been as early as October. That missing data problem is made much worse by the refusal of Chinese officials to make records and samples from the earliest cases available to the World Health Organization and other international investigators. The possibility that the two papers are examples of “garbage in, garbage out” cannot be dismissed.
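To make the ascertainment bias concern concrete, here is a toy simulation of my own. It is not taken from any of the preprints, and every number in it is an invented assumption: a hypothetical true share of market-linked cases, and hypothetical detection probabilities that are higher when a case has a market link (as they would be if market exposure were part of the case definition).

```python
import random

def simulate(n_cases=10_000, true_market_share=0.2,
             p_detect_linked=0.9, p_detect_unlinked=0.2, seed=0):
    """Return (true share, observed share) of market-linked cases.

    All parameters are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    total_linked = 0
    detected_linked = 0
    detected_unlinked = 0
    for _ in range(n_cases):
        linked = rng.random() < true_market_share
        if linked:
            total_linked += 1
        # A market link makes a case more likely to be recorded at all.
        p_detect = p_detect_linked if linked else p_detect_unlinked
        if rng.random() < p_detect:
            if linked:
                detected_linked += 1
            else:
                detected_unlinked += 1
    true_share = total_linked / n_cases
    observed_share = detected_linked / (detected_linked + detected_unlinked)
    return true_share, observed_share

true_share, observed_share = simulate()
print(f"true share of market-linked cases:     {true_share:.2f}")
print(f"observed share among recorded cases:   {observed_share:.2f}")
```

Under these made-up settings the expected observed share is 0.18 / (0.18 + 0.16), or about 0.53, even though only about a fifth of the simulated cases have a market link. The sketch says nothing about what actually happened in Wuhan; it only shows how a case definition of this kind can inflate an apparent association.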
“The geospatial data of Worobey et al. statistically links the Wuhan Institute of Virology with the Huanan Seafood Wholesale Market” — Andreas Lisewski, Jacobs University, Bremen, Germany.
This preprint was removed from the original preprint server without the permission of the author, after Science published a version of it as an eLetter in response to the Worobey paper (it can be seen at the bottom of the paper). The author has protested the server’s action. As a convenience for readers, I have converted that eLetter into a pdf file. Lisewski showed that if a different statistical approach is used, with means rather than medians of the data, at least one cluster of early cases could just as likely be associated with the Wuhan Institute of Virology as with the Huanan market. In the original preprint, Lisewski concluded that the authors’ statistical approach “resulted in a selective bias against an important alternative hypothesis that is supported by their own data.” Lisewski also faulted the authors for not considering this alternative explanation of their findings.
The eLetter in Science has an interesting history that I will be able to write about soon.
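For readers wondering how a choice between means and medians can matter at all, here is a minimal sketch of my own. It is not Lisewski's analysis and uses no real data: the coordinates, the two landmarks (`site_a`, `site_b`), and the helper names are all hypothetical. It simply shows that when a set of points is skewed by a few distant ones, the median center and the mean center can end up nearest to different places.

```python
from math import dist
from statistics import mean, median

# Hypothetical planar coordinates in km. These are NOT real case locations.
cases = [(0.0, 0.0), (0.5, 0.2), (-0.3, 0.4), (0.2, -0.5),
         (0.1, 0.1), (9.0, 8.0), (10.0, 9.0)]

# Two hypothetical landmarks, also invented for illustration.
sites = {"site_a": (0.0, 0.0), "site_b": (4.0, 3.5)}

def center(points, stat):
    """Apply a summary statistic to x and y separately."""
    xs, ys = zip(*points)
    return (stat(xs), stat(ys))

def nearest_site(point):
    """Name of the landmark closest to the given point."""
    return min(sites, key=lambda name: dist(point, sites[name]))

for label, stat in (("median", median), ("mean", mean)):
    c = center(cases, stat)
    print(label, "center:", c, "-> nearest:", nearest_site(c))
```

With these invented points, the median center sits nearest `site_a` while the mean center, pulled by the two distant points, sits nearest `site_b`. Which summary is more appropriate for real case data is exactly the kind of question the statisticians are arguing about.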
“Zoonosis at the Huanan Seafood Market: A Critique” — Zhang et al.
This is one of the most extensive examinations of the market origins hypothesis, and comes to a number of conclusions. I highly recommend it. Among its findings: The earliest known case of Covid-19 at the Huanan market was not at or near a wildlife stall; there is no statistical correlation between cases and the locations of wildlife stalls; environmental samples taken after the pandemic began are more consistent with spread from toilets at the market than with spread from wildlife stalls; and, to top it off, “there is no epidemiological evidence indicating any infection of a raccoon dog [a suspected intermediate host] or any other wild or domestic animal, before or during the early pandemic, at any market elsewhere in Wuhan, or even in the rest of China.”
Like Gao et al., these authors conclude that the Huanan market was more likely the site of a super-spreader event from humans rather than the original source of Covid-19.
“Unwarranted exclusion of intermediate lineage A/B SARS-CoV-2 genomes is inconsistent with the two spillover hypothesis of the origin of COVID-19” — Steven Massey et al.
This study is a full-on challenge to the Pekar et al. paper and its conclusions of two spillovers. As the title suggests, the Pekar team excluded a number of genomes not clearly identified with either of the supposed spillover lineages from their analysis, for reasons which this team does not consider valid.
“Statistics cannot prove that the Huanan Seafood Wholesale Market was the early epicenter of the COVID-19 pandemic” — Stoyan and Chiu
Like some of the other preprints, this one is heavy on statistical analysis, which I have no pretensions of understanding with any confidence. But the authors, a statistician and a mathematician, are obviously qualified to conduct this kind of analysis, whether or not their results can be corroborated by other researchers.
“Circular arguments on the origin of SARS-CoV-2” — David Bahry
This short commentary again focuses on the possibility of “ascertainment bias” in the data that the Worobey and Pekar teams used to conclude that the Wuhan market was not only the site of an early spread of the virus, but actually the place where two zoonotic transfers took place.
As I commented above, the authors of these papers have publicly expressed a level of confidence in their findings that seems to go beyond what the evidence can bear (see Worobey’s Tweet at the top of the first segment of this post.) That makes it very important to look at, and take seriously, the limitations which the peer review process required the authors of both papers to include. Let’s take a look:
Study limitations [Worobey]
“There are several limitations to our study. We have been able to recover location data for most of the December-onset COVID-19 cases identified by the WHO mission (7) with sufficient precision to support our conclusions. However, we do not have access to the precise latitude and longitude coordinates of all of these cases. Should such data exist, they may be accompanied by additional metadata, some of which we have reconstructed, but some of which, including the date of onset of each case, would be valuable for ongoing studies. We also lack direct evidence of an intermediate animal infected with a SARS-CoV-2 progenitor virus either at the Huanan market or at a location connected to its supply chain, such as a farm. Additionally, no line list of early COVID-19 cases is available, and we do not have complete details of environmental sampling. However, compared with many other outbreaks, we have more comprehensive information on early cases, hospitalizations, and environmental sampling (7).”
Limitations [Pekar]
“Our analysis of the putative intermediate haplotypes suggests that there remain lineage assignment errors between lineages A and B, particularly of genomes sampled in January and February of 2020, which could influence the precision of the phylogenetic topology and tMRCA inference. We lack direct evidence of a virus closely related to SARS-CoV-2 in nonhuman mammals at the Huanan market or its supply chain. The genome sequence of a virus directly ancestral to SARS-CoV-2 would provide more precision regarding the timing of the introductions of SARS-CoV-2 into humans and the epidemiological dynamics before its discovery. Although we simulated epidemics across a range of plausible epidemiological dynamics, our models represent a time frame before the ascertainment of COVID-19 cases and sequencing of SARS-CoV-2 genomes and thus before when these models could be empirically validated.”
As we have seen, the authors of the preprints (some of which are in submission at journals, and most of which soon will be) see many other limitations. But the statements above suggest that the authors themselves realize their studies are not the last word, and of course there is no last word in science, only continuing research. Unfortunately, that modesty and humility in the face of what we still do not know is often missing in the social media posts of the very authors whose work is legitimately subject to critique before, during, and after the peer review process. That is science, and that is what the public, still reeling from the death and destruction of a devastating pandemic, has the right to expect.