July 23, 2020

After a months-long struggle to contain COVID-19, New York Governor Andrew Cuomo seems confident that he has steered the state to safety at long last. He has hardly shied away from expressing pride over his pandemic policies, spurring headlines that derided his “gaslighting” and his “cringeworthy victory lap.” 

And Cuomo landed squarely in the good graces of White House health advisor Dr. Anthony Fauci, who, on July 18, lauded New York State’s handling of the crisis. 

Fauci declared, “They did it correctly.” 

This interpretation was curious, given that the state’s pandemic prevention efforts have been marred by grave errors (which AIER has previously covered). 

Those errors became the subject of deep scrutiny in a recent academic article entitled, “A Case Study in Model Failure? COVID-19 Daily Deaths and ICU Bed Utilisation Predictions for New York State.” 

Written by a team of mathematicians, biostatisticians, and data scientists from Stanford University, the University of Texas at El Paso, Northwestern University, and the University of Sydney, the paper takes direct aim at the decision support tools––models––that were influential in shaping New York State’s policy response to the disease spread. 

The four models reviewed are those produced by the Institute for Health Metrics and Evaluation (IHME), Youyang Gu, the University of Texas at Austin, and the Los Alamos National Laboratory. Though they were widely cited and enthusiastically implemented, these models fell short: 

Forecasting models have been influential in shaping decision-making in the COVID-19 pandemic. However, there is concern that their predictions may have been misleading. Here we dissect the predictions made by four models for the daily COVID-19 death counts between March 24 and June 5 in New York State, as well as the predictions of ICU bed utilisation made by the influential IHME [Institute for Health Metrics and Evaluation] model. We evaluated the accuracy of the point estimates and the accuracy of the uncertainty estimates of the model predictions … For accuracy of prediction, all models fared very poorly. Only 10.2% of the predictions fell within 10% of their training ground truth, irrespective of distance into the future … For ICU bed utilisation, the IHME model was highly inaccurate; the point estimates only started to match ground truth after the pandemic wave had started to wane. 

As COVID-19 first began to spread, ICU bed predictions were indeed dire. In March, Governor Cuomo declared that New York would require between 18,600 and 37,200 ICU beds to treat the impending wave of COVID-19 cases. Compare that to New York’s reality: 3,000 ICU beds available at the time of Cuomo’s assessment. Had these horrific prophecies come true, the state’s medical system would have been entirely overwhelmed. It was these concerns, driven by model predictions, that informed the strategy that came to be known as “flattening the curve”: attempting to preserve medical resources by blunting the rush of all but the most dire COVID-19 cases to hospitals and healthcare facilities. 

How did the models backing the curve-flattening perform?

The accuracy of point estimates––which is to say, the actual daily death count predictions generated by each model––is evaluated using two metrics: the mean absolute percentage error and the maximum absolute percentage error. The former averages, over the forecast period, the percentage difference between a given model’s prediction for a specific day and the actual result on that day. The latter takes the maximum of those absolute percentage errors for each forecast and each model. 
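To make the two metrics concrete, here is a minimal sketch of how they could be computed; the daily forecast and observed figures below are made-up illustrative numbers, not data from the paper.

```python
# Sketch of the two accuracy metrics described above, applied to
# hypothetical (made-up) daily death forecasts vs. observed counts.

def mean_absolute_percentage_error(predicted, actual):
    """Average of |predicted - actual| / actual over all days, as a percentage."""
    errors = [abs(p - a) / a * 100 for p, a in zip(predicted, actual)]
    return sum(errors) / len(errors)

def max_absolute_percentage_error(predicted, actual):
    """Largest single-day absolute percentage error."""
    return max(abs(p - a) / a * 100 for p, a in zip(predicted, actual))

# Hypothetical one-week forecast vs. observed daily deaths
forecast = [500, 620, 700, 740, 760, 750, 720]
observed = [480, 550, 610, 650, 640, 600, 560]

mape = mean_absolute_percentage_error(forecast, observed)
max_ape = max_absolute_percentage_error(forecast, observed)
print(f"MAPE: {mape:.1f}%   Max APE: {max_ape:.1f}%")
```

A forecast can have a modest average error yet still contain individual days that miss badly, which is why the paper reports both the mean and the maximum.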

The authors of the paper find that “while some models may perform better or worse over subsets of the time frame of interest, no one model clearly dominates throughout [the time period] with respect to either of the metrics.” Across them, only 10.2% of the daily death predictions fall within 10% of the actual outcomes. 

Of the paper’s two major findings, one is nontrivial but unsurprising: the methods of data collection and the verification of data quality exercise great influence over models. Poor data makes good models bad, and bad models worse. A related issue arises:

[e]arly on, Dr. Anthony Fauci, NIAID Director, stated that: “As I’ve told you on the show, models are really only as good as the assumptions that you put into the model. But when you start to see real data, you can modify that model…” An open question raised … is how can one expect quality predictions, if the data are faulty? … Clearly, if the data are suspect, projections may also be sub-optimal. 

The second conclusion, of considerably more gravity, bears repeating:

Models need to be subjected to real-time performance tests, before their results are provided to policy makers and public health officials. In this paper, we provide examples of such tests, but irrespective of which tests are adopted, they need to be specified in advance, as one would do in a well-run clinical trial. 

Only the Los Alamos National Laboratory model “was found to approach the 95% nominal coverage” of its uncertainty intervals, but it was not available when Cuomo was forging his March policy decisions. Furthermore, a model that only gradually becomes accurate is functionally irrelevant, given that it is in the early stages of a pandemic that sound and timely decisions are paramount. 
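The “95% nominal coverage” standard is simple to check empirically: count how often the observed outcome actually falls inside the model’s stated 95% prediction interval. The sketch below uses illustrative numbers, not figures from the paper.

```python
# Sketch of checking a model's 95% prediction intervals against outcomes.
# All interval bounds and actuals below are made-up illustrative values.

def empirical_coverage(lower, upper, actual):
    """Fraction of observations falling inside their predicted interval."""
    hits = sum(1 for lo, hi, a in zip(lower, upper, actual) if lo <= a <= hi)
    return hits / len(actual)

lower  = [400, 450, 500, 520, 500, 460, 400]   # interval lower bounds
upper  = [700, 760, 820, 850, 830, 780, 700]   # interval upper bounds
actual = [480, 550, 610, 900, 640, 600, 560]   # one day falls outside

coverage = empirical_coverage(lower, upper, actual)
print(f"Empirical coverage: {coverage:.0%}")   # nominal target is 95%
```

A model whose “95%” intervals capture the truth far less than 95% of the time is overconfident, which is precisely the failure mode the paper flags in most of the models reviewed.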

Such dire ICU bed utilization predictions never came to pass, at least not to the degree that was so widely feared, but the sense of urgency surrounding them compelled New York State leadership to prepare accordingly. The models were run, their purveyors advised, and policymakers listened. 

Thousands of recovering coronavirus patients were sent to nursing homes, landing in settings that even Cuomo has called “the optimum feeding ground for this virus.” Moving elderly COVID-19 patients back into nursing homes––which produced a contagion effect that has been likened to “fire through dry grass”––was done in the name of flattening the curve, keeping hospital beds and other resources from being overwhelmed as the initial spread proceeded. But devastation followed. As of July 15, 42 percent of all U.S. COVID-19 deaths were linked to nursing homes, amounting to more than 57,000 lives lost––over 6,000 of them in New York. 

Needless to say, there will be no recognition of the role that awful models played in the formulation of practices that extinguished so many innocent lives. Real-time evaluation of prediction models would have allowed policymakers and medical experts to arrive at better prevention methods once their established plan of attack was proven inaccurate––or at the very least, to tailor the rigidity of policy responses to the demonstrated accuracy of the forecasts. But such an approach was clearly absent in New York State. The gap between predicted and actual hospital resource utilization was so great that, had it been tracked in real time, it could have tempered the fervor with which nursing home residents were sent back to those facilities to preserve hospital beds. 

Not long after that decision, adding deep insult to unfathomable injury,

[i]n the chaotic days of late March, as it became clear that New York was facing a catastrophic outbreak of the coronavirus, aides to Gov. Andrew M. Cuomo quietly inserted a provision on Page 347 of New York’s final, voluminous budget bill. Many lawmakers were unaware of the language when they approved the budget a few days later. But it provided unusual legal protections for an influential industry that has been devastated by the crisis: nursing home operators.

In light of these findings, it’s difficult to determine which is more of an affront to the families of the thousands upon thousands of victims of New York State’s handling of COVID-19. Is it the shameless self-aggrandizing and celebration by Cuomo, whose policies led to the devastation of the elderly in long-term care facilities? Is it Dr. Anthony Fauci’s commendation of those policies? Or is it the legal protection of the entities that abetted such widespread demise? 

Ultimately, there is plenty of blame to go around. But one thing is certain: the deep human costs of this virus and its corresponding policies have been exacerbated by blind hubris and unfounded pride––which, to no small extent, saw their expression in the conceit of rule by models. Though it might be too little too late, diligent reflection and readjustment in future pandemic combat efforts will be the only way to honor those who were lost so unnecessarily. 

Peter C. Earle

Peter C. Earle, Ph.D., is a Senior Research Fellow who joined AIER in 2018. He holds a Ph.D. in Economics from l’Université d’Angers, an MA in Applied Economics from American University, an MBA (Finance), and a BS in Engineering from the United States Military Academy at West Point.

Prior to joining AIER, Dr. Earle spent over 20 years as a trader and analyst at a number of securities firms and hedge funds in the New York metropolitan area, and consulted extensively in the cryptocurrency and gaming sectors. His research focuses on financial markets, monetary policy, macroeconomic forecasting, and problems in economic measurement. He has been quoted by the Wall Street Journal, the Financial Times, Barron’s, Bloomberg, Reuters, CNBC, Grant’s Interest Rate Observer, NPR, and in numerous other media outlets and publications.


Fiona Harrigan

Fiona was a Research Intern for AIER.

She is currently an associate contributor for Young Voices. Her writing has been featured in the Wall Street Journal, the Orange County Register, and various other national and local outlets. Prior to joining AIER, she worked for the Foundation for Economic Education.
