The Coronavirus epidemic may soon produce the greatest American disaster since our Civil War over 150 years ago, and numbers reveal the possible magnitude.

For example, *New York Times* columnist Nicholas Kristof on Sunday reported the disheartening analysis of Dr. Neil Ferguson of Britain, one of the world’s leading epidemiologists. According to Dr. Ferguson, the “best case” scenario is that the Coronavirus will kill over a million Americans.

Other possible projections are far worse. Last week, California Gov. Gavin Newsom issued orders completely locking down his entire state in order to halt the spread of the disease. He justified this decision by explaining that experts had warned him that without drastic changes in behavior, over 25 million Californians would become infected over the following couple of months. Such a calamity would obviously have led to a total collapse of the state’s health care system, probably resulting in more than a million deaths. A million dead Californians by early summer…

The key to understanding the terrible danger of the Coronavirus is that there is no existing immunity and the disease is highly contagious. Therefore, under ordinary circumstances, the number of infected individuals tends to double every 3-6 days. A doubling-time of 3 days would multiply the ranks of the infected a thousand-fold during the course of a single month.
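The thousand-fold figure follows directly from the doubling arithmetic. As a minimal sketch, assuming a 30-day month and uninterrupted exponential growth (the function name `growth_factor` is just an illustrative label):

```python
def growth_factor(days: float, doubling_days: float) -> float:
    """Multiplicative growth over `days` if cases double every `doubling_days`."""
    return 2 ** (days / doubling_days)


# Compare the two ends of the 3-6 day doubling range over a 30-day month.
for doubling_days in (3, 6):
    factor = growth_factor(30, doubling_days)
    print(f"doubling every {doubling_days} days -> x{factor:,.0f} in 30 days")
```

With a 3-day doubling time there are ten doublings in a month, giving a factor of 1,024. With a 6-day doubling time there are only five, giving a factor of 32.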

Thus, in late February Italian political leaders hardly regarded the virus as a serious national threat, but within just a few weeks much of the Italian health care system had collapsed and many thousands of Italians were dead. Despite a full national lockdown, the number of deaths in Italy has continued to rise exponentially.

Similarly, New York reported its first death on March 14th. Yet just ten days later, deaths in that state were running at 50 per day, and rapidly accelerating.

Unfortunately, although numbers are absolutely crucial for our efforts to combat this dread disease, we lack accurate American data, especially with regard to the rate of infection.

The problem is that any program of massive, widespread testing is completely impossible given our lack of sufficient testing resources. Meanwhile, the Coronavirus has a substantial latency period during which victims experience no symptoms, and even afterward many cases are quite mild or even completely asymptomatic, leaving the infected unaware of their condition. So at present, testing has been confined to just a tiny sliver of the population, ensuring that the reported numbers of those infected represent a severe undercount. But we are left to guess just how severe.

However, the Coronavirus death statistics are certainly far more solid and reliable, and I soon noticed a simple and easy means of reasonably estimating Coronavirus infections from Coronavirus deaths. But although the methodology seemed obvious to me, after I described it on a comment-thread yesterday, I encountered quite a bit of initial confusion and disagreement, suggesting that the idea was not nearly as obvious as I had assumed. So on the off-chance that some people might be unfamiliar with the method, I’ve decided to outline it below.

Let us note three crucial Coronavirus parameters, which have already been estimated by medical experts although they are obviously dependent upon particular conditions.

- The infection doubling period – probably 3-6 days.
- The mortality rate – perhaps 1% prior to the collapse of the local health system.
- The typical mortality period (time between infection and death) – according to some estimates, around 3 weeks.

Now consider a Coronavirus death. If we assume a mortality rate of 1% and a three week interval between infection and death, we can therefore estimate that there had been 100 new infections three weeks earlier. Next, if we assume a doubling-period of 6 days, those 100 infections would have increased to 100 * 2^(21/6) = over 1000 infections by the time the death occurred.

Therefore, under these particular assumptions (along with a few simplifications), the true number of total infections can be estimated at over 1000x the number of deaths.
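The two steps of this reasoning can be written out explicitly. This is only a sketch of the arithmetic under the assumptions stated above (1% mortality, 21 days from infection to death, 6-day doubling). The variable names are illustrative.

```python
# Assumed parameters, taken from the rough estimates discussed above.
mortality_rate = 0.01          # 1% of infections end in death
mortality_period_days = 21     # infection-to-death interval
doubling_period_days = 6       # infections double every 6 days

deaths = 1

# Step 1: one death implies about 1 / 0.01 = 100 infections three weeks earlier.
infections_three_weeks_ago = deaths / mortality_rate

# Step 2: those infections kept doubling during the three weeks (21/6 = 3.5 doublings).
growth = 2 ** (mortality_period_days / doubling_period_days)
infections_now = infections_three_weeks_ago * growth

print(f"growth over {mortality_period_days} days: x{growth:.2f}")
print(f"estimated infections per death: {infections_now:.0f}")
```

The growth factor comes out at roughly 11.3, so each death corresponds to a little over 1,100 infections under these assumptions.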

Let’s apply this methodology to a real-life situation. On 3/23/20, New York reported 53 new Coronavirus deaths, bringing the total to 210. This suggests that the true number of new infections that day may have been over 50,000, raising the total infected to more than 200,000. These estimates are **eight to ten times larger** than the officially-reported Coronavirus totals for New York, namely 4,750 and 25,665.
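For readers who want to check the New York figures, here is a rough sketch using the death counts quoted above together with the same assumed parameters (1% mortality, 21-day mortality period, 6-day doubling period). The variable names are illustrative, and the results are no more precise than those assumptions.

```python
# Assumed parameters (the same rough estimates used in the text).
mortality_rate = 0.01
mortality_period_days = 21
doubling_period_days = 6

# New York death counts for 3/23/20, as quoted in the text.
new_deaths = 53
total_deaths = 210

# Estimated infections per death under the assumed parameters.
per_death = (1 / mortality_rate) * 2 ** (mortality_period_days / doubling_period_days)

estimated_new_infections = new_deaths * per_death
estimated_total_infections = total_deaths * per_death

print(f"estimated new infections:   {estimated_new_infections:,.0f}")
print(f"estimated total infections: {estimated_total_infections:,.0f}")
```

The unrounded results are roughly 60,000 new infections and just under 240,000 total, in line with the "over 50,000" and "more than 200,000" figures given above.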

This estimate of New York infections relied upon a doubling period of six days, and it is quite possible that greater “social distancing” together with the recent lockdown imposed in that state may have considerably increased that parameter, thereby reducing the correct number of infections. But I strongly suspect that the figures provided above are still far closer to the truth than those officially reported by the New York authorities. And there is obviously a huge practical difference between assuming 25,000 infected New Yorkers and believing the true figure is already closer to 200,000.

Here’s another example. The official estimate of Coronavirus infections in Louisiana is currently less than 1,400, but this estimation method suggests that the true total is nearly 50,000, a figure that has vastly different policy implications. While I would put little weight in the precise accuracy of that estimate, I strongly believe it’s much closer to reality than the tiny official total.

Summarizing things, the formula for estimating infections is:

**Number of Infected = (Number of Deaths / Mortality_Rate) * 2^(Mortality_Period / Doubling_Period)**
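The formula translates directly into a small reusable function. This is a sketch rather than a validated epidemiological model. The function name, argument names, and default values are illustrative choices, with the defaults taken from the rough parameter estimates used earlier.

```python
def estimate_infections(
    deaths: float,
    mortality_rate: float = 0.01,        # assumed: 1% of infections are fatal
    mortality_period_days: float = 21,   # assumed: infection-to-death interval
    doubling_period_days: float = 6,     # assumed: infection doubling period
) -> float:
    """Estimate current infections from a death count, per the formula above."""
    if not 0 < mortality_rate <= 1:
        raise ValueError("mortality_rate must be in (0, 1]")
    if doubling_period_days <= 0:
        raise ValueError("doubling_period_days must be positive")
    return deaths / mortality_rate * 2 ** (mortality_period_days / doubling_period_days)


# One death under the default assumptions, then with a faster 3-day doubling.
print(round(estimate_infections(1)))
print(round(estimate_infections(1, doubling_period_days=3)))
```

Halving the doubling period from six days to three raises the estimate more than tenfold, which shows how heavily the result depends on that single parameter.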

It’s important to recognize that the parameters used may need to be sharply readjusted based upon particular circumstances.

For example, once a health care system collapses, the death rate probably spikes to around 5%, greatly changing the calculation. Similarly, once government lockdowns or other similar measures are taken, the doubling-period of the infection becomes much longer.
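To see how much these adjustments matter, the sketch below tabulates the estimated infections per death for a few parameter combinations. The 1% and 5% mortality rates and the 3- and 6-day doubling periods come from the discussion above. The 12-day doubling period is purely a hypothetical stand-in for a post-lockdown value, not an estimate from any source.

```python
def infections_per_death(mortality_rate: float, doubling_period_days: float,
                         mortality_period_days: float = 21) -> float:
    """Estimated current infections implied by a single death."""
    return (1 / mortality_rate) * 2 ** (mortality_period_days / doubling_period_days)


mortality_rates = (0.01, 0.05)   # before / after a health-system collapse
doubling_periods = (3, 6, 12)    # 12 days is a hypothetical lockdown-era value

for rate in mortality_rates:
    for doubling in doubling_periods:
        multiplier = infections_per_death(rate, doubling)
        print(f"mortality {rate:.0%}, doubling {doubling:>2} days "
              f"-> about {multiplier:,.0f} infections per death")
```

Across these combinations the multiplier ranges from under 100 to more than 12,000, so the estimate is only as good as the parameters chosen for the particular circumstances.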

Under other circumstances, if a substantial fraction of the deaths are the elderly residents of nursing homes (as was the case in Washington State), the assumed death-rate would be much higher and the doubling-period of the infections generated by the immobile residents lower, significantly altering the appropriate equation.

Finally, this analysis may carry an important policy implication. Suppose that our best and most accurate means of estimating infections is indeed based upon this methodology. We would therefore be using the current death rate to back into the infection rate that occurred three weeks earlier, so that any analysis of the impact of government policies would necessarily be lagged by three weeks.

Under these circumstances, it would be extremely inadvisable for President Trump or other government officials to rescind any of their lockdown or quarantine decisions until at least three weeks had gone by and the impact of these policies upon the rate of infection became fully apparent.