What Happened to Normal Approximations?

Published

January 26, 2026

This semester I will be taking STAT 210B, which goes by the generic name “Theoretical Statistics”. I expect the course to cover empirical process theory and topics like VC dimension, concentration inequalities, and the symmetrization lemma.

My instructor, Prof. Nikita Zhivotovskiy, remarked on the first day of class that the course will lean more toward finite-sample results than asymptotics. Previously, the class had a few more lectures dedicated to asymptotic statistics. He claimed that nowadays finite-sample results are more intuitive to work with: most theorems in ML conference papers are stated and proved in finite-sample form. More generally, he claimed that thinking in terms of asymptotics is an old-school approach in statistics, so much so that one of his asymptotic papers couldn’t find reviewers.

So, what happened to asymptotics? Prof. Lucien LeCam, an early professor of statistics at Berkeley, is often given the title “father of modern asymptotics”. His fundamental result of Local Asymptotic Normality is not taught in the PhD curriculum; I’m curious if it ever was. But if I described to him the size of datasets today, would he be surprised that asymptotics has fallen by the wayside?

A defense of concentration inequalities is that datasets did not only grow longer, they also grew wider: for a dataset described by a matrix with n rows and p columns, n grew but so did p. Many modern results consider the “regime” where \(\frac{n}{p} \to \alpha \in (0,\infty)\). The tools of LeCam were designed for datasets that only grow taller, not for ones that grow diagonally.

I will concede that concentration inequalities are well suited for diagonally growing datasets. However, even when datasets grow only taller and not wider, there is a culture that says a researcher should use concentration inequalities. For instance, research in bandits assumes bounded rewards and then applies Hoeffding’s inequality ad nauseam. At the end, the argument goes, “Machine learning uses lots of data, so let’s take \(n \to \infty\)”.
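For concreteness, this is the inequality in question: for i.i.d. rewards \(X_1, \dots, X_n\) bounded in \([0,1]\) with mean \(\mu\), Hoeffding gives a deviation bound for the sample mean that holds at every finite \(n\), with no appeal to a limit:

\[
\mathbb{P}\!\left( \left| \frac{1}{n}\sum_{i=1}^{n} X_i - \mu \right| \ge t \right) \le 2 \exp\!\left( -2 n t^2 \right), \qquad t > 0.
\]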

A result I came across did exactly this: it used concentration inequalities and then let \(n \to \infty\). I wondered: could we instead let \(n \to \infty\) first, appeal to normality, and then do the analysis? I tried it, and the result was the same, if not a bit stronger. Maybe I’ll circle back to this once I decide whether it’s of any importance.
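Here is a minimal sketch of the kind of comparison I mean (not the specific result above, just the generic gap): for i.i.d. samples bounded in \([0,1]\), compare the half-width of a 95% confidence interval obtained from Hoeffding’s inequality to the one obtained from a normal approximation with a plug-in standard deviation. The function names and the Beta(2, 5) toy “rewards” are my own illustrative choices, not anything from the paper.

```python
import numpy as np
from scipy import stats

def hoeffding_halfwidth(n, delta=0.05):
    """Half-width of a (1 - delta) CI for the mean of [0, 1]-bounded i.i.d. samples,
    from Hoeffding's inequality: P(|mean - mu| >= t) <= 2 exp(-2 n t^2)."""
    return np.sqrt(np.log(2 / delta) / (2 * n))

def normal_halfwidth(sample, delta=0.05):
    """Half-width of a (1 - delta) CI from the CLT / normal approximation,
    using the plug-in sample standard deviation."""
    z = stats.norm.ppf(1 - delta / 2)
    return z * sample.std(ddof=1) / np.sqrt(len(sample))

rng = np.random.default_rng(0)
for n in [100, 1_000, 10_000]:
    sample = rng.beta(2, 5, size=n)  # toy rewards bounded in [0, 1] (an assumption)
    print(n, round(hoeffding_halfwidth(n), 4), round(normal_halfwidth(sample), 4))
```

Hoeffding has to cover the worst-case variance of a \([0,1]\)-bounded variable (namely \(1/4\)), so its interval is typically wider; the normal approximation adapts to the actual sample variance, which is one way the “asymptotics first” route can come out a bit stronger.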

But I think appealing to normality is nice and underappreciated. I believe that most things in nature, with large enough datasets (especially now), are somewhat normal. I would recommend people ask “can I appeal to normality?” and “what happens if I do?”. Morally, the results should be about the same, and if they’re not, that’s interesting! There’s much more to this, but I think normal approximations share the fate of linear regression: because it is taught in STAT 101, nobody likes to use it. But in reality, it’s pretty good, and in most cases it is the best option.