Figures from the History of Probability and Statistics


John Aldrich, University of Southampton, Southampton, UK. (home)

June 2005. Latest changes October 2012



Notes on the work of

Bernoulli (Jakob) · von Mises · Bernoulli (Daniel) · de Moivre · de Finetti
A further 200+ individuals are mentioned below. Use your browser’s Search to find the person you are interested in. It is also worth searching for the ‘principals’, for they can pop up anywhere.


The entries are arranged chronologically, so the document can be read as a story. Date markers (1650-1700, 1700-50, and so on) head each period, with people placed according to the date of their first impact. Do not take the placings too seriously and remember that a career may last more than 50 years! At each marker there are notes on developments in the following period. There is more about Britain and about economics than there should be, but I know more about them.


For further on-line information there are links to


·         Earliest Uses (Words and Symbols) for details (particularly detailed references) on the topics to which the individuals contributed. (The Words site is organised by letter of the alphabet. See here for a list of entries)

·         MacTutor for fuller biographical information on the ‘principals’ (all but three) and on a very large ‘supporting’ cast. The MacTutor biographies also cover the work the individuals did outside probability and statistics. The MacTutor and References links are to these pages. There is an index to the Statistics articles on the site.

·         ASA Statisticians in History for biographies of mainly recent, mainly US statisticians.

·         Life and Work of Statisticians (part of the Materials for the History of Statistics site) for further links, particularly to original sources.

·         Oscar Sheynin’s Theory of Probability: A Historical Essay An account of developments to the beginning of the twentieth century, particularly useful for its coverage of Continental work on statistics.

·         Isaac Todhunter’s classic from 1865, A History of the Mathematical Theory of Probability from the Time of Pascal to that of Laplace, for detailed commentaries on the contributions from 1650 to 1800. The coverage is extraordinary and the entries are still interesting—even their humourlessness has a certain charm.

·         The Mathematics Genealogy Project, abbreviated MGP, which is useful for tracking modern scholars. The PhD degree is a relatively recent development and in the UK a very recent one. See my The Mathematics PhD in the UK.

·         Wikipedia for additional biographies. This is an uneven site but it has some useful articles.


The entries contain references to the following histories and books of lives. See below for more literature.

·         Ian Hacking The Emergence of Probability, Cambridge, Cambridge University Press 1975. (contents)

·         Stephen M Stigler The History of Statistics: The Measurement of Uncertainty before 1900, Cambridge, MA: Harvard University Press 1986. (contents + bibliography)

·         Anders Hald A History of Probability and Statistics and Their Applications before 1750, New York: Wiley 1990. (contents)

·         Anders Hald A History of Mathematical Statistics from 1750 to 1930, New York: Wiley 1998. (contents + bibliography)

·         Jan von Plato Creating Modern Probability, Cambridge: Cambridge University Press, 1994. (contents)

·         Leading Personalities in Statistical Sciences from the Seventeenth Century to the Present (ed. N. L. Johnson and S. Kotz) 1997. New York: Wiley. Contains around 110 biographies and is based on entries in the Encyclopedia of Statistical Science (ed. N. L. Johnson and S. Kotz). Abbreviated LP.

·         Statisticians of the Centuries (ed. C. C. Heyde and E. Seneta) 2001. New York: Springer. Contains 105 biographies. The coverage is restricted to individuals born before 1900. Abbreviated SC.

·         Encyclopedia of Social Measurement (ed. K. Kempf-Leonard) 2004. New York: Elsevier (description). Contains numerous biographies. Abbreviated ESM.



On the web (see also online biblios and texts below)

·         Stochastikon Encyclopedia. Articles in English and German.

·         Portraits of Statisticians on the Materials for the History of Statistics site.

·          Sources in the History of Probability and Statistics by Richard J. Pulskamp.

·         Tales of Statisticians. Vignettes by E. Bruce Brooks.

·         History of Statistics and Probability. 18 short biographies from the University of Minnesota Morris.

·         Glimpses of the Prehistory of Econometrics. Montage by Jan Kiviet.

·         Probability and Statistics Ideas in the Classroom: Lesson from History.  Comments on the uses of history by D. R. Bellhouse.

·         The History of Statistics in the Classroom.  Thumbnail sketches of Gauss, Laplace and Fisher by H. A. David.

·         Milestones in the History of Thematic Cartography, Statistical Graphics, and Data Visualization. Encyclopaedic coverage by M. Friendly & D.J. Denis.  

·         Actuarial History. A very comprehensive collection of links, created by Henk Wolthuis, not only to actuarial science and demography, but to statistics as well.


To help place individuals I have used modern terms for occupation (e.g. physicist or statistician). For the earlier figures these terms are anachronistic but, I hope, not too misleading. I have not given nationality as people move and states come and go. MacTutor has plenty of geographical information.



1650-1700  The origins of probability and statistics are usually found in this period, in the mathematical treatment of games of chance and in the systematic study of mortality data. This was the age of the Scientific Revolution and the biggest names, Galileo (Materials and Todhunter ch. I, pp. 4-6) and Newton (LP), gave some thought to probability without apparently influencing its development. For an introduction to the Scientific Revolution, see Westfall’s Scientific Revolution (1986).

·    There were earlier contributions to probability, e.g. Cardano (1501-76) gave some ‘probabilities’ associated with dice throwing, but a critical mass of researchers (and results) was achieved only after the discussions between Pascal and Fermat and the publication of the first book by Huygens. Hacking, Chapters 1-5, discusses thinking before Pascal. James Franklin’s The Science of Conjecture: Evidence and Probability Before Pascal (2001) examines this earlier work in depth. A recent issue of the JEHPS is devoted to Medieval probabilities.

·    Statistics in the form of population statistics was created by Graunt. Graunt’s friend William Petty gave the name Political Arithmetic to the quantitative study of demography and economics. Gregory King was an important figure in the next generation. However, the economic line fizzled out. Adam Smith, the most influential C18 British economist, wrote, “I have no great faith in political arithmetic...” Wealth of Nations (1776) B.IV, Ch.5, Of Bounties.

·    A form of life insurance mathematics was developed from Graunt’s work on the life table by the mathematicians Halley, Hudde and de Witt. Many later ‘probabilists’ wrote on actuarial matters, including de Moivre, Simpson, Price, De Morgan, Gram, Thiele, Cantelli, Cramér and de Finetti. In the C20 the Skandinavisk aktuarietidskrift and the Giornale dell'Istituto Italiano degli Attuari were important journals for theoretical statistics and probability. Actuarial questions and friendship with the actuary G. J. Lidstone stimulated the Edinburgh mathematicians, E. T. Whittaker and A. C. Aitken (MGPP), to contribute to statistics and numerical analysis. The C17 work is discussed by Hacking (1975): Chapter 13, Annuities. See also Chris Lewin’s The Creation of Actuarial Science and the other historical links on the International Actuarial Links page. There are historical articles in the Encyclopedia of Actuarial Science. Classics are reprinted in History of Actuarial Science. There is a nice review of the early literature in the catalogue of the Equitable Life Archive.

·    New institutions, rather than the traditional universities, underpinned these developments. In Paris and London private discussion groups, like that of Mersenne, were forerunners of the Académie des Sciences and the Royal Society of London (archives). The latter’s Philosophical Transactions (Gallica) published many important contributions to probability and statistics, including papers by Halley, de Moivre, Bayes, Pearson, Fisher, Jeffreys and Neyman. The Berlin and St. Petersburg academies were formed a bit later. The Royal Society was a forerunner of the modern scientific society, while the continental academies were more like research institutes.


Life & Work has links to the writings of many of these people. For the period generally see Todhunter ch. I-VI (pp. 1-55) and Hald (1990, ch. 1-12). 





Blaise Pascal (1623-1662) Mathematician and philosopher. MacTutor References SC, LP.  Pascal was educated at home by his father, himself a considerable mathematician. The origins of probability are usually found in the correspondence between Pascal and Fermat where they treated several problems associated with games of chance. The letters were not published but their contents were known in Parisian scientific circles. Pascal’s only probability publication was the posthumously published Traité du triangle arithmétique (written in 1654 but published in 1665, after Huygens’s work); this treated Pascal’s triangle with probability applications. Pascal introduced the concept of expectation and discussed the problem of the gambler’s ruin. Pascal’s wager is now often read as a pioneering analysis of decision-making under uncertainty, although it appeared not in his mathematical writings but in the Pensées, his reflections on religion. The last chapter of the Port-Royal Logic pp. 365ff by Pascal’s friends Arnauld and Nicole has a brief treatment of the use of probability in decision making, with an allusion to the wager. See Ben Rogers Pascal's Life & Times, Life & Work, A.W.F. Edwards on the triangle and Todhunter ch. II (pp. 7-21). See also Hald 1990, chapter 5, The Foundations of Probability Theory by Pascal and Fermat in 1654 and Hacking 1975, chapter 7, The Roannez Circle (1654) and chapter 8, The Great Decision (1658?).
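The gambler’s ruin problem that Pascal discussed has a classical closed-form answer. A minimal Python sketch (the formula is the standard modern one, not Pascal’s own solution, and the stakes are invented for illustration): a gambler with stake i plays unit bets against an opponent, the combined capital being N, winning each bet with probability p.

```python
from fractions import Fraction

def ruin_probability(i, N, p):
    """Probability that the gambler (stake i, combined capital N) is ruined."""
    q = 1 - p
    if p == Fraction(1, 2):
        return Fraction(N - i, N)        # symmetric game
    r = q / p
    return (r**N - r**i) / (r**N - 1)    # classical asymmetric formula

print(ruin_probability(5, 10, Fraction(1, 2)))   # even game, equal stakes: 1/2
print(ruin_probability(5, 10, Fraction(9, 19)))  # unfavourable game: more than 1/2
```

With p = 1/2 the ruin probability is simply (N − i)/N: the player with the smaller stake is the more likely to be ruined.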




Christiaan Huygens (1629-94) Mathematician and physicist. MacTutor References SC, LP.  As a youth Huygens was expected to become a diplomat but instead he became a gentleman scientist, making important contributions to mathematics, physics and astronomy.  He was educated at the University of Leiden and at the College of Orange at Breda. He spent 14 years in Paris at the Académie des Sciences. Huygens wrote the first book on probability, a pamphlet really, Van Rekeningh in Spelen van Geluck, translated into Latin by his teacher van Schooten as De Ratiociniis in Ludo Aleae (1657) and then into English as The Value of all Chances in Games of Fortune etc. Huygens drew on the ideas of Pascal and Fermat, which he had encountered when he visited Paris. Much of the book is devoted to calculating the value or, as it would now be called, the expectation of a game of chance. The problems in the book include the gambler’s ruin, and Huygens also treated the hypergeometric distribution. His book was widely read and the first part of Jakob Bernoulli’s Ars Conjectandi is a commentary on it. See Life & Work and Todhunter ch. III (pp. 22-5). See also Hald (1990): Chapter 6, Huygens and De Ratiociniis in Ludo Aleae, 1657 and Hacking (1975), Chapter 11, Expectation. C.J. (Kees) Verduin is constructing an impressive Christiaan Huygens site. See Peter Doyle Hedging Huygens.



No authentic portrait of Graunt is known

John Graunt (1620-74) Merchant. Wikipedia. SC, LP, ESM. Graunt is unique among the figures described here in not having had a university education. He published only one work, Observations Made upon the Bills of Mortality (1662). However, through this work and his friendship with William Petty, he became a fellow of the Royal Society of London and his work became known to savants like Halley. The weekly bills of mortality, which had been collected since 1603, were designed to detect the outbreak of plague. Graunt put the data into tables and produced a commentary on them: he does basic calculations, discusses the reliability of the data and compares the numbers of male and female births and deaths. In the course of Chapter XI, on estimating the population of London, Graunt produces a primitive life table; see the JIA articles by Glass, Renn, Benjamin and Seal. The life table became one of the main tools of demography and insurance mathematics. Halley produced a life table using data from Caspar Neumann (SC) in Silesia. See Life & Work for writings by Graunt, Petty and Halley. See also Todhunter ch. V (pp. 37-43), Hald (1990): Chapter 7, John Graunt and the Observations upon the Bills of Mortality, 1662 and Hacking (1975): Chapter 12, Political Arithmetic 1662.



1700-50  The great leap forward is Hald’s (1990) name for the decade 1708-1718: there were so many important contributions to such a greatly expanded subject. The roots of probability and statistics were quite distinct, but by the early C18 it was understood that the subjects were closely related.

·    Jakob Bernoulli’s Ars Conjectandi, like Arnauld’s Logique (1682) pp. 365ff, suggested a conception of probability broader than that associated with games of chance. Bernoulli’s law of large numbers provided a theoretical link between probability and data. See Hacking (1975): Chapters 16-7.

·    Montmort’s (SC, LP) Essay d'analyse sur les jeux de hazard (1708) and de Moivre’s Doctrine of Chances (1718) produced many new results on games of chance, greatly extending the work of Pascal and Huygens.

·    Arbuthnot’s (SC, LP) 1710 paper An Argument for Divine Providence, taken from the constant Regularity observed in the Births of both Sexes used a significance test (sign test) to establish that the probability of a male birth is not ½. The calculations were refined by 'sGravesande (LP) and Nicholas Bernoulli (LP). Apart from being an early application of probability to social statistics, Arbuthnot’s paper illustrates the close connection between theology and probability in the literature of the time. The work of John Craig provides another example.

·    Consideration of the valuation of a risky prospect, dramatised by the St. Petersburg paradox (formulated by Nicholas Bernoulli in 1713 and discussed by Gabriel Cramer), led to Daniel Bernoulli’s (1738) theory of moral expectation (or expected utility).
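Arbuthnot’s sign-test reasoning above is easy to replay in modern terms (a Python sketch; the 82-year span is from his London christening data, the code is not his): under the hypothesis that male and female births are equally likely, the chance that male christenings exceed female ones in every one of 82 consecutive years is (1/2)^82.

```python
from fractions import Fraction

# Under the null hypothesis p = 1/2, each of the 82 years is an
# independent even-chance event for "more male than female christenings".
years = 82
p_null = Fraction(1, 2) ** years  # probability of 82 male-excess years
print(float(p_null))              # about 2.1e-25
```

A probability this small was, for Arbuthnot, evidence of Divine Providence; in modern language it is a very small p-value against the hypothesis p = ½.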


See Life & Work for writings by Montmort, Euler, Lagrange, etc. For the period see Todhunter ch. VII-X (pp. 56-212), Hald (1990, ch. 1-12) and Hacking Chapters 16-9.





Jakob (James) Bernoulli (1654-1705) Mathematician. MacTutor References SC, LP, ESM.  Eight members of the Bernoulli family have biographies in MacTutor (family tree) and several wrote on probability. The most important contributors were Jakob, Daniel and Niklaus. Jakob and his younger brother Johann were the first of the mathematicians. Jakob studied philosophy at the University of Basel but learnt mathematics on his own. Eventually he became professor of mathematics at Basel. The posthumously published Ars Conjectandi (title page) (1713) was his only probability publication but it was extremely influential. The first part is a commentary on Huygens’s De Ratiociniis. The work was an important contribution to combinatorics: the term permutation originated here. (The combinatorial side of probability remained important and in late C19 Britain probability and combinatorial analysis were taught together in algebra courses.) Bernoulli used the terms a priori and a posteriori to distinguish two ways of deriving probabilities (see posterior probability): deduction a priori (without experience) is possible when there are specially constructed devices, like dice; otherwise probabilities must be deduced from many observed outcomes of similar events. Bernoulli’s theorem, or the law of large numbers, was the work’s most spectacular contribution, but see also the entries for morally certain and binomial distribution. The eponymous Bernoulli trials, numbers and random variable all refer to this Bernoulli and his Ars Conjectandi. See Sheynin ch. 3, Life & Work (which also has links for the contributions of Niklaus and Jean III) and Todhunter ch. VII (pp. 56-77). See Stigler (1986): Chapter 2, Probabilists and the Measurement of Uncertainty. See also Hald (1990): Chapter 15, James Bernoulli and Ars Conjectandi, 1713; Chapter 16, Bernoulli’s Theorem. Sheynin has recently translated Part IV of the Ars Conjectandi.
A new translation of the whole work has appeared: The Art of Conjecturing, translated by Edith Dudley Sylla Amazon. The proceedings of a conference on the Ars Conjectandi are online at the JEHPS: part 1 and part 2.
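Bernoulli’s theorem can be watched at work with a modern simulation (obviously not Bernoulli’s own method; the success probability p = 0.3 and the seed are arbitrary choices): the relative frequency of successes settles towards p as the number of trials grows.

```python
import random

random.seed(1713)  # 1713, the year of Ars Conjectandi; fixed for reproducibility
p = 0.3            # illustrative success probability

for n in (100, 10_000, 1_000_000):
    freq = sum(random.random() < p for _ in range(n)) / n
    print(n, freq)  # the gap |freq - p| tends to shrink as n grows
```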




Abraham de Moivre (1667-1754) Mathematician. MacTutor References SC, LP.  De Moivre came to England from France as a refugee aged about 20 and, although he gained recognition as a mathematician and became a fellow of the Royal Society, he never obtained an academic appointment. De Moivre had read Huygens’s book before leaving France but his first paper on probability was published in 1711. In 1718 he published The Doctrine of Chances: or, a Method of Calculating the Probability of Events in Play (title page). He published other pieces on probability, putting the results into new editions of the Doctrine; further editions appeared in 1738 and 1756. The book began with an influential definition of probability. De Moivre obtained the normal approximation to the binomial distribution (a forerunner of the central limit theorem) and almost found the Poisson distribution. His technical innovations included the use of probability generating functions, which he used to find the distribution of the sum of uniform variables. De Moivre also wrote about life insurance mathematics when analysing annuities; see survival function. Todhunter ranked de Moivre’s contributions very highly: “it will not be doubted that the Theory of Probability owes more to him than to any other mathematician, with the sole exception of Laplace.” (p. 139) Bellhouse & Genest have translated and augmented Maty’s biography of De Moivre: Statistical Science 2007 Project Euclid.  See Sheynin ch. 4, Life & Work and Todhunter ch. IX (pp. 135-93). See Stigler (1986): Chapter 2, Probabilists and the Measurement of Uncertainty. See also Hald (1990): Chapter 22, De Moivre and the Doctrine of Chances 1718, 1738 and 1756; Chapter 25, The Life Insurance Mathematics of de Moivre and Simpson.
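De Moivre’s normal approximation to the binomial is easy to check numerically (a sketch in modern notation; the values n = 100 and k = 55 are arbitrary choices):

```python
import math

n, p = 100, 0.5
k = 55

# Exact binomial probability P(X = k) for X ~ Bin(n, 1/2)
exact = math.comb(n, k) * p**k * (1 - p)**(n - k)

# De Moivre's approximation: a normal density with the binomial's
# mean np and variance np(1-p), evaluated at k
mu, var = n * p, n * p * (1 - p)
approx = math.exp(-((k - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

print(exact, approx)  # the two agree to about three decimal places
```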




Daniel Bernoulli (1700-1782) Mathematician and physicist. MacTutor References SC, LP.  Daniel Bernoulli, a nephew of Jakob Bernoulli, was educated at the University of Basel where his father Johann was a professor. Daniel studied medicine—his father insisted—although subsequently his father agreed to teach him mathematics. Daniel worked in St. Petersburg and at the University of Basel. He wrote nine papers on probability, statistics and demography but is best remembered for his "Exposition of a New Theory on the Measurement of Risk" (1738): his theory of choice was based on moral expectation (or expected utility). The theory had a solution for the St. Petersburg paradox, which had exposed the difference between the mathematical expectation of a prospect and its value to ‘me’: its expectation is infinite but its value to me is not. In a prize-winning paper of 1735 Bernoulli tested for the random distribution of planetary orbits. Bernoulli devised an urn model for treating the diffusion of liquids but, although it was discussed by Laplace, such probabilistic models only became common in the late C19.  Another contribution, the Essai (pp. 1-45) of 1766, was an investigation of the consequences of inoculation against smallpox; see Blower 2004. This paper contained a model of epidemics of the kind that Ross and McKendrick developed in the C20 (below). Bernoulli also described the method known since Fisher as maximum likelihood in a 1777 paper, “The most probable choice between several discrepant observations and the formation therefrom of the most likely induction.”  See Life & Work and Todhunter ch. XI (pp. 213-38). See Hald (1998, passim).
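Both the paradox and Bernoulli’s resolution can be seen numerically (a modern Python sketch; the logarithmic utility is Bernoulli’s idea, but the truncation points are arbitrary): in the St. Petersburg game a fair coin is tossed until heads appears, and a head on toss k pays 2^k.

```python
import math

# Each term of the mathematical expectation is (1/2**k) * 2**k = 1,
# so the partial sums grow without bound.
partial_expectation = sum(0.5**k * 2**k for k in range(1, 41))
print(partial_expectation)  # 40 terms, each worth 1 -> 40.0

# Moral expectation: value the payoff by its logarithm (its utility).
# The series now converges, to 2*ln 2 for this utility.
moral = sum(0.5**k * math.log(2**k) for k in range(1, 200))
print(moral)  # about 1.386
```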



1750-1800  Probability established itself in physical science, with astronomy its most developed branch of application. The most enduring of these applications to astronomy treated the combination of observations. The resulting theory of errors was the most important ancestor of modern statistical inference, particularly of estimation theory.

·    The major mathematician/astronomers, including Daniel Bernoulli, Boscovich, Euler, Lambert, Mayer and Lagrange, treated the problem of combining astronomical observations, “in order to diminish the errors arising from the imperfections of instruments and the organs of sense” in the words of  Thomas Simpson. Simpson introduced the idea of postulating an error distribution. See Hald (1998, Part I Direct Probability, 1750-1805) and Richard J. Pulskamp’s Sources in the History of Probability and Statistics.

·    More tests of significance were developed, mainly for use in astronomy; see Daniel Bernoulli and also John Michell (1767) Crossley, who calculated the odds that the Pleiades is a system of stars and not a random agglomeration. See Hald (1998): Part I, Direct Probability, 1750-1805.

·    Interval statements about the parameter of the Binomial distribution—ancestors of the modern confidence interval—were produced by Lagrange and Laplace in the 1780s.

·    In the 1770s Condorcet started publishing on social mathematics, largely the application of probability to the decisions of juries and other assemblies. His work had a strong influence on Laplace and Poisson. Other French authors from this period included D’Alembert and Buffon; the former is best remembered for his critical remarks on probability and the latter for his needle experiment.

·    An important development in probability theory was work on conditional probability with applications to inverse probability or Bayesian inference by Bayes and Laplace. See Hald (1998): Part II Inverse Probability. 


See Todhunter ch. XI-XIX and Stigler (1986): Part I, The Development of Mathematical Statistics in Astronomy and Geodesy before 1827. For this period and the next see also Lorraine Daston (1988) Classical Probability in the Enlightenment.




No authentic portrait of Bayes is known

(for an unlikely possibility see here)

Thomas Bayes (1702-1761) Clergyman and mathematician. MacTutor References SC, LP, ESM.  Bayes attended the University of Edinburgh to prepare for the ministry but he studied mathematics at the same time. In 1742 Bayes became a fellow of the Royal Society: the certificate of election read “We propose and recommend him as a Gentleman of known Merit, well Skilled in Geometry and all parts of Mathematical and Philosophical Learning and every way qualified to be a valuable Member of the Same.” Bayes wrote only one paper on probability, the posthumously published An Essay towards solving a Problem in the Doctrine of Chances (1763). (For a statement of the “problem”, see Bayes.) The paper was submitted to the Royal Society by Richard Price, who added a post-script of his own in which he discussed a version of the rule of succession. In the paper Bayes refers only to de Moivre and there has been much speculation as to where the problem came from. Bayesian methods were widely used in the C19, through the influence of Laplace and Gauss, although both had second thoughts. Their Bayesian arguments continued to be taught until they came under heavy attack in the C20 from Fisher and Neyman. In the 1930s and ’40s Jeffreys was an isolated figure in trying to develop Bayesian methods. From the 1950s onwards the situation changed when Savage and others made Bayesianism intellectually respectable, and recent computational advances have made Bayesian methods technically feasible. From the early C20 there has been a revival of interest in Bayes himself and he has been much more discussed than ever before. See Bellhouse biography, Sheynin ch. 5, Life & Work and Todhunter ch. XIV (pp. 294-300). See Stigler (1986): Chapter 3, Inverse Probability and Hald (1998): Chapter 8, Bayes, Price and the Essay, 1764-1765. There is a major new biography, A. I. Dale Most Honorable Remembrance: The Life and Work of Thomas Bayes.




Pierre-Simon Laplace (1749-1827) Mathematician and physicist. MacTutor References SC, LP, ESM. Laplace wrote on probability over a period of more than 50 years. His 1774 Mémoire sur la probabilité des causes par les évènemens gave a Bayesian analysis of errors of measurement. His Théorie Analytique des Probabilités (title page) (1812; further editions in 1814, 1820 and 1825) was by far the biggest thing in probability yet. Laplace made many contributions, producing results like the central limit theorem (see also the normal distribution) and developing tools including the probability generating function and the characteristic function. His system was based on classical probability but the superstructure outgrew the foundations. In Britain Laplace was admired by his C19 readers, including De Morgan (review of Théorie Analytique) and Edgeworth, but in the C20 his thought tended to be reduced to the popular Philosophical Essay on Probability and discussion often focussed on debatable items like the rule of succession. Fisher, for instance, had a very contracted view of Laplace’s work. Although Laplace gets far more space in Todhunter than any other author, the coverage of Laplace’s estimation theory ends in 1814. In early C20 France (see Lévy) Laplace seems also to have been largely forgotten. Hald (1998) does justice to Laplace by giving him about 400 pages and by presenting subsequent work as a series of footnotes to Laplace.  See Sheynin ch. 5, Life & Work and Todhunter ch. XX (pp. 464-614). See Stigler (1986): Chapter 3, Inverse Probability; Chapter 4, The Gauss-Laplace Synthesis. Laplace’s works are available on Gallica.



1800-1830  The contrasting figures of Laplace and Gauss dominate this period. Laplace covered the entire range of probability and statistics, while Gauss treated only the theory of errors.

·    Work on the theory of errors reached a climax with the introduction of the method of least squares. The method was published by Legendre in 1805 and within twenty years there were three probability-based rationalisations: Gauss’s Bayesian argument (see uniform prior), Laplace’s argument based on the central limit theorem and Gauss’s Gauss-Markov theorem. Work continued through the C19 with numerous mathematicians and astronomers contributing, including Cauchy (Cauchy distribution), Poisson, Fourier, Bienaymé, Bessel (probable error), Encke, Peters (Peters' method), Lüroth, Robert Ellis, Airy, Glaisher, Chauvenet and Newcomb.  (The Cauchy distribution first appeared as an awkward case for the theory of errors.) Pearson, Fisher and Jeffreys were taught the theory of errors by astronomers. In the C20 astronomers, including Eddington, Kapteyn and Charlier, also investigated the statistical properties of constellations, picking up from the middle of the C18 (above).

·     Gauss found a second important application of least squares in geodesy. Geodesists made important contributions to least squares, particularly on the computational side—not surprisingly, as the calculations could be on an industrial scale. The eponyms Gauss-Jordan and Cholesky (MacTutor) honour later geodesists. Helmert (Helmert's transformation) and Paolo Pizzetti were geodesists who contributed to the theory of errors. At least one important C20 statistician started as a surveyor: Frank Yates, Fisher’s colleague and successor at Rothamsted.

·     In Britain the first census of the population was taken in 1801. It ended a controversy about the size of the population that began when Bayes’s friend Price argued that the population had been falling in the C18. Numerous writers, including Eden, came up with their own estimates.

·    Around this time William Playfair was finding new ways of representing data graphically but nobody was paying attention. Techniques slowly accumulated over the next 150 years without the idea of graphical statistics as a study in its own right gaining ground. That idea is quite recent and mainly associated with Tukey. See Milestones.

·    The age of the academies was over and from now on the main advances took place in universities. The French education system was transformed in the course of the Revolution and the C19 saw the rise of the German university.


See Stigler (1986): Part I, The Development of Mathematical Statistics in Astronomy and Geodesy before 1827 and Hald (1998): Part III The Normal Distribution, the Method of Least Squares and the Central Limit Theorem. See Life & Work. See also L. Daston (1988) Classical Probability in the Enlightenment.





Carl Friedrich Gauss (1777-1855) Mathematician, physicist and geodesist. MacTutor References SC, LP. Gauss is generally regarded as one of the greatest mathematicians of all time and his contributions to the theory of errors were only a small part of his total output. Gauss spent most of his working life at the University of Göttingen, which became the main centre for mathematics in Germany. Initially Gauss was interested in treating astronomical observations but later he became involved in geodesy. Gauss used the method of least squares, for which he gave two rationalisations. The first, in The Theory of the Motion of Heavenly Bodies moving around the Sun in Conic Sections (1809), assumed normal errors and used a Bayesian argument with a uniform prior on the coefficients; the normal or Gaussian curve was derived from the principle of the arithmetic mean. Gauss departed from this Bayesian position in 1816 when he investigated the ‘efficiency’ of different estimators of precision. In the Theory of the combination of observations least subject to error (1821/3) he presented the Gauss-Markov theorem. Gauss’s way of writing the Gaussian distribution is described on Symbols associated with the normal distribution; the associated terminology is described in mean error and modulus. Gauss’s influence on the combination of observations in astronomy and geodesy was very strong. One of his followers was the astronomer Bessel (see probable error). In the early C20 Cambridge astronomers taught Gauss Mark I to Fisher and Jeffreys amongst others.  See Sheynin ch. 9, Life & Work. See Stigler (1986): chapter 4, The Gauss-Laplace Synthesis and Hald (1998): Chapter 21, Gauss’s Theory of Linear Unbiased Minimum Variance Estimation, 1823-1828.
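For a straight-line fit the method of least squares reduces to solving a 2×2 system of normal equations. A minimal Python sketch (the data are invented for illustration; the notation is modern, not Gauss’s):

```python
# Fit y = a + b*x by least squares via the normal equations:
#   n*a  + sx*b  = sy
#   sx*a + sxx*b = sxy
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
sx, sy = sum(xs), sum(ys)
sxx = sum(x * x for x in xs)
sxy = sum(x * y for x, y in zip(xs, ys))

b = (n * sxy - sx * sy) / (n * sxx - sx * sx)  # slope
a = (sy - b * sx) / n                          # intercept
print(a, b)
```

Solving the two equations directly is feasible here; the industrial-scale geodetic computations mentioned above motivated systematic elimination schemes such as Gauss-Jordan and Cholesky.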



1830-1860  This period saw the emergence of the statistical society, which has been on the stage ever since (although the meaning of “statistics” has changed), and the beginning of a philosophical literature on probability. It also saw the beginning of the most glamorous branch of empirical time series analysis: the study of the sunspot cycle.

·    Since the 1830s there have been statistical societies, including the London (Royal) Statistical Society Wikipedia and the American Statistical Association (now the world’s largest). The Statistical Society of Paris was founded in 1860. The International Statistical Institute Wikipedia was founded in 1885, although there had been international congresses from 1853. Statistics were facts about human populations and in France André-Michel Guerry mapped the moral statistics of the country. Quetelet was a catalyst in the formation of the London Society but its intellectual ancestors were not the mathematicians Laplace or Condorcet, but Graunt and, more importantly, John Sinclair, Arthur Young and F. M. Eden, who collected facts about society in the interests of “improvement”. The facts were to complement, or perhaps be an antidote to, the theoretical economics of the day; see the journal’s mission statement. Florence Nightingale’s (ASA and Life & Work) later efforts to make record keeping and the statistical analysis of those records part of hospital routine belonged to this tradition. Among the founders of the society were the mathematician Babbage, who also wrote on insurance and produced a large work, Economy of Machinery and Manufactures, the political economist “Population” Malthus, and the actuary Gompertz, known for his Law of Mortality. The technically most sophisticated work presented to the society in its first decades was William Farr’s analysis of vital statistics. Probability publications do not appear in the Society’s journal until the 1880s, when Edgeworth started publishing. See Mathematics in the London/Royal Statistical Society 1834-1934.

·    Since 1840, or so, there has been a philosophical literature on probability. The English literature begins with the extensive discussion of probability in John Stuart Mill’s System of Logic (1843). This was followed by The Logic of Chance (1866) of John Venn, the Principles of Science (1874) of W. Stanley Jevons and the Grammar of Science (1892) of Karl Pearson. The American scientist/philosopher C.S. Peirce (Stanford) wrote extensively on probability, although he was not much read. There was an overlapping literature on logic and probability. De Morgan can be placed here as well as Boole (LP) whose An Investigation of the Laws of Thought, on which are founded the Mathematical Theories of Logic and Probabilities (1854) contained a long discussion of probability. Later figures are mentioned below. There were German and French literatures as well but philosophical probability was less international than mathematical probability. See Porter’s Rise of Statistical Thinking.

·    In 1843 Schwabe observed that sunspot activity is periodic. There followed decades of research, not only in solar physics but in terrestrial magnetism, meteorology and even economics, examining series to see if their periodicity matched that of the sunspots. Even before the sunspot craze there was intense interest in periodicity in meteorology, tidology and other branches of observational physics and, by the end of the century, seismology was becoming important. Both Laplace and Quetelet had analysed meteorological data and Herschel had written a book on the subject. The techniques in use varied from the simple, such as the Buys Ballot table, to the sophisticated, such as forms of harmonic analysis. At the end of the century the physicist Arthur Schuster introduced the periodogram. However, by then a rival form of time series analysis, based on correlation and promoted by Pearson, Yule, Hooker and others, was taking shape. For an account see J. L. Klein (1997) Statistical Visions in Time, Cambridge: Cambridge University Press.
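The mechanics of Schuster’s periodogram can be sketched in a few lines: compute the intensity of a series at each Fourier frequency and look for peaks. The Python sketch below uses the textbook definition rather than any historical computing scheme, and the data (a sinusoid of 10 cycles plus noise) are invented for illustration.

```python
import math
import random

def periodogram(x):
    """Schuster-style periodogram: intensity at each Fourier frequency k/n."""
    n = len(x)
    intensities = []
    for k in range(1, n // 2):
        a = sum(x[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        b = sum(x[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        intensities.append((k, (a * a + b * b) / n))
    return intensities

# A hidden cycle of 10 complete periods in 200 observations, plus noise.
random.seed(0)
n = 200
series = [math.sin(2 * math.pi * 10 * t / n) + 0.3 * random.gauss(0, 1)
          for t in range(n)]
peak_k, peak_power = max(periodogram(series), key=lambda kv: kv[1])
print(peak_k)  # the periodogram peaks at the true frequency, k = 10
```

Hidden periodicities stand out as large intensities; Schuster’s contribution was also to ask how large an intensity must be before it signifies a real cycle rather than chance.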


See I. Hacking The Taming of Chance (1990) and T. M. Porter The Rise of Statistical Thinking 1820-1900 (1986).





Lambert Adolphe Jacques Quetelet (1796-1874) Astronomer and statistician. MacTutor References SC, LP, ESM.  Adolphe Quetelet studied at the University of Ghent but his career was centred on Brussels. His great energies were first concentrated on establishing an observatory there. In 1824 he went to Paris for three months to learn about such things and met the great French scientists; Quetelet seems to have learnt about probability from Fourier.  He returned an enthusiast for probability and proposed improvements in census taking. His book introducing “social physics”, Sur l'homme et le développement de ses facultés, essai d'une physique sociale (1835), introduced the “average man” (l'homme moyen). His very widely read Letters on Probability (1846) publicised the use of the normal distribution, not as an error law, but for describing the distribution of measurements. Quetelet was a tireless scientific entrepreneur both at home and internationally, e.g. he played an important part in establishing the London Statistical Society (above). His work was not well received by the social scientists of the C19, although his initiative was admired by the economist Stanley Jevons and Galton continued his work. In the early C20 J. M. Keynes (Treatise on Probability) wrote of Quetelet, “There is scarcely any permanent, accurate contribution to knowledge, which can be associated with his name. But suggestions, projects, far-reaching ideas he could both conceive and express, and he has a very fair claim, I think, to be regarded as the parent of modern statistical method.” See Life & Work. Coven’s A History of Statistics in the Social Sciences is a study of Quetelet. See Stigler (1986): Chapter 5, Quetelet’s Two Attempts. Hald (1998) 26.3. Quetelet on the Average Man, 1835, and on the Variation around the Average, 1846.



1860-1880  Two important applied fields opened up in this period. Probability found a major new application in physical science, to the theory of gases, which developed into statistical mechanics. Problems in statistical mechanics were behind many of the probability advances of the early C20. The statistical study of heredity developed into biometry and many of the advances in statistical theory were associated with this subject. There were important geographical changes, as important work in probability started to come from Russia and important statistical work from England.

·    In 1860 James Clerk Maxwell used the error curve (normal distribution) in the theory of gases; he seems to have been influenced by Quetelet via John Herschel’s review of the Letters on Probability. Boltzmann and Gibbs developed the theory of gases into statistical mechanics.

·    Galton inaugurated the statistical study of heredity, work continued well into the C20 by Pearson and Fisher. Correlation was the most distinctive contribution of this “English” school. See Stigler (1986): Part III, A Breakthrough in Studies of Heredity.

·    By contrast, the so-called “continental direction” investigated the appropriateness of simple urn models for treating birth and death rates by considering the stability of the series of rates over time. The key work was Wilhelm Lexis’s Theorie der Massenerscheinungen in der menschlichen Gesellschaft (1877). Bortkiewicz, Markov, Chuprov and Anderson all worked in this tradition. See Stigler (1986): Chapter 6, Attempts to revive the Binomial, C. C. Heyde & E. Seneta I. J. Bienaymé: Statistical Theory Anticipated, 1977 and Sheynin ch. 15.1.

·    ‘Higher’ statistics entered psychology and economics. For psychology see Fechner. In economics W. Stanley Jevons (Wikipedia MacTutor New School) (SC) saw himself as continuing the work of the political arithmeticians of 1650+. In the intervening two centuries much had been done and Jevons’s work on index numbers was inspired by the theory of errors, while his research on economic time series was inspired by the work of meteorologists on seasonal variation and of physicists on the solar cycle and its terrestrial correlates (see above). Jevons also tried to link his mathematical economic theory (see utility) to statistical analysis—a project revived in the econometrics of the C20.


See Stigler (1986) and T. M. Porter The Rise of Statistical Thinking 1820-1900 (1986).





Ludwig Boltzmann (1844-1906) Physicist. MacTutor References. MGP. LP. Boltzmann, with Gibbs, was responsible for transforming Maxwell’s probabilistic theory of gases into statistical mechanics. Boltzmann was awarded a doctorate from the University of Vienna in 1866 for a thesis on the kinetic theory of gases. He had appointments at the universities of Graz and Leipzig as well as Vienna. Statistical mechanics required solutions to problems in distribution theory and also generated conceptual problems.  Boltzmann gave the χ2 distribution for 2 and 3 degrees of freedom (1878) and for n (1881). Ernst Abbe LP, a physicist working in the theory of errors, had already obtained the distribution in 1862 and Pearson was to obtain it again in 1900. One of the conceptual innovations was entropy. Zermelo argued that the irreversibility in Boltzmann’s thermodynamics contradicted the recurrence properties of dynamic systems. Boltzmann’s student Paul Ehrenfest devised the Ehrenfest urn model to show that the contradiction was only apparent. One of Ehrenfest’s students was Uhlenbeck, whose 1930 paper on Brownian motion became part of the probability literature. Boltzmann thought of the proper average values to identify with macroscopic features as being averages over time of quantities calculable from microscopic states. He wished to identify the phase averages with such time averages. In the 1930s Birkhoff and von Neumann produced ergodic theorems that addressed the problem. The Stanford Encyclopedia has 2 relevant articles: J. Uffink “Boltzmann's Work in Statistical Physics” and L. Sklar “Philosophy of Statistical Mechanics.”  See also von Plato passim.



Gustav Theodor Fechner (1801-1877) Wikipedia Physicist and psychologist. SC.  Fechner went to the University of Leipzig to study medicine but did not qualify as a doctor. He developed a strong interest in the relationship between mind and matter and he switched to physics; his impressive experimental work earned him a chair in physics in 1834. In the mid-1850s he started the experiments that formed the basis of his Elemente der Psychophysik (Elements of Psychophysics) (1860). In his psychophysics Fechner emphasised the Weber-Fechner law relating sensation to stimulus. Stigler emphasises the book’s contribution to experimental method and assesses the treatment of experimental design as the “most comprehensive” before Fisher’s Design of Experiments (1935). One of the techniques Fechner introduced was an ancestor of probit analysis. In all this work Fechner used probability ideas that he knew from his work in physics. His second major work was the posthumously published Kollektivmasslehre (1897). The latter anticipates the ideas of von Mises on collectives. Hermann Ebbinghaus was inspired by Fechner’s Psychophysics to start his own experimental work on memory; these researches were published in his Über das Gedächtnis (1885). See Life & Work (for both Fechner and Ebbinghaus) and Stigler (1986): Chapter 5, Psychophysics as a Counterpoint.  See also O. Sheynin (2004) Fechner as a Statistician, British Journal of Mathematical and Statistical Psychology, 57, 53-72.




Francis Galton (1822-1911) Man of science MacTutor References SC, LP, ESM. After studying mathematics at Cambridge University and medicine in Birmingham and London, Galton spent some years exploring Africa; his eminence as an African explorer and geographer led to his election to the Royal Society in 1860. Galton became interested in the phenomena of heredity in the 1860s and most of his contributions to statistics arose out of that study. Apart from his books Hereditary Genius (1869) and Natural Inheritance (1889) he wrote many articles. His cousin Charles Darwin, whom he advised on statistical matters, was an important influence, as was Quetelet, although he disagreed with him on many points. (Like Darwin, Galton was rich enough to be a gentleman scholar.) The normal distribution played an important part in Galton’s work. He is most remembered for introducing the methods of correlation and regression. He often involved other, better, mathematicians, including George Darwin, Glaisher, Watson, MacAlister, Pearson, Edgeworth and Sheppard (of Sheppard's corrections), in his problems. His work on the lognormal distribution and branching processes arose from his posing problems to mathematician friends. Galton contributed a large number of terms to statistics, including many of those used in elementary statistics, e.g. ogive, percentile and inter-quartile range.  Apart from his influence on biometry, Galton had a strong influence on the development of psychology, especially in Britain, both because he wrote about psychology and because the statistical tools he developed were taken up by psychologists like Spearman. See Life & Work. Much of Galton’s vast output is available on Gavan Tredoux’s Francis Galton. There are two recent biographies: N. W. Gillham A Life of Sir Francis Galton: From African Exploration to the Birth of Eugenics (2002) and M. G. Bulmer Francis Galton: Pioneer of Heredity and Biometry (2003). 
See Stigler (1986): Chapter 8, The English Breakthrough: Galton.




P. L. Chebyshev (1821-94) Mathematician. MacTutor References. SC, LP.  Chebyshev was one of the most important of C19 mathematicians and probability formed only a small part of his output. He had predecessors in Russia—notably V. K. Bunyakovsky—and he also drew on the French literature but he was an original probabilist who had a great influence on the development of probability in Russia. Chebyshev studied at Moscow University but spent his working life at the University of St. Petersburg. In 1867 Chebyshev published a paper, On mean values, which made a great advance in proving a generalised form of the law of large numbers. It used Chebyshev's inequality (already stated by his friend Bienaymé). Chebyshev used the “method of moments” in a proof of the central limit theorem for not necessarily identically distributed random variables. He is also remembered for his work on interpolation, which led to Chebyshev polynomials. Among his students were Markov and Lyapunov (Wikipedia). See Life & Work. See Hald (1998): chapter 25, Orthogonalization and Polynomial Regression. Sheynin has translated Chebyshev’s Lectures on the Theory of Probability. See also Sheynin ch. 13.
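Chebyshev’s inequality bounds the probability of a deviation of k standard deviations from the mean by 1/k², whatever the distribution. A minimal Python check, using a fair die as an illustrative distribution:

```python
from fractions import Fraction
from math import sqrt

# Fair die: values 1..6, each with probability 1/6 (illustrative distribution).
values = range(1, 7)
p = Fraction(1, 6)
mean = sum(p * v for v in values)               # 7/2
var = sum(p * (v - mean) ** 2 for v in values)  # 35/12
sigma = sqrt(var)

k = 1.2
threshold = k * sigma
# Exact probability of a deviation of at least k standard deviations.
exact = sum(p for v in values if abs(v - mean) >= threshold)
bound = 1 / k ** 2  # Chebyshev's bound

print(float(exact), bound)  # 0.333... is indeed below the bound 0.694...
```

The bound is crude (here 0.69 against a true probability of 1/3) but its distribution-free character is exactly what Chebyshev needed for his proof of the law of large numbers.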



1880-1900  In this period the English statistical school took shape. Pearson was the dominant force until Fisher displaced him in the 1920s. The school dominated statistics until the Second World War. T. Schweder’s Early Statistics in the Nordic Countries considers why this did not happen in Scandinavia.

·    Galton introduced correlation and a theory was rapidly developed by Pearson, Edgeworth, Sheppard and Yule. Correlation was a major departure from the statistical work of Laplace and Gauss, both as a technique and because of the applications it made possible. It became widely used in biology, psychology and social science.

·    In economics Edgeworth developed some of Jevons’s ideas, most notably on index numbers. However, economic statistics in Britain was more closely tied to official statistics or financial journalism and Newmarch (1820-82) and Giffen (1837-1910) were more representative than Jevons or Edgeworth. In Italy Vilfredo Pareto discovered a statistical regularity in the distribution of income; see Pareto distribution.






F. Y. Edgeworth (1845-1926) Economist and statistician. MacTutor References. SC, LP, ESM.  Edgeworth studied classics at Trinity College Dublin and Balliol College Oxford. From around 1880 he followed dual careers in economics and in statistics. Edgeworth seems to have been self-taught in mathematics and he made a thorough study of the subject and remained very well read. He began in statistics by subjecting the casual statistical methods of Jevons to rigorous examination and started what turned out to be a long involvement with index numbers (see “Money” in his Papers relating to Political Economy, vol. 2). However, most of his extensive publications in statistical theory were not motivated by economic applications, or direct applications of any kind. In 1892 Edgeworth, prompted by Galton, examined correlation and methods of estimating correlation coefficients. Another concern, which led to a stream of papers, was with generalisations of the normal distribution, as in e.g. his 1905 paper “The law of error”. The Edgeworth expansions that came from this research are now associated with distributions of estimators and test statistics but Edgeworth originally envisaged them as descriptions of data distributions, an alternative to the Pearson curves. Edgeworth’s starting point was Laplace. Much of his work was not followed up, like his 1908/9 papers “On the probable errors of frequency-constants” which anticipated some of Fisher’s large sample theory for maximum likelihood. Unlike his contemporaries, Pearson in statistics and Alfred Marshall in economics, Edgeworth founded no school. From 1891 he was professor of political economy at Oxford. He had no students in Statistics and his only follower was Arthur Bowley, whose reputation rests on work in economic statistics and social surveys. See Life & Work and Francis Ysidro Edgeworth. See Stigler (1986): Chapter 9, The Next Generation Edgeworth. His statistics papers are collected in the 3-volume set F. Y. Edgeworth, Writings in Probability, Statistics, and Economics edited by Charles Robert McCann, Jr.





Karl Pearson (1857-1936) Biometrician, statistician & applied mathematician. MacTutor References. SC, LP, ESM. Karl Pearson read mathematics at Cambridge but made his career at University College London. Pearson was an established applied mathematician when he joined the zoologist W. F. R. Weldon and launched what became known as biometry; this found institutional expression in 1901 with the journal Biometrika. Weldon had come to the view that “the problem of animal evolution is essentially a statistical problem” and was applying Galton’s statistical methods. Pearson’s contribution consisted of new techniques and eventually a new theory of statistics based on the Pearson curves, correlation, the method of moments and the chi square test. Pearson was eager that his statistical approach be adopted in other fields and amongst his followers was the medical statistician Major Greenwood. Pearson created a very powerful school and for decades his department was the only place to learn statistics. Yule, Irwin, Wishart and F. N. David were among the distinguished statisticians who started their careers working for Pearson. Among those who attended his lectures were the biologist Raymond Pearl, the economist H. L. Moore, the medical statistician Austin Bradford Hill and Jerzy Neyman; in the 1930s Wilks was a visitor to the department. In France Lucien March was a follower. Pearson’s influence extended to Russia where Slutsky (see minimum chi-squared method) and Chuprov were interested in his work. Pearson had a great influence on the language and notation of statistics and his name often appears on the Words pages and Symbols pages—see e.g. population, histogram and standard deviation. When Pearson retired, his son E. S. Pearson inherited the statistics part of his father’s empire—the eugenics part went to R. A. Fisher. Under ESP (who retired in 1961) and his successors the department continued to be a major centre for statistics in Britain. M. S. Bartlett went there as a lecturer after graduating from Cambridge in 1933 (his teacher was Wishart) and again as a professor when ESP retired. For more on KP see Karl Pearson: A Reader’s Guide. See Stigler (1986): Chapter 10, Pearson and Yule.



1900-1920  In the years before the Great War of 1914-18 probability and statistics were expanding in all directions. During the war research in statistics and probability almost stopped as people went into the armed services or did other kinds of war work. Pearson, Lévy and Wiener worked in ballistics, Jeffreys in meteorology and Yule in administration. For the mathematicians’ traditional role in war, see The Geometry of War.

·    In 1900 David Hilbert proposed a set of problems for the C20. The 6th was, “to treat … by means of axioms, those physical sciences in which mathematics plays an important part; in the first rank are the theory of probabilities and mechanics.” Measure theory which would have a key role in the axiomatisation of probability was being created by Borel, Lebesgue and others—see below. 

·    From different subjects came contributions that eventually found a place in the theory of stochastic processes. In physics Einstein and Smoluchowski (see Cohen’s History of Noise) worked on Brownian motion. Bachelier (see Bru & Taqqu) developed a similar model applied to financial speculation—that application was a sleeper until the 1970s. The actuary Lundberg developed a theory of collective risk.  Malaria and the migration of mosquitoes were behind Pearson’s interest in the random walk problem. Mathematical models of epidemics were developed by Ronald Ross and A. G. McKendrick MacTutor without reference to the earlier work of Daniel Bernoulli.

·    Mendel did not use probability in his work on genetics (published 1866) but his ideas were probabilised as Pearson, Yule and Fisher investigated how far his principles could rationalise the findings of the biometricians.  

·    Correlation began to be important in psychology, largely through Charles Spearman (1863-1945). Amongst his contributions to statistics were rank correlation and factor analysis. Godfrey Thomson was a severe critic of Spearman’s factor analysis of intelligence. In the 1930s Louis L. Thurstone developed a multiple factor analysis.

·    In economics, especially in the United States, quantitative methods became more prominent. The most important figures were Warren Persons, Irving Fisher, Wesley Mitchell and H. L. Moore. Most of their work would now be classified as time series analysis.

·    Industrial applications of probability begin with Erlang’s work on congestion in telephone systems, the ancestor of modern queuing theory.

·    Institutional developments include the creation in 1911 of the Department of Applied Statistics at UCL headed by Pearson. Also in London, at the London School of Economics, Bowley became the first (full-time) Professor of Statistics in Britain. At Cambridge a University Lectureship in Statistics was created in 1912. Yule, who got the job, might be called the first modern statistician—his expertise was in statistics (rather than in mathematics more broadly or in science like astronomy or biology) and he applied this expertise to anything that interested him.


See Hald (1998, Part IV) and von Plato (ch. 3) “Probabilities in Statistical Physics.” For developments in economics see M. S. Morgan A History of Econometric Ideas, Cambridge 1990.





G. Udny Yule (1871-1951) Statistician. MacTutor References. Wikipedia SC, LP, ESM.  After training as an engineer at University College London and studying in Germany, Yule returned to work for Karl Pearson in 1893. He was soon contributing to the theory of correlation and regression and after 1900 he developed a parallel theory of association for attributes. Yule applied Pearson’s statistical techniques to social problems—see e.g. the entry on Pearson curves—and, with Edgeworth and Bowley, he was one of the few members of the Royal Statistical Society interested in mathematical statistics. Yule had broad interests and his collaborators included the agricultural meteorologist R. H. Hooker, the medical statistician Major Greenwood (see negative binomial distribution for their study of the incidence of disease and accidents) and the agriculturalist (Sir) Frank Engledow. Yule’s sympathy towards the newly rediscovered Mendelian theory of genetics led to several papers; one involved developing the minimum chi-squared method of estimation.  In the 1920s he wrote important papers on time series analysis: “On the time-correlation problem” (1921) was a critique of the variate difference method; “Why Do We Sometimes Get Nonsense Correlations between Time-series?” (1926) investigated a form of spurious correlation; “On a Method of Investigating Periodicities in Disturbed Series, with Special Reference to Wolfer's Sunspot Numbers” (1927) used an autoregressive model in place of the usual periodic trend of harmonic analysis; see periodogram analysis (above) and Yule-Walker equations. Yule taught at Cambridge for nearly 20 years yet he seems to have had little influence on the students there. His successor, John Wishart, had more impact. Although Cambridge was the outstanding centre in Britain for training mathematicians, statistics was slow in becoming established there: see Cambridge history. 
Yule’s Introduction to the Theory of Statistics (1910) was influential and widely-used, especially after it was updated by Maurice Kendall in 1937.  See Stigler (1986): Chapter 10, Pearson and Yule.
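Yule’s autoregressive scheme replaces the fixed periodic trend of harmonic analysis with a stochastic difference equation whose coefficients can be estimated from the first two autocorrelations via the Yule-Walker equations. The following Python sketch simulates a second-order autoregression and recovers its coefficients; the coefficient values are illustrative (chosen near those Yule reported for the sunspot series), and the simulation is not his data.

```python
import random

random.seed(1)
phi1, phi2 = 1.34, -0.65   # illustrative AR(2) coefficients, near Yule's sunspot fit
n = 5000
x = [0.0, 0.0]
for _ in range(n):
    # x_t = phi1*x_{t-1} + phi2*x_{t-2} + disturbance
    x.append(phi1 * x[-1] + phi2 * x[-2] + random.gauss(0, 1))
x = x[500:]  # drop burn-in so the series is effectively stationary

def autocorr(series, lag):
    m = sum(series) / len(series)
    c0 = sum((v - m) ** 2 for v in series)
    return sum((series[t] - m) * (series[t + lag] - m)
               for t in range(len(series) - lag)) / c0

r1, r2 = autocorr(x, 1), autocorr(x, 2)
# Yule-Walker equations for AR(2):  r1 = phi1 + phi2*r1,  r2 = phi1*r1 + phi2
est2 = (r2 - r1 * r1) / (1 - r1 * r1)
est1 = r1 * (1 - est2)
print(round(est1, 2), round(est2, 2))
```

The estimates should land close to the coefficients used in the simulation, which is the essence of Yule’s “disturbed harmonic” fitting.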




A. A. Markov (1856-1922) Mathematician. MacTutor References. SC, LP.  Markov spent his working life at the University of St. Petersburg. Markov was, with Lyapunov, the most distinguished of Chebyshev’s students in probability. Markov contributed to established topics such as the central limit theorem and the law of large numbers. It was the extension of the latter to dependent variables that led him to introduce the Markov chain. He showed how Chebyshev’s inequality could be applied to the case of dependent random variables. In statistics he analysed the alternation of vowels and consonants as a two-state Markov chain and did work in dispersion theory. Markov had a low opinion of the contemporary work of Pearson, an opinion not shared by his younger compatriots Chuprov and Slutsky. Markov’s Theory of Probability was an influential textbook. Markov influenced later figures in the Russian tradition including Bernstein and Neyman. The latter indirectly paid tribute to Markov’s textbook when he coined the term Markoff theorem for the result Gauss had obtained in 1821; it is now known as the Gauss-Markov theorem. J. V. Uspensky’s Introduction to Mathematical Probability (1937) put Markov’s ideas before an American audience. See Life & Work. There is an interesting volume of letters, The Correspondence between A.A. Markov and A.A. Chuprov on the Theory of Probability and Mathematical Statistics ed. Kh.O. Ondar (1981, Springer). See also Sheynin ch. 14 and G. P. Basharin et al.  The Life and Work of A. A. Markov. 
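Markov’s vowel/consonant analysis amounts to a two-state chain whose stationary distribution reproduces the overall letter frequencies. A small Python illustration; the transition probabilities are close to figures often quoted from his count of letters in Pushkin’s Eugene Onegin, but should be treated as illustrative rather than as his exact published values.

```python
# Two-state chain: state 0 = vowel, state 1 = consonant.
# Transition probabilities are illustrative, close to figures often quoted
# from Markov's count of 20,000 letters of Pushkin's Eugene Onegin.
p_vv = 0.128   # P(next letter is a vowel | current letter is a vowel)
p_cv = 0.663   # P(next letter is a vowel | current letter is a consonant)

# The stationary probability of a vowel solves pi = pi*p_vv + (1 - pi)*p_cv.
pi_vowel = p_cv / (1 - p_vv + p_cv)
print(round(pi_vowel, 3))  # 0.432, matching the overall vowel frequency
```

The point of the exercise, for Markov, was that the law of large numbers still operates even though successive letters are strongly dependent.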




‘Student’ = William Sealy Gosset (1876-1937) Chemist, brewer and statistician. MacTutor References.  Wikipedia SC, LP. Gosset was an Oxford-educated chemist who worked not in a university but for Guinness, the Dublin brewer. Gosset taught himself the theory of errors from the textbooks by Mansfield Merriman and Airy. His career as a publishing statistician began after he studied for a year with Karl Pearson. In his first published paper ‘Student’ (as he called himself) rediscovered the Poisson distribution. In 1908 he published two papers on small sample distributions, one on the normal mean (see Student's t distribution and Studentization) and one on normal correlation (see Fisher’s z-transformation). Although Gosset’s fame rests on the normal mean work, he wrote on other topics, e.g. he proposed the variate difference method to deal with spurious correlation. His work for Guinness and the farms that supplied it led to work on agricultural experiments. When his friend Fisher made randomization central to the design of experiments Gosset disagreed—see his review of Fisher’s Statistical Methods. Gosset was not very interested in Pearson’s biometry and the biometricians were not very interested in what he did; the normal mean problem belonged to the theory of errors and was more closely related to Gauss and to Helmert than to Pearson. Gosset was a marginal figure until Fisher built on his small-sample work and transformed him into a major figure in C20 statistics; for his relations with Fisher, see Fisher Guide. Another admirer was E. S. Pearson. For a sample of Gosset’s humour see the entry kurtosis. See Life & Work
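The statistic at issue in ‘Student’s’ 1908 normal-mean paper is what is now written t = (x̄ − μ₀)/(s/√n); his contribution was working out its exact distribution for small n, where the theory of errors had relied on large-sample approximations. A minimal Python sketch with invented data:

```python
from math import sqrt

def t_statistic(sample, mu0):
    """One-sample t statistic, the quantity whose small-sample
    distribution 'Student' worked out in 1908."""
    n = len(sample)
    mean = sum(sample) / n
    s2 = sum((x - mean) ** 2 for x in sample) / (n - 1)  # sample variance
    return (mean - mu0) / sqrt(s2 / n)

# Hypothetical small sample of five measurements, testing mu0 = 4.8.
t = t_statistic([5.0, 5.2, 4.8, 5.1, 4.9], mu0=4.8)
print(round(t, 3))  # 2.828, to be referred to Student's t with n - 1 = 4 d.f.
```

(Gosset’s own 1908 formulation used a slightly different scaling, z = t/√n; the t form is Fisher’s later standardisation.)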



1920-1930 Many of the people who would dominate probability and statistics over the following decades first made an impact in this period. Of them, the individual who had the greatest hold over his subject was Fisher in statistics. The ascendancy of Fisher was also the ascendancy of the English language. While German was the international scientific language of the time—and the language of probability—Fisher and his followers rarely referred to literature in German, believing that that literature ended with Gauss, although Helmert was later added to the canon. Thus standard works, like those by Czuber, did not cross the Channel.

·    In probability advances included refinements of the central limit theorem (here Lindeberg made an important contribution) and the strong law of large numbers (which went back to Borel in 1909 and Cantelli in 1917) and new results including the law of the iterated logarithm. There were contributions from most countries of Continental Europe, e.g. Mazurkiewicz from Poland, Onicescu from Romania and Hostinsky from Czechoslovakia. In Britain, however, there was much less interest in probability: see England and Continental Probability in the Inter-War Years and the remarkable case of Alan Turing, who repeated Lindeberg’s work, being unaware of it. Nevertheless the examiners, Fisher and Abram Besicovitch (who had studied with Markov), were impressed by his work and Turing was elected a fellow of his college. 

·    The foundations of probability received much attention and certain positions found classic expression: the logical interpretation of probability (degree of reasonable belief) was propounded by the Cambridge philosophers, W. E. Johnson, J. M. Keynes and C. D. Broad, and presented to a scientific audience by Jeffreys;  Keynes was influenced by the German physiologist/philosopher J. von Kries. The frequentist view was developed by von Mises.

·    The Modern (Evolutionary) Synthesis of Mendelian genetics and Darwinian natural selection involved the solution of problems involving stochastic processes, e.g. branching processes. However, the work did not have as much influence on the development of probability theory as similar work in physics; see Fokker-Planck equation. The principal contributors to the modern synthesis, Fisher, J. B. S. Haldane and Sewall Wright (path analysis), all contributed to statistics, but Fisher was in a class apart.

·    In statistics R. A. Fisher generated many new ideas on estimation and hypothesis testing and his work on the design of experiments moved that topic from the fringes of statistics to the centre. His Statistical Methods for Research Workers (1925) was the most influential statistics book of the century.

·    W. A. Shewhart ASQ Wikipedia pioneered quality control, which became a major industrial application of statistics.


See Hald (1998, ch. 27 and passim) and von Plato (ch. 4-6)





R. A. Fisher (1890-1962). Statistician and geneticist. MacTutor References. SC, LP, ESM. Fisher was the most influential statistician of the C20. Like Pearson, Fisher studied mathematics at Cambridge University. He first made an impact when he derived the exact distribution of the correlation coefficient (see Fisher’s z-transformation). Although the correlation coefficient was a cornerstone of Pearsonian biometry, Fisher worked to synthesise biometry and Mendelian genetics; for Fisher’s many disagreements with Pearson, see Pearson in A Guide to R. A. Fisher. In 1919 Fisher joined Rothamsted Experimental Station and made it the world centre for statistical research. His subsequent more prestigious appointments in genetics at UCL and Cambridge proved less satisfying. The estimation theory Fisher developed from 1920 emphasised maximum likelihood and was founded on likelihood and information. He rejected Bayesian methods as based on the unacceptable principle of indifference; see here. In the 1930s Fisher developed a conditional inference approach to estimation based on the concept of ancillarity. His most widely read work Statistical Methods for Research Workers (1925 + later editions) was largely concerned with tests of significance: see Student's t distribution, chi square, z and z-distribution and p-value. The book also publicised the analysis of variance and redefined regression. The Design of Experiments (1935 + later editions) put that subject at the heart of statistics (see randomization, replication and blocking). The fiducial argument, which Fisher produced in 1930, generated much controversy and did not survive the death of its creator. Fisher created many terms in everyday use, e.g. statistic and sampling distribution and so there are many references to his work on the Words pages. See Symbols in Statistics for his contributions to notation. Fisher influenced statisticians mainly through his writing—see the experience of Bose and Youden. 
Among those who worked with him at Rothamsted were Irwin, Wishart and Yates (colleagues) and Hotelling (‘voluntary worker’): see Speed’s Hotelling lecture and MGP. Fisher made several visits to the US where Hotelling and Snedecor were important contacts. In London and Cambridge Fisher was not in a Statistics department and Rao was his only PhD student in Statistics. For more information see A Guide to R. A. Fisher. See Hald (1998, ch. 28 Fisher’s Theory of Estimation 1912-1935 and his Immediate Precursors).




Paul Lévy (1887-1971) Mathematician. MacTutor References. MGP. LP.  In the late C19 French mathematicians continued to work on probability but Bertrand and Poincaré made no advances comparable to those made by Laplace and his contemporaries, nor did they conserve the rich tradition. Major mathematicians, including Borel (see normal number) and Fréchet, wrote on probability in the early C20 but Lévy became the leading French probabilist. Lévy was originally interested in analysis (see functional analysis) and he only started publishing on probability around 1920. In 1920 Lévy was appointed Professor of Analysis at the École Polytechnique, a position he held until 1959. He revived characteristic function methods (the name is his) and used them in his work on the stable laws and the central limit theorem. This work was summarised in his influential Calcul des Probabilités (1925). In the 1930s he focussed on the study of stochastic processes, in particular martingales and Brownian motion, publishing Théorie de l'addition des variables aléatoires (1937) and Processus stochastiques et mouvement brownien (1948). See also Annales (including nice photos) and Rama Cont.  See also von Plato passim. Paul Lévy, Maurice Fréchet : 50 ans de correspondance mathématique (eds. Barbut, Locker & Mazliak) includes a review of Lévy’s work in probability. Lévy’s student Michel Loève moved to Berkeley (see Neyman) and had several distinguished students of his own; see MGP. Another important figure who attended Lévy’s lectures was Benoit Mandelbrot. See Life & Work for works by Bertrand and Poincaré. Mazliak’s Borel, Fréchet, Darmois et la statistique dans les années 20 describes French work in statistical theory.





Richard von Mises (1883-1953) Applied mathematician. MacTutor. References.  MGP. SC, LP.  Mises was educated at the Technische Hochschule in Vienna. From 1919 he was director of the Institute of Applied Mathematics at the University of Berlin. He used probability in his work in physics, e.g. von Mises distribution, and wrote on mainstream probability topics, e.g. central limit theorem, but he is most famous for his work on the foundations of probability. In 1919 he published his “Grundlagen der Wahrscheinlichkeitsrechnung,” (p. 52) which expounded his frequentist interpretation of probability, based on the notion of a collective. The paper contained other innovations, including the “label space” (sample space) and the distribution function. At the time the Mathematische Zeitschrift, which published these early papers, was the most important German outlet for work in probability. Von Mises published two books on probability, the widely read Probability, Statistics and Truth (1928) and a comprehensive textbook (1931). His position on statistical inference was—surprisingly—Bayesian. In 1933 he left Germany for Turkey and, in 1938, moved again to the United States, where he became Professor of Aerodynamics and Applied Mathematics at Harvard. Mises influenced many writers on probability in the 20s and 30s, including Kolmogorov. Among those who worked to make the collective rigorous in the 1930s were Wald and the logician Alonzo Church.  Hilda Geiringer wrote a history of probability from a Misean standpoint; see Probability: Objective Theory. See also von Plato (ch. 6) Von Mises’ frequentist probabilities.




Harold Jeffreys (1891-1989) Applied mathematician and physicist. MacTutor References. SC, LP.  Jeffreys has a good claim to be considered the first Bayesian statistician in that he used only Bayesian methods. Jeffreys arrived to study mathematics at Cambridge University a year after Fisher and he spent his life there working on astronomy and geophysics. Unlike von Mises, Jeffreys was not primarily interested in probability as a means of modelling physical processes but in probability in relation to scientific inference. With Dorothy Wrinch, he produced a series of papers between 1919 and 1923. (See the entries Bayes and posterior probability.) They were influenced by the approach to probability taken by the Cambridge philosophers W. E. Johnson, J. M. Keynes and C. D. Broad; all would now be described as Bayesians. Jeffreys had used least squares (learnt from the astronomer Eddington) in his empirical work from the beginning, but around 1930 he started to devise new methods and to reconstruct the old in accordance with his theory of probability. He did extensive empirical work, the best known being in collaboration with K E Bullen on earthquake travel times. Jeffreys also studied Fisher’s statistical work and adopted some of his concepts and terminology, e.g. likelihood. Jeffreys’s big book Theory of Probability (1939) combined a philosophy of probability with a reworking of the “modern statistics” of Fisher and Pearson—all founded on the principles of inverse probability. In 1946 Jeffreys completed his system by providing a rule for choosing priors—see Jeffreys prior. Statisticians (including those who attended his lectures as students!) showed little interest in Jeffreys’s work until the Bayesian revival of the 1960s. The physicist E. T. Jaynes (1922-98) was strongly influenced by him. See Symbols in Probability for Jeffreys’s contribution to notation. See Life & Work.  For more information see Harold Jeffreys as Statistician.
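The flavour of a Jeffreys prior can be shown with the standard textbook case (my example, not taken from this page): for a binomial proportion, Jeffreys’s rule gives the Beta(1/2, 1/2) prior, so the posterior mean has a simple closed form.

```python
def jeffreys_posterior_mean(successes, trials):
    """Posterior mean of a binomial proportion under the Jeffreys
    Beta(1/2, 1/2) prior: the posterior is Beta(s + 1/2, n - s + 1/2),
    whose mean is (s + 1/2) / (n + 1)."""
    return (successes + 0.5) / (trials + 1.0)

estimate = jeffreys_posterior_mean(successes=7, trials=10)
```

With no data at all the formula returns the prior mean of 1/2, and as the sample grows it converges to the observed proportion.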



Norbert Wiener (1894-1964) Mathematician. MacTutor References.  MGP. SC, LP.  Wiener’s working life was spent at MIT. He was well-travelled, having studied at Harvard, Cambridge University and Göttingen. He studied mathematical logic at Cambridge with Russell but he was to become closer to the Cambridge analysts, especially Hardy and Paley with whom he collaborated. Among contemporary probabilists his strongest links were with Lévy. Wiener’s earliest work on probability treated Brownian motion (see also Wiener process), where he used the new Daniell integral; see here for a detailed account of his relations with Daniell. In 1930 Wiener presented a generalized harmonic analysis, which provided a mathematical model of the spectrum and developed the periodogram analysis introduced by the physicist Arthur Schuster at the end of the C19. (above) Much of Wiener’s work ran parallel to that of the Russian probabilists Khinchin and Kolmogorov but the relationship between their work and his did not emerge until later. Wiener worked with engineers, in particular with Y. W. Lee; see The Lee-Wiener Legacy. In the Second World War Wiener developed a theory of prediction to be used in fire control systems. His wartime report was published as Extrapolation, Interpolation and Smoothing of Stationary Time Series (1949) (see filter and autocorrelation). Wiener devised the subject of cybernetics as an umbrella to cover his various interests. P. R. Masani’s biography Norbert Wiener (1990) is reviewed in Mathematical Reviews. Wiener’s papers are at MIT.




Aleksandr Yakovlevich Khinchin (1894-1959) Mathematician.  MacTutor References MGP Publications. Khinchin was a student at Moscow State University and spent almost all his working life there. Khinchin, like Lévy and Doob, started in analysis. The university had a very strong analysis group and Khinchin’s supervisor was Luzin. There was no tradition of work in probability until, that is, Khinchin and Kolmogorov created one. There do not seem to have been any personal links with the Chebyshev/Markov tradition at St. Petersburg. Khinchin was drawn into probability through an interest in the theory of numbers. The law of the iterated logarithm (1924) originated in number theory, as did the strong law of large numbers. While in the 20s Khinchin worked on sequences of independent random variables, in the 30s he developed the theory of stochastic processes and, in particular, that of stationary processes. In the 1940s he applied his probabilistic techniques to statistical mechanics in the book Mathematical Principles of Statistical Mechanics (1943). In the 50s Shannon's information theory (1948) was generating much more interest in probability and statistics circles than had earlier work on communication.  Khinchin contributed to the absorption of these concepts with his Mathematical Foundations of Information Theory. 



1930-1940 Against a calamitous economic and political background there were important developments in probability, statistical theory and applications. In the Soviet Union mathematicians fared better than economists or geneticists and in the early years they could travel abroad and publish in foreign journals; thus Kolmogorov and Khinchin published in the main German periodical, Mathematische Annalen GDZ. In Germany Jews were barred from academic jobs from 1933.

·    In probability the main developments were Kolmogorov’s axiomatisation of probability and the development of a general theory of stochastic processes by him and Khinchin. This work is usually seen as marking the beginning of modern probability. See von Plato (ch. 7) “Kolmogorov’s measure theoretic probabilities.” Most of the activity was in the Soviet Union and France but the United States began to play a bigger role in the course of the decade.

·    In the foundations of probability the work of Bruno de Finetti and Frank Ramsey (1903-1930) (St. Andrews, N.-E. Sahlin) on subjective probability appeared. Ramsey started by criticising the Cambridge logical school (see Jeffreys), in particular Keynes.  A statistical superstructure came only later. Jeffreys gave a complete treatment of statistics founded on his logical notion of probability but otherwise the prevailing approach was classical.

·    In Britain and the United States statistics was redefined. The Royal Statistical Society (above) broadened its political arithmetic and ‘state-istics’ agenda and welcomed work on agriculture and industry and on mathematical statistics. There were similar changes in the American Statistical Association (above). Biometrika stopped publishing biological research and focussed on theoretical statistics. The Annals of Mathematical Statistics appeared in 1930 and the Institute of Mathematical Statistics, which adopted it as its official journal, was founded in 1935. The Annals became a major journal for both mathematical statistics and probability. The first statistics laboratory in the US was created at Iowa State by Snedecor in 1933. Snedecor was strongly influenced by Fisher. In France Georges Darmois and Daniel Dugué came under Fisher’s influence. Georg Rasch took Fisher’s ideas to Denmark.

·    In statistical inference the main development was the Neyman-Pearson theory of hypothesis testing from 1933 onwards. Multivariate analysis became an identifiable subject, formed out of such contributions as the Wishart distribution (1928), Harold Hotelling’s principal components (1933) and canonical correlation (1936) and Fisher’s discriminant analysis (1936).  

·    Applications of mathematics and statistics to economics came together in the econometric movement. This could look back to the C17 political arithmetic and the C19 work on index numbers and on Pareto's law but econometric modelling, which involved the application of regression methods to economic data, was a C20 development. Among the leaders in the 1930s were Jan Tinbergen and Ragnar Frisch. Econometricians who have followed them as Nobel laureates in economics include Engle, Granger, Haavelmo, Heckman, Klein and McFadden. Equally important were developments in the collection of economic information. In the United States the outstanding economic statistician was Simon Kuznets. See M. S. Morgan A History of Econometric Ideas, Cambridge 1990.





Andrei Nikolaevich Kolmogorov (1903-87) Mathematician. MacTutor References. MGP. LP.   Kolmogorov was one of the most important of C20 mathematicians and although he wrote a lot on probability it was only a small part of his total output. Like Khinchin, he was a student of Luzin at Moscow State University. In 1924 Kolmogorov started working with Khinchin and they produced results on the law of the iterated logarithm and the strong law of large numbers. Kolmogorov’s most famous contribution to probability was the Grundbegriffe der Wahrscheinlichkeitsrechnung (1933), (English translation) which presented an axiomatic foundation.  This made possible a rigorous treatment of stochastic processes. His 1931 paper “Analytical methods in probability theory” laid the foundations for the theory of Markov processes; this paper contains the Chapman-Kolmogorov equations.  In 1941 Kolmogorov developed a theory of prediction for random processes, parallel to that developed by Wiener. In the 60s Kolmogorov returned to von Mises’s theory of probability and developed it in the direction of a theory of algorithmic complexity; this work was continued by the Swedish mathematician P. Martin-Löf. In statistics he contributed the Kolmogorov-Smirnov test. From 1938 Kolmogorov was associated with the Steklov Mathematical Institute. He had many students, among them Gnedenko and Dynkin. Skorokhod was a younger collaborator. See also Symbols in Probability  Life & Work. See von Plato (ch. 7) “Kolmogorov’s measure theoretic probabilities”. See also Vovk & Shafer Kolmogorov’s Contributions to the Foundations of Probability and The Origins and Legacy of Kolmogorov’s Grundbegriffe—the published version of the latter (Statistical Science (2006) Number 1, 70-98) is different again.
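The Kolmogorov-Smirnov statistic mentioned above is easy to compute directly. A minimal sketch (my own code, comparing a sample against a hypothesised continuous CDF):

```python
import random

def ks_statistic(sample, cdf):
    """One-sample Kolmogorov-Smirnov statistic: the largest gap between
    the empirical CDF and the hypothesised CDF. For a step function this
    need only be checked just before and at each order statistic."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        fx = cdf(x)
        d = max(d, abs((i + 1) / n - fx), abs(i / n - fx))
    return d

random.seed(1)
sample = [random.random() for _ in range(1000)]
d = ks_statistic(sample, lambda x: x)  # against the uniform(0,1) CDF
```

For a well-fitting sample of size n the statistic is of order 1/√n, which is what Kolmogorov’s limiting distribution quantifies.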




Jerzy Neyman (1894-1981) Statistician. MacTutor References. NAS ASA  MGP. SC, LP, ESM. Neyman was educated in the tradition of Russian probability theory and had a strong interest in pure mathematics. His probability teacher at Kharkov University was S. N. Bernstein. Like many, Neyman went into statistics to get a job, finding one at the National Institute for Agriculture in Warsaw. He appeared on the British statistical scene in 1925 when he went on a fellowship to Pearson’s laboratory. He began to collaborate with Pearson’s son Egon Pearson and they developed an approach to hypothesis testing which became the standard classical approach. Their first work was on the likelihood ratio test (1928) but from 1933 they presented a general theory of testing, featuring such characteristic concepts as size, power, Type I error, critical region and, of course, the Neyman-Pearson lemma. More of a solo project was estimation, in particular the theory of confidence intervals. In Poland Neyman worked on agricultural experiments and he also contributed to sample survey theory (see stratified sampling and Neyman allocation). At first Neyman had good relations with Fisher but these began to deteriorate in 1935; see Neyman in A Guide to R. A. Fisher. From the late 1930s Neyman emphasised his commitment to the classical approach to statistical inference. Neyman had moved from Poland to Egon Pearson’s department at UCL in 1934 but in 1938 he moved to the University of California, Berkeley. There he built a very strong group which included such notable figures as David Blackwell, J. L. Hodges, Erich Lehmann, Lucien Le Cam (memorial Grace Yang), Henry Scheffé and Elizabeth Scott (memorial). Neyman’s Berkeley is nicely evoked in Lehmann’s Reminiscences of a Statistician Amazon.
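The confidence-interval idea takes only a couple of lines in the textbook case of a normal mean with known standard deviation (my own illustration, not Neyman’s example): the interval, not the parameter, is the random quantity, and it covers the true mean in about 95% of repeated samples.

```python
import math

def mean_confidence_interval(data, sigma, z=1.96):
    """Neyman-style 95% confidence interval for the mean of a sample
    assumed normal with known standard deviation sigma."""
    n = len(data)
    xbar = sum(data) / n
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width

lo, hi = mean_confidence_interval([4.8, 5.1, 5.3, 4.9], sigma=0.2)
```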




Harald Cramér (1893-1985) Mathematician, statistician & actuary. MacTutor References.  MGP. Photos  SC, LP.  Personal recollections Statistical Science 1986. Cramér studied at the University of Stockholm and spent his working life there. His career spanned the applied mathematics of insurance and the pure mathematics of number theory. In Sweden probability, statistics and actuarial science were more closely related than elsewhere; see Cramér’s talk Actuaries and Actuarial Science. Filip Lundberg was a symbol of the link between insurance and probability and the Skandinavisk Aktuarietidskrift was the main statistics journal. From the mid-20s probability became increasingly prominent in Cramér’s research. In 1929 a chair in “Actuarial Mathematics and Mathematical Statistics” was created for him. Cramér’s Random Variables and Probability Distributions (1937) has been called “the first modern book on probability in English”; he was encouraged to write it by G. H. Hardy, the British number theorist and analyst. Cramér’s early work was on the central limit theorem, and treated the expansions associated with Edgeworth (Edgeworth series), Gram and the astronomer Charlier. Cramér was an important synthesiser of subjects and of national traditions. His student Herman Wold brought together the individual processes studied by Yule in the English statistical literature and the theory of stationary stochastic processes studied by Khinchin in the Russian mathematical literature. Cramér’s own Mathematical Methods of Statistics (1945) brought together English statistical theory and Continental probability. Amongst the new results it contained was the Cramér-Rao inequality. Cramér’s main work from 1940 onwards was on stochastic processes, where he extended the theories of Kolmogorov and Khinchin. Cramér made an important contribution to the anglicising of probability language and so his name often appears on the Words pages. See also Symbols in Probability.
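For reference, the Cramér-Rao inequality in its standard modern form (the notation is the usual textbook one, not necessarily Cramér’s): for an unbiased estimator of a parameter of a density satisfying the usual regularity conditions,

```latex
\operatorname{Var}(\hat{\theta}) \;\ge\; \frac{1}{I(\theta)},
\qquad
I(\theta) = \mathbb{E}\!\left[\left(\frac{\partial}{\partial\theta}\,\log f(X;\theta)\right)^{2}\right],
```

where $\hat{\theta}$ is an unbiased estimator based on a sample from the density $f(x;\theta)$ and $I(\theta)$ is the Fisher information.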




Bruno de Finetti (1906-85) Mathematician, actuary & statistician. MacTutor References Wikipedia LP.  De Finetti studied mathematics in Milan. He was very precocious and very prolific, publishing the first of his 300 works while still a student. Although de Finetti became the best known of the Italian probabilists, there was already an Italian presence on the international scene. Castelnuovo’s textbook, Calcolo della probabilità (1919), was comparable to Markov’s, and Cantelli’s (LP) work on the strong law of large numbers made him a pioneer of modern probability. The journal which Cantelli founded, Giornale dell'Istituto Italiano degli Attuari, published important probability contributions in the 1930s; for Cantelli see Regazzini. Corrado Gini  (LP) (SC) founded the Italian Central Statistical Institute, and also the journal Metron.  De Finetti worked for a time at the Statistical Institute and, as well as being associated with the universities of Trieste and Rome, he did actuarial work. While de Finetti made important contributions to probability theory, he is best known for his subjective theory of probability, based on the Dutch book argument for coherence. In the English-speaking world de Finetti’s work only became known in the 1950s when Savage drew attention to it. De Finetti’s work has attracted a lot of attention from philosophers. See for example Richard Jeffrey and his book Subjective Probability. See von Plato ch. 8: “De Finetti’s subjective probabilities.” See the website Bruno de Finetti.
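The Dutch book argument can be made concrete with a line of arithmetic (a toy sketch of my own): if your betting prices for an event and for its complement sum to more than 1, an opponent can sell you both bets and collect a sure profit whatever happens. Coherence is precisely the requirement that no such book can be made against you.

```python
def dutch_book_profit(price_event, price_complement, stake=1.0):
    """Guaranteed profit from selling a bettor both a bet paying `stake`
    if E occurs (at price_event * stake) and a bet paying `stake` if E
    does not occur (at price_complement * stake). Exactly one bet pays
    out, so the seller nets (sum of prices - 1) * stake either way."""
    return (price_event + price_complement - 1.0) * stake

profit = dutch_book_profit(0.7, 0.5)  # incoherent prices: they sum to 1.2
```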




William Feller (1906-70) Mathematician. MacTutor References. MGP. About half of Feller’s papers were in probability, the rest were in calculus, functional analysis and geometry. After a first degree at the University of Zagreb Feller went to the University of Göttingen, where the world’s leading mathematics department was presided over by David Hilbert, Feller’s ideal mathematician. Feller’s supervisor was Richard Courant. Feller was awarded his Ph.D. in 1926, aged 20. In 1933 he left Germany first for Denmark and then for Sweden, joining Cramér at the University of Stockholm. He moved to the USA in 1939, first to Brown and then to Cornell and Princeton. Feller’s first contribution to probability was a 1935 paper on the central limit theorem; he obtained similar results to Lévy. Feller was the main architect of renewal theory. In the 50s he worked on a theory of diffusion, which brought together functional analysis, differential equations and probability. Feller had numerous PhD students who became influential probabilists; one unofficial student was Frank Spitzer. The publication in 1950 of volume 1 of Feller’s Introduction to Probability Theory and its Applications was a major event. Gian-Carlo Rota wrote, “Together with Weber’s Algebra and Artin’s Geometric Algebra this is the finest text book in mathematics in this century.” Besides giving new results and new forms to old results, the book drew attention to a vast body of applied probability work that had not been noticed in the theoretical literature. Feller’s frequent appearances on the Symbols in Probability and Words pages (e.g. sample space and experiment) testify to the influence of the book. See J. L. Doob “William Feller and Twentieth Century Probability” in AMS History of Mathematics, Volume 3 and von Plato (ch. 7) “Kolmogorov’s measure theoretic probabilities.” There is an excellent biography William Feller (1906-1970) by Darko Zubrinic.




J. L. Doob (1910-2004) Probabilist. MacTutor References. MGP. SS Interview. Because Wiener was not part of the probability community, Doob was the first “modern probabilist” from the United States or even from the English-speaking world. When Doob came on the scene the only American probability textbook was the 1925 book by J. L. Coolidge, which could almost have been written in 1885. Harvard, where Doob studied, had a strong group of mathematicians but nobody worked on probability. Statistics was much livelier and Doob came into contact with probability and its European literature when he got a job with the statistician Harold Hotelling. Some of Doob’s early work (1934-6) was devoted to making statistical theory more rigorous. His first idea for a topic for his PhD student Paul Halmos was that he should make some of Fisher’s ideas rigorous. Halmos switched to a more realistic topic. Doob’s career was almost entirely spent at the University of Illinois. Although Doob did not begin in probability, almost all of his work was in this field. His main work was on stochastic processes; he was responsible for making martingales so prominent. His book Stochastic Processes (1954) was an important work of synthesis. His Classical Potential Theory and its Probabilistic Counterpart (1984) brought together Doob’s probabilistic and non-probabilistic interests. Doob made an important contribution to anglicising the language of probability theory and he appears often in this capacity on the Words pages—see e.g. probability measure and Markov process.  Apart from Halmos, his best-known student was David Blackwell who became part of Neyman’s group at Berkeley. For a retrospective on Stochastic Processes see N. H. Bingham (2005) Doob: a half-century on, Journal of Applied Probability, 42, 257–266. D. Burkholder and P. Protter have some personal reminiscences here. An issue of the JEHPS is devoted to “the splendours and miseries of martingales.”



1940-1950 Among the millions who died in the Second World War were mathematicians and statisticians. Doeblin is only the best known of those killed; one of Neyman’s books is dedicated to 10 lost colleagues and friends. Yet this war, unlike the First World War, promoted the study of statistics and probability. At the end of the war there were many more people working in statistics, there were new applications and the importance of the subject to society was more widely recognised.

·    The Nazi persecutions and the Second World War drove many statisticians and mathematicians to the USA. There was already a pattern of migrants seeking better opportunities. Many important figures in post-war US probability and statistics, including Feller, M. Kac (MGP), Wald, G. E. P. Box (MGP), W. G. Cochran (ASA) (MGP), W. Hoeffding (MGP), H. O. Hartley (MGP), F. J. Anscombe (Obit. p. 17) (MGP), Z. W. Birnbaum (MGP) and O. Kempthorne (MGP), were from Europe. From around 1950 Indian statisticians, following the example of R. C. Bose, began migrating to the US. See Rao.

·    The war brought many people into statistics and probability. Savage and Tukey are examples from the US while in Britain the recruits included Barnard (MGP), Box (MacTutor) (MGP),  Cox (MGP), D. G. Kendall (MGP), Lindley (MGP). The recruits were often better trained in abstract mathematics than earlier statisticians. This contributed to closing the gap between the English statistical and the Continental probability traditions.

·    The war generated research problems out of which came Wiener’s work on prediction and Wald’s on sequential analysis and the new subject of operations research. Governments’ need for information led to great expansion in the production of official statistics. In Britain the leading figure in economic statistics was Richard Stone ET interview.

·    Between 1943 and 1946 three advanced treatises on statistics appeared, by Cramér, M. G. Kendall and Wilks (MGP). These works did much to consolidate the subject and thereby professionalise it.

·    Nonparametric methods began to be systematically studied, using tools from the theory of statistical inference; E. J. G. Pitman was an important pioneer. The tests often came originally from non-statisticians, like Spearman (rank correlation) or Wilcoxon (Wilcoxon tests). The existing repertoire of sign test, permutation test and Kolmogorov-Smirnov test was soon expanded.

·    Modern time series analysis came from the union of the theory of stochastic processes (see Khinchin and Cramér), the theory of prediction (Wiener and Kolmogorov) and the theory of statistical inference (Fisher and Neyman) with harmonic analysis and correlation among the grandparents. (above) One of the main pioneers of the 40s was M. S. Bartlett. In the 50s Tukey was a leading contributor, in the 60s Kalman (Kalman filter) and systems engineers made important contributions and in the 70s the methods of G. E. P. Box Interview and G. M. Jenkins (Box-Jenkins) were adopted in economics and business.
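The sample autocorrelation function, a basic building block of the Box-Jenkins approach mentioned above, can be computed directly. A sketch (my own code, using the usual convention of dividing by the full-sample sum of squares):

```python
def autocorrelation(x, lag):
    """Sample autocorrelation at the given lag: the lagged sample
    covariance divided by the full-sample sum of squared deviations."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((xi - mean) ** 2 for xi in x)
    num = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return num / denom

# A steadily rising series is strongly autocorrelated at short lags:
r1 = autocorrelation([1, 2, 3, 4, 5, 6, 7, 8], lag=1)
```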





Abraham Wald (1902-1950) Statistician. MacTutor References. MGP. LP.  Wald studied at the University of Vienna with Karl Menger, writing a thesis and several articles on geometry. To make a living Wald worked in economics, publishing on both mathematical economics and on economic statistics. His first work in probability was on the collective of von Mises. When Wald moved to the USA in 1938 he went to Columbia University where Harold Hotelling was the senior statistician. Wald started working on statistics in 1939 and soon developed his first ideas on decision theory, which he saw as an extension of Neyman-Pearson testing theory. During the war Wald worked on statistical problems for the military—one result was the development of sequential analysis. The Statistical Research Group at Columbia University was one of the most important wartime groups. After the war Wald developed his decision theory ideas further in the book Statistical Decision Functions (1950). In his short career Wald made contributions to most branches of statistical theory, among them the eponymous Wald test. He played an important part in the statistical developments in econometrics which led to a Nobel prize for Trygve Haavelmo. Wald’s co-authors include Jacob Wolfowitz and H. B. Mann. Among his PhD students were Herman Chernoff, Charles Stein and Milton Sobel. Jack Kiefer had started his PhD when Wald was killed in an air crash.
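The sequential idea can be sketched in a few lines. A toy version (the function names and the coin-testing example are mine; the boundaries log((1-β)/α) and log(β/(1-α)) are Wald’s standard approximations, not a full treatment):

```python
import math

def sprt(observations, log_lr, alpha=0.05, beta=0.05):
    """Wald's sequential probability ratio test (sketch): accumulate the
    log-likelihood ratio one observation at a time and stop as soon as
    the running total crosses either approximate boundary."""
    upper = math.log((1 - beta) / alpha)   # cross above: accept H1
    lower = math.log(beta / (1 - alpha))   # cross below: accept H0
    total = 0.0
    for n, x in enumerate(observations, start=1):
        total += log_lr(x)
        if total >= upper:
            return "accept H1", n
        if total <= lower:
            return "accept H0", n
    return "continue sampling", len(observations)

# Toy example: is a coin fair (H0: p = 0.5) or biased (H1: p = 0.7)?
llr = lambda x: math.log(0.7 / 0.5) if x == 1 else math.log(0.3 / 0.5)
decision, n_used = sprt([1] * 30, llr)
```

The attraction Wald demonstrated is that, on average, the sequential test reaches a decision with far fewer observations than a fixed-sample test of the same error rates.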




C. R. Rao  (b. 1920) Statistician. MacTutor References ASA  MGP. Recent photo. Rao is the most distinguished member of the Indian statistical school founded by P. C. Mahalanobis and centred on the Indian Statistical Institute and the journal Sankhya. Rao’s first statistics teacher at the University of Calcutta was R. C. Bose. In 1941 Rao went to the ISI on a one-year training programme, beginning an association that would last for over 30 years. (Other ISI notables were S. N. Roy in the generation before Rao and D. Basu, one of Rao’s students.) Mahalanobis was a friend of Fisher and much of the early research at ISI was closely related to Fisher’s work. Rao was sent to Cambridge to work as a PhD student with Fisher, although his main task seems to have been to look after Fisher’s laboratory animals! In a remarkable paper written before he went to Cambridge, “Information and the accuracy attainable in the estimation of statistical parameters,” Bull. Calcutta Math. Soc. (1945) 37, 81-91, Rao published the results now known as the Cramér-Rao inequality and the Rao-Blackwell theorem. A very influential contribution from his Cambridge period was the score test (or Lagrange multiplier test), which he proposed in 1948. Rao was influenced by Fisher but he was perhaps influenced as much by others, including Neyman. Rao has been a prolific contributor to many branches of statistics as well as to the branches of mathematics associated with statistics. He has written 14 books and around 350 papers. Rao has been a very international statistician. He worked with the Soviet mathematicians A. M. Kagan and Yu. V. Linnik (LP) and since 1979 he has worked in the United States, first at the University of Pittsburgh and then at Pennsylvania State University. He was elected to the UK Royal Society in 1967; in 2001 he received the Indian Padma Vibhushan award (biography); he received the US National Medal of Science in 2002.  See ET Interview: C. R. Rao,  ISI interview and profile. 
For a general account of Statistics in India, see B. L. S. Prakasha Rao’s About Statistics as a Discipline in India.



1950-1980  Expansion continued—more fields, more people, more departments, more books, more journals! Computers began to have an impact—see below for more details.

·    Existing departments were expanded: e.g. in 1949 a second chair of statistics was created at the LSE, filled by M. G. Kendall. Bowley (above) only ever had a staff of 2, E. C. Rhodes and R. G. D. Allen.  New institutions were created, e.g. the Statistical Laboratory at Cambridge in 1947 and the Statistics Department at Harvard in 1958.

·    The scope of probability theory increased with the emergence of new sub-fields such as queueing theory and renewal theory. Feller’s Introduction to Probability Theory made a very strong impact in the English-speaking world; it promoted the study of the subject and made advanced topics, like Markov chains, accessible.

·    In 1950 the logician/philosopher Rudolf Carnap published a major work, Logical Foundations of Probability, which advanced a dual interpretation of probability: as degree of confirmation, which looked back to Cambridge (see Jeffreys), and as relative frequency, which looked back to von Mises. Probability was an important topic for other philosophers of science, including Hans Reichenbach and Karl Popper. More recently philosophers have been attracted to the monism of de Finetti’s subjectivism. Alan Hajek’s Interpretations of Probability in the Stanford Encyclopedia of Philosophy reviews the modern scene.

·    In statistics there was a Bayesian revival. In Britain I. J. GoodProbability and the Weighing of Evidence (1950)—was influenced (positively and negatively) by the Cambridge logicians (see Jeffreys). The American version, Bayesian decision theory, reflected more the influence of Wald’s classical decision theory. The most important early contributions were Savage’s Foundations of Statistics (1954) and Howard Raiffa and Robert Schlaifer’s Applied Statistical Decision Theory (1961).

·    W. Edwards Deming (ASQ) Wikipedia (LP) continued Shewhart’s work on quality control and was very effective in getting industry to adopt these methods.

·    There was a great expansion in medical statistics and epidemiology. Austin Bradford Hill was an important contributor to both fields: he pioneered randomised clinical trials and, in work with Richard Doll, demonstrated the connection between cigarette smoking and lung cancer.

·    Laplace and Quetelet saw the work of the census as a possible application of probability but the use of statistical theory by official data gatherers only became institutionalised through the activities of Morris Hansen (interview) at the US Census Bureau.

·    Since the 50s finance has been an important area for applied decision theory: the 1990 Nobel Prize in economics was awarded to Markowitz for work that was influenced by Savage although the idea of expected utility goes back to Daniel Bernoulli. Since the 70s finance has been an important area for applied stochastic processes. Ito had developed his stochastic calculus in the 40s but it was applied in an unexpected way in the Black-Scholes model for pricing derivatives. Scholes and Merton received the 1997 Nobel Prize in economics for their contribution (see Black-Scholes formula). The intellectual ancestor of stochastic finance was Bachelier (above).

·    In 1973 the Annals of Mathematical Statistics (see above) split into the Annals of Probability and the Annals of Statistics. This represented increasing specialisation—there weren’t many new Cramérs—as well as the need to expand journal pages.




Description: Jimmie Savage

Leonard Jimmie Savage (1917-71) Statistician. MacTutor References. MGP LP. After training as a pure mathematician and obtaining a PhD from the University of Michigan, Savage worked briefly with von Neumann in Princeton. Savage became a statistician in the war, joining the Statistical Research Group at Columbia University, which included Wald. Savage’s earliest publications were in the style of Wald’s classical decision theory. The change came with the Foundations of Statistics (1954). This work, written while Savage was at the University of Chicago, continued the decision theme but provided the basis for a Bayesian approach to probability in the spirit of Ramsey and de Finetti. Savage also drew on the utility theory von Neumann and Morgenstern had developed for use in the theory of games; the maxim of maximising expected utility went back to Daniel Bernoulli. However, the book’s main take on statistics was still classical, and Savage criticised the foundational work of Jeffreys, who had provided the only set of Bayesian methods that reflected C20 statistics. Schlaifer was quicker to develop practical Bayesian methods. However, Savage became a strong advocate of Bayesian methods in statistical research and a champion of such non-classical principles as the likelihood principle. This principle was treated axiomatically by Allan Birnbaum but had been discussed earlier by Fisher and Barnard. Jimmie’s younger brother, I. Richard Savage (1926-2004), was also a distinguished statistician. (Obit. p. 8) MGP




John W. Tukey (1915-2000) Statistician. MacTutor References. ASA and Bell Labs. MGP. Tukey originally trained as a topologist (see finite character and Zorn's lemma) but became a statistician in the Second World War. He remained in Princeton but worked on statistics with Wilks and Cochran. In parallel with his university career Tukey had a long association with Bell Labs; another notable figure associated with this institution was Claude Shannon. Tukey did work on the foundations of statistics, notably on Fisher’s fiducial argument, but his main contributions were in applied areas—experiments, time series analysis (see fast Fourier transform, window and pre-whitening), robust methods (see trimming and Winsorizing) and exploratory data analysis (see stem-and-leaf displays). His enthusiasm for data analysis was in part an expression of frustration with certain tendencies in mathematical statistics. Tukey is a major presence on the Earliest Uses pages because he loved inventing names. His new names for old terms (e.g. hinge for quartile) did not always catch on but his names for his own creations, e.g. jackknife, always did, and he was ready to apply his facility outside statistics; see the entries on bit, linear programming and software. David Brillinger wrote a long tribute for the Annals of Statistics and the August 2003 issue of Statistical Science is dedicated to Tukey’s life and work. (Euclid). The American Philosophical Society has care of the Tukey Papers.



1980+  Instead of describing people from the very recent past, I describe the effect the computer has had on statistics since its advent around 1950, and the changes in the writing of the history of probability and statistics in recent decades.


The effects of the computer. The changes following the introduction of the computer have been much more radical than those following the increased use of mechanical calculating machines at the end of the C19. Such machines provided the material basis for Pearson’s and Fisher’s research and for the construction of their statistical tables in the period 1900-50. The machines were not in general use and Fisher assumed that most of the users of the tables and the “research workers” who read his book would use logarithm tables or a slide rule. For general background see A Brief History of Computing.


With the availability of computers old activities took less time and new activities became possible.

·    Statistical tables and tables of random numbers first became much easier to produce and then they disappeared as their function was subsumed into statistical packages.

·    Much bigger data sets could be assembled and analysed.

·    Exhaustive data-mining became possible.

·    Much more complex models and methods could be used. Methods have been designed with computer implementation in mind—a good example is the family of Generalized linear models linked to the program GLIM; see John Nelder FRS.

·    In the early C20, when Student (1908) wrote about the normal mean and Yule (1926) about nonsense correlations, they used sampling experiments, and in the 1920s it became worthwhile to produce tables of random numbers. With the introduction of computer-based methods for generating pseudo-random numbers much more ambitious Monte-Carlo investigations (introduced by von Neumann and Ulam) became possible. The Monte-Carlo experiment became a standard way of investigating the finite-sample behaviour of statistical procedures.
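A Monte-Carlo experiment of the kind described—checking the finite-sample behaviour of a statistic against theory—can be sketched in a few lines of Python. The sample size, replication count and seed below are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(42)  # fixed seed so the pseudo-random stream is reproducible

# Estimate by simulation the sampling variance of the mean of n = 10
# draws from a standard normal, and compare it with the theoretical
# value Var(sample mean) = 1/n = 0.1.
n, reps = 10, 20000
means = []
for _ in range(reps):
    sample = [random.gauss(0.0, 1.0) for _ in range(n)]
    means.append(statistics.mean(sample))

est_var = statistics.variance(means)  # should be close to 0.1
```

Student's 1908 sampling experiments did essentially this by hand, with shuffled cards standing in for the pseudo-random number generator.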

·    Since around 1980 Monte Carlo methods have been used directly in data analysis. In classical statistical inference the bootstrap has been very prominent; Statistical Science’s silver anniversary issue (Euclid) includes an interview with Efron, the creator. In Bayesian analysis Markov Chain Monte-Carlo methods have been used extensively; previously conjugate priors and noninformative priors had been used because of computational limitations.
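Efron's bootstrap idea—resampling the observed data with replacement to assess the variability of a statistic—can be sketched as follows; the exponential sample and the choice of the median are purely illustrative.

```python
import random
import statistics

random.seed(0)

# A made-up observed sample of 50 exponential variates.
data = [random.expovariate(1.0) for _ in range(50)]

# Draw B bootstrap resamples (same size, with replacement) and
# recompute the median on each one.
B = 2000
boot_medians = []
for _ in range(B):
    resample = [random.choice(data) for _ in data]
    boot_medians.append(statistics.median(resample))

# The spread of the bootstrap medians estimates the standard error
# of the sample median.
se_median = statistics.stdev(boot_medians)
```

The appeal of the method is visible even in this toy version: no formula for the standard error of the median is needed, only computing power.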




Writing history. In recent decades there has been a flood of works—books and articles—on the history of probability and statistics from statisticians, philosophers and historians. I will mention a few titles in each category to indicate the range of activity.

·    50 years ago the standard general works were Todhunter for the history of probability, Walker for statistics with an emphasis on psychology and education and Westergaard for statistics with an emphasis on economic and vital statistics.

Helen M. Walker (1929) Studies in the History of Statistical Method, Baltimore: Williams & Wilkins.

Harald Westergaard (1932) Contributions to the History of Statistics, London: King.

·    E S Pearson got history moving in Britain. In 1955 Biometrika published the first of its Studies in the History of Statistics and Probability; as editor, ESP wrote “It is hoped to publish articles by a number of different authors under this general heading.” So far, around 50 articles have appeared. ESP wrote about the people he knew—his father, Fisher and Student—but the other pioneers, F N David and M G Kendall, chose more remote topics. Two collections have appeared, Studies in the History of Statistics and Probability (1970) and Studies in the History of Statistics and Probability, Volume II (1977). See here for a list of the articles. In addition, David wrote a book on the early history of statistics and Pearson edited and published his father Karl’s lectures; both works are very different from Todhunter.

F. N. David (1962) Games, Gods and Gambling: the Origins and History of Probability and Statistical Ideas from the Earliest Times to the Newtonian Era. Griffin, London.

E. S. Pearson (ed) (1978) The History of Statistics in the 17th and 18th Centuries against the Changing Background of Intellectual, Scientific and Religious Thought: Lectures by Karl Pearson given at University College, 1921-1933. London: Griffin.

·    Oscar Sheynin, Anders Hald and Stephen Stigler (see above) have been the leading contributors to the technical literature. Sheynin has published many articles, mainly in the Archive for History of Exact Sciences. There is a list of Hald’s history writings here. Some of Stigler’s articles are reprinted in

Stephen M Stigler (1999) Statistics on the Table: The History of Statistical Concepts and Methods, Cambridge, MA: Harvard University Press.

·    Much neglected work has been rediscovered. Bienaymé’s (LP SC) name is now linked to branching processes and Chebyshev's inequality and his total contribution is recognised. There are cases of individuals famous for other work, e.g. Abbe (LP) or Einstein (Brillinger Time Series), and of individuals who may be known for one topic but who produced a large body of work, e.g. Thiele (LP SC), who had been identified with semi-invariants. Hald’s book is a major addition to the literature of the overlooked for, as well as revealing a largely forgotten Laplace, it describes many continental European developments that were overlooked in the Anglo-centric statistical literature of the C20. (see above)

C. C. Heyde & E. Seneta (1977) I. J. Bienaymé: Statistical Theory Anticipated, New York: Springer.

S. L. Lauritzen (2002) Thiele: Pioneer in Statistics, Oxford: Oxford University Press. (first chapter)

A. Hald (1998) A History of Mathematical Statistics from 1750 to 1930, New York: Wiley.

·    Among the philosophers are Hacking and von Plato; as well as the books referred to above, see

I. Hacking (1990) The Taming of Chance Cambridge: Cambridge University Press.

·    In recent decades the history and sociology of science have flourished. T. S. Kuhn’s Structure of Scientific Revolutions (1962) had a strong influence on both fields. Among the works on probability and statistics by historians and sociologists are

T. M. Porter (1986) The Rise of Statistical Thinking 1820-1900, Princeton: Princeton University Press. (contents)

L. Daston (1988) Classical Probability in the Enlightenment. Princeton: Princeton University Press Amazon

The Probabilistic Revolution, volume 1 edited by L. Krüger, L. J. Daston and M. Heidelberger, volume 2 edited by L. Krüger, G. Gigerenzer and M. S. Morgan Cambridge, Mass.: MIT Press (1987) contents

D. A. MacKenzie (1981) Statistics in Britain 1865-1930: The Social Construction of Scientific Knowledge. Edinburgh: Edinburgh University Press.

There is a review essay of Daston and Krüger et al. by MacKenzie in Isis, 80, (1989), pp. 116-124.

·    Much effort has gone into making important texts available

S. M. Stigler and I. M. Cohen American Contributions to Mathematical Statistics in the Nineteenth Century, contents (includes work by Adrain, De Forest, Newcomb, B. Peirce, C.S. Peirce (Stanford) and others)

H. A. David & A. W. F. Edwards (eds.) (2001) Annotated Readings in the History of Statistics, New York: Springer. Amazon (Its Appendix A lists English translations of works of interest to historians of statistics.)


·    Post-1940 developments have not yet attracted the attention of historians. Some classic modern contributions are reprinted (with commentary) in S. Kotz & N. L. Johnson (Editors) (1993/7) Breakthroughs in Statistics: Volumes I-III, New York: Springer. Amazon. Statistical Science has been publishing interviews for the past 20 years and these are a form of living history—there must be 100 by now; the post-1995 issues are available through Euclid. Econometric Theory also publishes interviews and articles on history; among the statisticians interviewed are T. W. Anderson and J. Durbin.

·    Probability and statistics now appear as topics in textbooks on the history of mathematics and the history of disciplines that use probability and statistics. See for example

V.J. Katz (1993) A History of Mathematics, New York: HarperCollins.

M. Cowles (2001) Statistics in Psychology: An Historical Perspective, London: Erlbaum

·    Articles on the history of probability and statistics appear in several journals including the Archive for History of Exact Sciences, Biometrika, British Journal for the History of Science, Historia Mathematica, International Statistical Review, Isis, Journal of the History of the Behavioral Sciences and Statistical Science.

·    In 2005 a specialist online journal was launched, Electronic Journal for History of Probability and Statistics/Journal Electronique d'Histoire des Probabilités et de la Statistique.



Online bibliographies


The JEHPS lists Publications in the History of Probability and Statistics. The current lists cover 2005-9.


Stigler’s History of Statistics has suggestions for further reading and an extensive bibliography. Hald’s History of Mathematical Statistics from 1750 to 1930 also has a very valuable bibliography.


There are several online bibliographies. Two of them were compiled in the mid-90s but are still useful—a brief one by Joyce, restricted to secondary sources, and an extensive one by Lee.

Oscar Sheynin References

Giovanni Favero Storia della Probabilità e della Statistica

David Joyce History of Probability and Statistics

Peter M Lee The History of Statistics: A Select Bibliography


Online texts


Through the web many of the important original texts are now easily accessible. The following open access sites are very useful.

DML: Digital Mathematics Library retrodigitized Mathematics Journals and Monographs

Life and work of Statisticians

SAO/NASA ADS Astronomy journals

Gallica particularly good for French literature

Internet Archive

Google books