Figures from the History of Probability and Statistics
John Aldrich, University of
Southampton, Southampton, UK. (home)
June 2005. Latest changes October 2012
Notes on the work of the ‘principals’ follow. A further 200+ individuals are mentioned below. Use Search on your browser to find the person you are
interested in. It is also worth searching for the ‘principals’ for they can pop up anywhere in the text.
The entries are arranged chronologically, so the document
can be read as a story, with people placed according to the
date of their first impact. Do not take the placings too seriously and remember
that a career may last more than 50 years! At
each marker there are notes on developments in the following period. There is
more about Britain
and about economics than there should be but I know more about them.
For further on-line information there are links to
Earliest Uses (Words and Symbols) for details
(particularly detailed references) on the topics to which the individuals
contributed. (The Words site is organised by letter of the alphabet; see
the index for a list of entries.)
MacTutor for fuller biographical information on the ‘principals’ (all but three) and
on a very large ‘supporting’ cast. The MacTutor biographies also cover
the work the individuals did outside probability and statistics. The MacTutor
and References links are to these pages. There is an index
to the Statistics articles on the site.
Statisticians in History for biographies of mainly recent, mainly US
Life and Work
of Statisticians (part of the Materials for
the History of Statistics site) for further links, particularly to the writings of the statisticians themselves.
Oscar Sheynin’s Theory of Probability: A
Historical Essay An account of developments to the beginning of
the twentieth century, particularly useful for its coverage of Continental work.
Todhunter’s classic from 1865, A History of
the Mathematical Theory of Probability: from the Time
of Pascal to that of Laplace, for detailed commentaries on the
contributions from 1650-1800. The coverage is extraordinary and the entries are
still interesting—even their humourlessness has a certain charm.
The Mathematics Genealogy Project, abbreviated MGP, which
is useful for tracking modern scholars. The PhD degree is a relatively recent
development and in the UK
a very recent one. See my The Mathematics PhD in the UK.
Wikipedia for additional
biographies. This is an uneven site but it has some useful articles.
The notes contain references to the following histories
and books of lives. See below for more literature.
Ian Hacking The
Emergence of Probability, Cambridge, Cambridge University Press 1975. (contents)
Stephen M Stigler The
History of Statistics: The Measurement of Uncertainty before 1900, Cambridge, MA: Harvard University Press 1986. (contents + bibliography)
Anders Hald A History of
Probability and Statistics and Their Applications before 1750, New York: Wiley 1990. (contents)
Anders Hald A History of
Mathematical Statistics from 1750 to 1930, New York: Wiley 1998. (contents + bibliography)
Jan von Plato Creating
Modern Probability, Cambridge:
Cambridge University Press, 1994. (contents)
Leading Personalities in Statistical Sciences from the Seventeenth
Century to the Present, (ed. N. L. Johnson and S. Kotz) 1997. New York: Wiley.
Contains around 110 biographies and is based on entries in the Encyclopedia of Statistical Sciences (ed. N. L. Johnson and S.
Kotz.) Abbreviated LP.
Statisticians of the
Centuries (ed. C. C. Heyde and E. Seneta) 2001. New York: Springer. Contains 105
biographies. The coverage is restricted to individuals born before 1900. Abbreviated SC.
Encyclopedia of Social
Measurement (ed. K. Kempf-Leonard) 2004. New York: Elsevier (description).
Contains numerous biographies. Abbreviated ESM.
Biographies on the web (see also online biblios and texts below)
Encyclopedia. Articles in English and German.
Portraits of Statisticians on the Materials for
the History of Statistics site.
Sources in the History of Probability and Statistics by Richard J. Pulskamp.
Vignettes by E. Bruce Brooks.
History of Statistics and Probability: 18 short biographies from the University
of Minnesota Morris.
Glimpses of the Prehistory of Econometrics. Montage
by Jan Kiviet.
Probability and Statistics Ideas in the Classroom: Lessons from History.
Comments on the uses of history by D. R. Bellhouse.
History of Statistics in the Classroom.
Thumbnail sketches of Gauss, Laplace and Fisher by H. A. David.
Milestones in the History of Thematic Cartography, Statistical Graphics, and Data
Visualization. Encyclopaedic coverage by M. Friendly & D.J. Denis.
Actuarial History. A very
comprehensive collection of links, created by Henk Wolthuis, not only to actuarial science and
demography, but to statistics as well.
To help place individuals I have
used modern terms for occupation (e.g. physicist or statistician). For the
earlier figures these terms are anachronistic but, I hope, not too misleading.
I have not given nationality as people move and states come and go. MacTutor
has plenty of geographical information.
1650-1700 The origins of probability and statistics are usually found
in this period, in the mathematical treatment of games of chance and in the
systematic study of mortality data. This was the age of the Scientific
Revolution and the biggest names, Galileo
(Hald (1990) ch. I (4-6)) and Newton
(LP), gave some thought to probability without apparently
influencing its development. For an introduction to the Scientific
Revolution, see Westfall’s Scientific Revolution.
There were earlier contributions to probability, e.g. Cardano
(1501-76) gave some ‘probabilities’ associated with dice throwing, but a
critical mass of researchers (and results) was only achieved following
discussions between Pascal and Fermat
and the publication of the first book by Huygens. Hacking (1975) Chapters
1-5 discuss thinking before Pascal. James Franklin’s The Science of Conjecture:
Evidence and Probability Before Pascal (2001) examines this earlier
work in depth. A recent issue of the JEHPS is devoted to Medieval probability.
Statistics in the form of population statistics was created by Graunt. Graunt’s friend William Petty
gave the name Political Arithmetic
to the quantitative study of demography and economics. Gregory King
was an important figure in the next generation. However the economic line
fizzled out. Adam Smith, the most influential C18 British economist,
wrote, “I have no great faith in
political arithmetic...” Wealth of Nations (1776) B.IV,
Ch.5, Of Bounties.
A form of life insurance mathematics was developed from Graunt’s work on the life table by the
Dutch statesman Jan de Witt. Many later ‘probabilists’ wrote on actuarial matters,
including de Moivre, Simpson,
Cramér and de
Finetti. In the C20 the Skandinavisk aktuarietidskrift and the Giornale dell'Istituto Italiano degli
Attuari were important journals for
theoretical statistics and probability. Actuarial questions and
friendship with the actuary G.
J. Lidstone stimulated the Edinburgh mathematicians, E.
T. Whittaker and A.
C. Aitken (MGP),
to contribute to statistics and numerical
analysis. The C17 work is discussed by
Hacking (1975): Chapter
13, Annuities. See also Chris Lewin’s The Creation of
Actuarial Science and the other historical links on the International
page. There are historical articles in the Encyclopedia of Actuarial Science.
Classics are reprinted in History
of Actuarial Science. There is a nice review of the early
literature in the catalogue
of the Equitable Life Archive.
New scientific institutions, rather than the traditional universities, underpinned these developments. In Paris and London
private discussion groups, like that of Mersenne,
were forerunners of the Académie
des Sciences and the Royal
Society of London (archives). The latter’s Philosophical Transactions (Gallica) published many important
contributions to probability and statistics, including papers by Halley,
de Moivre, Bayes,
Pearson, Fisher, Jeffreys
and Neyman. The Berlin
and St. Petersburg academies were formed a bit later. The Royal Society was
a forerunner of the modern scientific society, while the continental academies
were more like research institutes.
Life & Work has links to the writings of many of these people. For the
period generally see Todhunter ch.
I-VI (pp. 1-55) and Hald (1990, ch. 1-12).
Blaise Pascal (1623-1662) Mathematician and philosopher. MacTutor References SC, LP. Pascal
was educated at home by his father, himself a considerable mathematician. The
origins of probability are usually found in the correspondence
between Pascal and Fermat where they treated several problems associated
with games of chance. The letters were not published but their contents were
known in Parisian scientific circles. Pascal’s only probability publication
was the posthumously published Traité du
triangle arithmétique (1654, published in 1665 and so after Huygens’s work); this treated Pascal’s
triangle with probability
applications. Pascal introduced the concept of expectation and discussed the problem of gambler’s
ruin. Pascal’s wager is now often read as a pioneering analysis of
decision-making under uncertainty although it appeared, not in his
mathematical writings, but in the Pensées, his reflections on
religion. The last chapter of the Port-Royal
Logique (pp. 365ff) by Pascal’s friends Arnauld
and Nicole has a brief treatment of the use of probability in decision making,
with an allusion to the wager. See Ben Rogers Pascal's
Life & Times, Life
& Work, A.W.F.
Edwards on the triangle and Todhunter
ch.II (pp. 7-21). See also Hald 1990, chapter 5, The
Foundations of Probability Theory by Pascal and Fermat in 1654 and Hacking
(1975): chapter 7, The Roannez
Circle (1654) and chapter
8, The Great Decision (1658?).
Christiaan Huygens (1629-94) Mathematician and physicist. MacTutor
References SC, LP. As a youth Huygens was expected
to become a diplomat but instead he became a gentleman scientist, making
important contributions to mathematics, physics and astronomy. He was educated at the University of Leiden
and at the College of Orange at Breda.
He spent 14 years in Paris
at the Académie
des Sciences. Huygens wrote the first book on probability, a pamphlet
really, Van Rekeningh in Spelen van Geluck, translated into Latin by
his teacher van
Schooten as De Ratiociniis in Ludo Aleae (1657) and then into
English as The
Value of all Chances in Games of Fortune etc. Huygens drew on the
ideas of Pascal and Fermat, which he had encountered when
he visited Paris.
Much of the book is devoted to calculating the value or, as it would be
called now the expectation, of a
game of chance. The problems contained in the book include the gambler’s
ruin and Huygens treated the hypergeometric distribution.
His book was widely read and the first part of
Jakob Bernoulli’s Ars Conjectandi is a
commentary on it. See Life
& Work and Todhunter
ch.III (pp. 22-5). See also Hald (1990): Chapter 6, Huygens
and De Ratiociniis in Ludo Aleae, 1657 and Hacking (1975), Chapter
11, Expectation. C.J. (Kees) Verduin is constructing a
Huygens site. See Peter Doyle’s Hedging Huygens.
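The two problems named above, the value (expectation) of a game and the gambler’s ruin, can be sketched in modern terms. The code below is an illustration in modern notation, not anything from Huygens; the closed-form ruin probability is the standard textbook formula and the function names are invented.

```python
from fractions import Fraction

def win_probability(a, b, p):
    """Probability that a player with stake `a` ruins an opponent with
    stake `b`, winning one unit per round with probability `p`
    (the classical gambler's-ruin formula)."""
    p = Fraction(p)
    if p == Fraction(1, 2):
        return Fraction(a, a + b)
    r = (1 - p) / p          # ratio of losing odds to winning odds
    return (1 - r**a) / (1 - r**(a + b))

def expectation(outcomes):
    """Huygens-style 'value' of a game: probability-weighted payoff."""
    return sum(Fraction(p) * Fraction(v) for p, v in outcomes)

# In a fair game the chance of ruining the opponent is proportional to one's stake.
print(win_probability(3, 7, Fraction(1, 2)))                     # 3/10
# A lottery paying 3 or 7 with equal chance is 'worth' 5.
print(expectation([(Fraction(1, 2), 3), (Fraction(1, 2), 7)]))   # 5
```

Exact rational arithmetic is used so the answers match the classical hand calculations.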
No authentic portrait of Graunt is known
John Graunt (1620-74) Merchant. Wikipedia. SC, LP, ESM. Graunt is unique among the figures described here in
not having had a university education. He published only one work, Observations
Made upon the Bills of Mortality (1662). However, through
this work and his friendship with William Petty,
he became a fellow of the Royal Society of London and his work became known
to savants like Halley.
The weekly bills of mortality, which had been collected since 1603, were
designed to detect the outbreak of plague. Graunt
put the data into tables and produced a commentary on them: he did basic
calculations, discussed the reliability of the data and compared the
numbers of male and female births and deaths. In the course of Chapter XI on
estimating the population of London Graunt produced a primitive life table; see the JIA
articles by Glass, Renn, Benjamin and Seal. The life table became one of the
main tools of demography and insurance mathematics. Halley produced a life
table using data from Caspar
Neumann (SC) in Silesia. See Life &
Work for writings by Graunt, Petty and Halley. See also Todhunter
ch. V (pp. 37-43), Hald (1990): Chapter 7, John Graunt and the
Observations upon the Bills of Mortality, 1662 and
Hacking (1975): Chapter
12, Political Arithmetic 1662.
The great leap forward
is Hald’s (1990) name for the decade 1708-1718: there were so many important
contributions to such a greatly expanded subject. The roots of probability and
statistics were quite distinct but by the early C18 it was understood that the
subjects were closely related.
Jakob Bernoulli’s Ars Conjectandi, like Arnauld’s
Logique (1682) pp.
365ff, suggested a conception of probability broader than that
associated with games of chance. Bernoulli’s
law of large numbers
provided a theory to link between probability and data. See Hacking
(SC, LP) Essay
d'analyse sur les jeux de hazard (1708) and de Moivre’s Doctrine of
Chances (1718) produced many new results on games of chance, greatly
extending the work of Pascal
Arbuthnot’s (SC, LP) 1710 paper An Argument
for Divine Providence, taken from the constant Regularity observed in the
Births of both Sexes used a significance test (sign test) to establish
that the probability of a male birth is not ½. The calculations were refined by
Niklaus Bernoulli (LP). Apart from being an early application of probability
to social statistics, Arbuthnot’s paper illustrates the close connection
between probability and theology in the literature of the time. The work of John
Craig provides another example.
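Arbuthnot’s argument can be restated as a modern sign test: under the null hypothesis that male and female christenings are equally likely, the chance that male births lead in every one of the 82 recorded years is (1/2)^82. The sketch below is a modern restatement, not Arbuthnot’s own calculation, and the function name is invented.

```python
from fractions import Fraction
from math import comb

def sign_test_p_value(successes, n):
    """One-sided probability of at least `successes` out of `n`
    under the null hypothesis p = 1/2 (the sign test)."""
    total = sum(comb(n, k) for k in range(successes, n + 1))
    return Fraction(total, 2**n)

# Male christenings exceeded female in all 82 years Arbuthnot examined.
p = sign_test_p_value(82, 82)
print(p)           # 1/4835703278458516698824704, i.e. (1/2)**82
print(float(p))    # roughly 2e-25: 'Divine Providence', not chance
```

With exact fractions the p-value is (1/2)^82, the number Arbuthnot computed.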
The problem of the valuation of a risky prospect, dramatised by the St. Petersburg
paradox (formulated by Nicholas
Bernoulli in 1713 and discussed by Gabriel
Cramer), led to Daniel Bernoulli’s (1737)
theory of moral expectation (or expected utility).
See Life &
Work for writings by Montmort, Euler, Lagrange, etc. For the period
see Todhunter ch.
VII-X (pp. 56-212), Hald
(1990, ch. 1-12) and Hacking (1975).
Jakob (James) Bernoulli (1654-1705) Mathematician. MacTutor
References SC, LP, ESM. Eight members of the Bernoulli family have
biographies in MacTutor (family
tree) and several wrote on probability. The most important
contributors were Jakob, Daniel and Niklaus.
Jakob and his younger brother Johann
were the first of the mathematicians. Jakob studied philosophy at the University of Basel but learnt mathematics on his
own. Eventually he became professor of mathematics at Basel. The posthumously published Ars
Conjectandi (title page) (1713) was his only probability publication but it
was extremely influential. The first part is a commentary on Huygens’s De Ratiociniis.
The work was an important contribution to combinatorics: the
term permutation originated here.
(The combinatorial side of probability remained important and in late C19
Britain probability and combinatorial analysis were taught together in
algebra courses.) Bernoulli used the terms a priori and a posteriori
to distinguish two ways of deriving probabilities (see posterior
probability): deduction a priori (without
experience) is possible when there are specially constructed devices, like
dice but otherwise it is possible to make a deduction from many observed
outcomes of similar events. Bernoulli’s theorem, or the
law of large numbers, was the work’s most spectacular contribution but see
also the entries for morally certain and binomial distribution. The eponymous Bernoulli trials, numbers and random variable all refer to this
Bernoulli and his Ars Conjectandi. See
Sheynin ch. 3 and Life
& Work (which also has
links for the contributions of Niklaus and Jean III) and Todhunter
ch.VII (pp. 56-77). See Stigler (1986): Chapter
2, Probabilists and the Measurement of Uncertainty. See also Hald (1990): Chapter 15, James Bernoulli and Ars Conjectandi, 1713;
Chapter 16, Bernoulli’s
Theorem. Sheynin has recently translated Part IV of
the Ars Conjectandi. A new
translation of the whole work has appeared: The Art of Conjecturing, translated by Edith Dudley Sylla. Amazon.
The proceedings of a conference on the Ars Conjectandi are online at the JEHPS:
part 1 and part 2.
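Bernoulli’s theorem says that the relative frequency of successes in independent trials converges (in probability) to the underlying probability. A quick modern simulation illustrates the idea; the seed and function name are arbitrary choices, not anything from Bernoulli.

```python
import random

def relative_frequency(p, n, rng):
    """Fraction of successes observed in n Bernoulli(p) trials."""
    return sum(rng.random() < p for _ in range(n)) / n

rng = random.Random(0)   # fixed seed so the run is reproducible
p = 0.3
for n in (100, 10_000, 1_000_000):
    # The deviation |frequency - p| shrinks as n grows, as the theorem predicts.
    print(n, abs(relative_frequency(p, n, rng) - p))
```

Bernoulli’s own achievement was not the simulation, of course, but a proof with an explicit bound on how large n must be for a given accuracy.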
Abraham de Moivre (1667-1754) Mathematician MacTutor References SC, LP. De Moivre came to England
as a refugee aged about 20 and, although he gained recognition as a
mathematician and became a fellow of the Royal Society, he never obtained an
academic appointment. De Moivre had read Huygens’s book
before leaving France
but his first paper on probability was published in 1711. In 1718 he
published The Doctrine of Chances: or, a Method of Calculating the
Probability of Events in Play (title
page). He published other pieces on probability, putting the
results into new editions of the Doctrine; the third appeared in 1733.
The book began with an influential definition of probability.
De Moivre obtained the normal approximation to
the binomial distribution (a forerunner of the central limit theorem)
and almost found the Poisson distribution. His technical innovations included the use of probability
generating functions, which
he used to find the distribution of the sum of
uniform variables. De Moivre also wrote about life
insurance mathematics when analysing annuities.
See survival function.
Todhunter ranked de Moivre’s contributions very highly, “it will not be
doubted that the Theory of Probability owes more to him than to any other
mathematician, with the sole exception of Laplace.”
(p. 139) Bellhouse & Genest have translated and augmented Maty’s
biography of De Moivre: Statistical
Science 2007 Project
Euclid. See Sheynin ch. 4, Life
& Work and Todhunter
ch.IX (pp. 135-93). See Stigler (1986): Chapter
2, Probabilists and the Measurement of Uncertainty. See also Hald (1990): Chapter 22,
De Moivre and the Doctrine of Chances 1718, 1738 and 1756; Chapter
25, The Life Insurance Mathematics of
de Moivre and Simpson.
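De Moivre’s normal approximation to the binomial is easy to check numerically in modern terms. The sketch below compares the exact binomial probability with the normal density at the same point (the de Moivre-Laplace local limit); the function names are illustrative.

```python
from math import comb, exp, pi, sqrt

def binomial_pmf(n, k, p=0.5):
    """Exact binomial probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def normal_approx(n, k, p=0.5):
    """De Moivre-Laplace approximation: N(np, np(1-p)) density at k."""
    mu, var = n * p, n * p * (1 - p)
    return exp(-(k - mu) ** 2 / (2 * var)) / sqrt(2 * pi * var)

n, k = 1000, 520
print(binomial_pmf(n, k))    # exact value
print(normal_approx(n, k))   # very close for large n
```

For n = 1000 the two numbers agree to several decimal places, which is why the approximation mattered so much before mechanical computation.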
Daniel Bernoulli (1700-1782) Mathematician and physicist MacTutor References SC, LP. Daniel
Bernoulli, a nephew of Jakob Bernoulli, was educated at the
University of Basel where his father Johann
was a professor. Daniel studied medicine—his father insisted—although
subsequently his father agreed to teach him mathematics. Daniel worked in St.
Petersburg and at the University
of Basel. He wrote nine
papers on probability, statistics and demography but is best remembered for
his "Exposition of a New Theory on the Measurement of Risk" (1737):
his theory of choice was based on moral expectation (or expected utility). The theory had a solution for the St.
Petersburg paradox, which
had exposed the difference between the mathematical expectation of a prospect and its value to ‘me’: its
expectation is infinite but its value to me is not. In a prize-winning paper
of 1735 Bernoulli tested for the random distribution of planetary orbits.
Bernoulli devised an urn model for treating the diffusion of liquids but,
although it was discussed by Laplace, such probabilistic
models only became common in the late C19.
Another contribution, the Essai
(pp.1-45) of 1766, was an investigation of the consequences of inoculation
against smallpox; see Blower
2004. This paper contained a model of epidemics of the kind that
Ross and McKendrick developed in the C20 (below). Bernoulli also described the method known since Fisher as maximum
likelihood in a 1777 paper “The most probable choice
between several discrepant observations and the formation therefrom of the
most likely induction.” See Life
& Work and Todhunter
ch. IX (pp. 213-38). See also Hald (1990).
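The St. Petersburg contrast between mathematical and moral expectation can be made concrete. Truncating the game after a maximum number of tosses, the mathematical expectation grows without bound while the expected utility converges; log utility below stands in for Bernoulli’s moral expectation, and the function names are invented for illustration.

```python
from math import log

# St. Petersburg game: toss a fair coin until the first head;
# if the head appears on toss k, the payoff is 2**k ducats.

def expected_payoff(max_tosses):
    """Truncated mathematical expectation: one ducat per allowed toss,
    so it grows without bound as the truncation is removed."""
    return sum(2**-k * 2**k for k in range(1, max_tosses + 1))

def expected_log_utility(max_tosses):
    """Bernoulli-style 'moral expectation' with log utility: converges."""
    return sum(2**-k * log(2**k) for k in range(1, max_tosses + 1))

print(expected_payoff(50))        # 50.0: diverges linearly in the truncation
print(expected_log_utility(50))   # approaches 2*log(2), about 1.386
```

The finite moral expectation (certainty equivalent of 4 ducats under log utility) was Bernoulli’s resolution of the paradox.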
1750-1800 Probability established itself in physical science, with astronomy its most developed branch. The most enduring
of these applications to astronomy treated the combination of observations. The resulting theory of errors was the most important ancestor of modern
statistical inference, particularly of estimation theory.
The major mathematician/astronomers, including Daniel Bernoulli and Boscovich,
treated the problem of combining astronomical observations, “in order to
diminish the errors arising from the imperfections of instruments and the
organs of sense” in the words of Thomas
Simpson. Simpson introduced the idea of postulating an error distribution. See Hald (1998, Part I Direct
Probability, 1750-1805) and Richard J. Pulskamp’s Sources in the History of
Probability and Statistics.
Early tests of significance
were developed, mainly for use in astronomy; see Daniel Bernoulli
and also John
Michell (1767) (Crossley), who calculated the odds that the
Pleiades is a system of stars and not a random agglomeration. See Hald
(1998): Part I Direct Probability, 1750-1805).
Probability statements about the parameter of the binomial distribution—ancestors of the
confidence interval—were produced by Lagrange and Laplace in the 1780s.
In the 1770s Condorcet
started publishing on social mathematics, largely the application of
probability to the decisions of juries and other assemblies. His work had a
strong influence on Laplace and Poisson. Other French authors from this period included D’Alembert
and Buffon: the former is best remembered for his critical remarks on probability and the
latter for his needle experiment.
An important development in
probability theory was work on conditional
probability with applications to inverse probability or Bayesian inference by
Bayes and Laplace. See Hald
(1998): Part II Inverse Probability.
See also Todhunter ch. XI-XIX and Stigler (1986): Part I, The Development of Mathematical Statistics in Astronomy and
Geodesy before 1827. For this period and the next see also Lorraine Daston’s
Classical Probability in the Enlightenment.
No authentic portrait of Bayes is known
(for an unlikely possibility see here)
Thomas Bayes (1702-1761) Clergyman and mathematician. MacTutor
References SC, LP, ESM. Bayes
attended the University
of Edinburgh to prepare
for the ministry but he studied mathematics at the same time. In 1742 Bayes
became a fellow of the Royal Society: the certificate
of election read “We propose and recommend him as a Gentleman of
known Merit, well Skilled in Geometry and all parts of Mathematical and
Philosophical Learning and every way qualified to be a valuable Member of the
Same.” Bayes wrote only one paper on probability, the posthumously published An Essay
towards solving a Problem in the Doctrine of Chances (1763). (For a
statement of the “problem”, see Bayes.) The paper was submitted to the Royal Society by Richard
Price who added a post-script of his own in which he discussed a
version of the rule of
succession. In the paper Bayes refers only to de Moivre and there has been much speculation as to where the
problem came from. Bayesian methods were widely used in the C19, through
the influence of Laplace
and Gauss, although both had second thoughts. Their
Bayesian arguments continued to be taught until they came under heavy attack
in the C20 from Fisher and Neyman. In
the 1930s and ’40s Jeffreys was an isolated figure in
trying to develop Bayesian methods. From the 50s onwards the situation
changed when Savage and others made Bayesianism
intellectually respectable and recent computational advances have made Bayesian methods technically feasible. From the
early C20 there has been a revival of interest in Bayes
himself and he has been much more discussed than ever before. See Bellhouse
biography, Sheynin ch. 5, Life
& Work and Todhunter
ch. XIV (pp. 294-300). See Stigler (1986): Chapter
3, Inverse Probability and Hald (1998): Chapter 8, Bayes, Price
and the Essay, 1764-1765. There is a major new biography, A. I. Dale Most
Honorable Remembrance: The Life and Work of Thomas Bayes.
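The rule of succession that Price discussed can be illustrated with a modern Bayesian computation: with a uniform prior on the success probability, after s successes in n trials the posterior predictive probability of a further success is (s+1)/(n+2). The sketch below does the integral in exact arithmetic; the helper names are invented.

```python
from fractions import Fraction
from math import factorial

def rule_of_succession(s, n):
    """P(next trial succeeds | s successes in n trials, uniform prior).
    The ratio of two beta integrals, which reduces to (s+1)/(n+2)."""
    def beta_int(a, b):
        # Integral of p**a * (1-p)**b over [0, 1] equals a! b! / (a+b+1)!
        return Fraction(factorial(a) * factorial(b), factorial(a + b + 1))
    return beta_int(s + 1, n - s) / beta_int(s, n - s)

print(rule_of_succession(9, 10))   # 5/6
print(rule_of_succession(0, 0))    # 1/2: total ignorance
```

The closed form (s+1)/(n+2) is the rule Laplace later made famous with the sunrise example.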
Pierre-Simon Laplace (1749-1827) Mathematician and physicist MacTutor
References SC, LP, ESM. Laplace
wrote on probability over a period of more than 50 years. His 1774 Mémoire sur la probabilité des
causes par les évènemens gave a Bayesian
analysis of errors of measurement. His Théorie
Analytique des Probabilités (title
page) (1812 and further editions in 1814, 1820 and 1825) was
by far the biggest thing in probability yet. Laplace
made many contributions, producing results like the central limit theorem
(see also the normal
distribution) and developing tools including the probability generating function
and the characteristic
function. His system was based on classical probability
but the superstructure outgrew the foundations. In
Britain Laplace was admired by his C19 readers including De
Morgan (who reviewed the Théorie Analytique) and Edgeworth but, in the C20, his thought tended to be reduced
to the popular Philosophical Essay on Probability and
discussion often focussed on debatable items like the rule of succession. Fisher, for instance, had a very
contracted view of Laplace’s work. Although Laplace gets far more space in Todhunter
than any other author, the coverage of Laplace’s
estimation theory ends in 1814. In early C20 France (see Lévy)
Laplace seems also to have been largely
forgotten. Hald (1998) does justice to Laplace by giving him about 400
pages and by presenting subsequent work as a series of footnotes to Laplace. See
Sheynin ch. 5, Life
& Work and Todhunter
ch. XX (pp. 464-614). See Stigler (1986): Chapter
3, Inverse Probability; Chapter 4, The Gauss-Laplace Synthesis.
Laplace’s works are available on Gallica.
1800-1830 The contrasting
figures of Laplace and Gauss dominate this period. Laplace
covered the entire range of probability and statistics, while Gauss treated
only the theory of errors.
Work on the theory of errors reached a climax
with the introduction of the method
of least squares. The method was published by Legendre
in 1805 and within twenty years there were three probability-based
rationalisations, Gauss’s Bayesian argument (see uniform prior), Laplace’s argument based on the central limit theorem
and Gauss’s Gauss-Markov
theorem. Work continued through the C19 with numerous mathematicians and
astronomers contributing, including Cauchy
(Cauchy distribution), Poisson,
Bessel (probable error), Encke, Peters
(Peters' method), Lüroth
and Newcomb. (The Cauchy distribution
first appeared as an awkward case for the theory of errors.) Pearson, Fisher
and Jeffreys were taught the theory of errors by
astronomers. In the C20 astronomers, including Eddington,
also investigated the statistical properties of constellations, picking up from
the middle of the C18. (above)
Gauss found a second important
application of least squares in geodesy. Geodesists
made important contributions to least squares, particularly on the
computational side—not surprisingly as the calculations could be on an
industrial scale. The eponyms Gauss-Jordan
(elimination) and Cholesky (MacTutor)
honour later geodesists. Helmert
and Paolo Pizzetti
were geodesists who contributed to the theory of errors. At least one
important C20 statistician started as a surveyor: Frank
Yates, Fisher’s colleague and successor at Rothamsted.
In Britain the
first census of the population was taken in 1801. It ended a controversy about
the size of the population that began when Bayes’s friend Price argued that the population had been falling in the
C18. Numerous writers, including Eden,
came up with their own estimates.
Around this time William Playfair
was finding new ways of representing data graphically but nobody was paying
attention. Techniques slowly accumulated over the next 150 years without the
idea of graphical statistics as a study in its own right gaining ground. That
idea is quite recent and mainly associated with Tukey.
The age of the academies
was over and from now on the main advances took place in universities. The
French education system was transformed in the course of the Revolution and the
C19 saw the rise of the German university.
See Stigler (1986): Part
I, The Development of Mathematical Statistics in Astronomy and Geodesy before
1827 and Hald (1998): Part III The Normal Distribution, the
Method of Least Squares and the Central Limit Theorem. See Life
& Work. See also L. Daston (1988) Classical
Probability in the Enlightenment.
Carl Friedrich Gauss (1777-1855)
Mathematician, physicist and geodesist. MacTutor
References SC, LP. Gauss is generally regarded
as one of the greatest mathematicians of all time and his contributions to
the theory of errors
were only a small part of his total output. Gauss spent most of his working
life at the University of Göttingen, which became the main centre for
mathematics in Germany.
Initially Gauss was interested in treating astronomical observations but
later he became involved in geodesy. Gauss used the method of least squares
for which he gave two rationalisations. The first, in The Theory of the
Motion of Heavenly Bodies moving around the Sun in Conic Sections (1809),
assumed normally distributed errors and used a Bayesian argument with a uniform prior on the
coefficients; the normal or Gaussian
curve was derived from the principle of the arithmetic mean. Gauss
departed from this Bayesian position in 1816 when he investigated the ‘efficiency’ of
different estimators of precision. In the Theory of the combination of
observations least subject to error (1821/3) he presented the Gauss-Markov theorem.
Gauss’s way of writing the Gaussian distribution is described under Symbols associated with the normal
distribution; the associated terminology is described in mean error and modulus. Gauss’s
influence on the combination of observations in astronomy and geodesy was
very strong. One of his followers was the astronomer Bessel
(see probable error).
In the early C20 Cambridge astronomers taught Gauss Mark I to Fisher
and Jeffreys amongst others. See Sheynin ch. 9, Life &
Work. See Stigler (1986): chapter 4, The Gauss-Laplace
Synthesis and Hald (1998): Chapter 21, Gauss’s Theory of Linear
Unbiased Minimum Variance Estimation, 1823-1828.
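In its simplest modern setting, fitting a line to data, the method of least squares reduces to solving the two normal equations. The sketch below is a minimal modern illustration, not Gauss’s or Legendre’s own computation, and the function name is invented.

```python
def least_squares_line(xs, ys):
    """Fit y = a + b*x by minimising the sum of squared residuals,
    solving the 2x2 normal equations directly."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Normal equations:  n*a + sx*b = sy   and   sx*a + sxx*b = sxy
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a = (sy - b * sx) / n
    return a, b

# Noise-free observations recover the coefficients exactly.
a, b = least_squares_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)   # 1.0 2.0
```

With real (noisy) observations the same equations give the combination of observations that Legendre published and Gauss rationalised.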
This period saw the emergence of the statistical society, which has been on
the stage ever since, although the meaning of “statistics” has changed, and the
beginning of a philosophical literature on probability. It saw also the
beginning of the most glamorous branch of empirical time series analysis, the
study of the sunspot cycle. Since the 1830s there have
been statistical societies, including the London
(Royal) Statistical Society Wikipedia
and the American
Statistical Association (now the world’s largest). The Statistical Society of Paris
was founded in 1860. The International
Statistical Institute Wikipedia
was founded in 1885 although there had been international congresses from 1853.
The societies collected facts about human populations and in France André-Michel
Guerry mapped the moral statistics of the country. Quetelet
was a catalyst in the formation of the London Society but its intellectual
ancestors were not the mathematicians Laplace or Condorcet, but Graunt and, more
recently, Sinclair, Arthur Young
and F. M. Eden, who collected facts about society in the interests of
“improvement”. The facts were to complement, or perhaps be an antidote to, the
theoretical economics of the day; see the journal’s mission statement.
Florence Nightingale’s (Life & Work) later efforts to make record keeping and the
statistical analysis of those records part of hospital routine belonged to this
tradition. Among the founders of the society were the mathematician Babbage,
who also wrote on insurance and produced a large work, Economy of Machinery
and Manufactures, the political economist “Population” Malthus, and the actuary
Gompertz, known for his Law of Mortality. The technically most sophisticated work
presented to the society in its first decades was William Farr’s
analysis of vital statistics.
Probability publications did not appear in the Society’s journal until the 1880s
when Edgeworth started publishing. See Mathematics in the
London/Royal Statistical Society 1834-1934.
Since 1840, or so, there
has been a philosophical literature on probability. The English
literature begins with the extensive discussion of probability in John Stuart
Mill’s System of Logic (1843). This was followed by The
Logic of Chance (1862) of John
Venn, the Principles of Science (1873) of W. Stanley Jevons
and the Grammar of Science (1892) of Karl Pearson.
The American scientist/philosopher C.S. Peirce
wrote extensively on probability, although he was not much read. There was an
overlapping literature on logic and probability.
De Morgan can be placed here as well as Boole
(LP) whose An Investigation into the
Laws of Thought, on which are founded the Mathematical Theories of Logic and
Probabilities (1854) contained a long discussion of probability. Later
figures are mentioned below. There were German and French
literatures as well but philosophical probability was less international than
mathematical probability. See Porter’s Rise of Statistical Thinking.
In 1843 Schwabe observed that sunspot activity is periodic. There followed
decades of research, not only in solar physics but in terrestrial magnetism,
meteorology and even economics examining series to see if their periodicity
matched that of the sunspots. Even before the sunspot craze there was intense
interest in periodicity in meteorology, tidology and other branches of observational physics and, by
the end of the century, seismology was becoming important. Both Laplace and Quetelet had analysed meteorological
data and Herschel had written a book on the
subject. The techniques in use varied from the simple, such as the Buys
Ballot table, to sophisticated forms of harmonic analysis.
At the end of the century the physicist Arthur Schuster
introduced the periodogram.
However, by then a rival form of time series analysis, based on correlation
and promoted by Pearson, Yule, Hooker
and others, was
taking shape. For an account see J. L. Klein (1997) Statistical
Visions in Time, Cambridge:
Cambridge University Press.
See I. Hacking The Taming of Chance (1990) and T.
M. Porter The Rise of Statistical Thinking 1820-1900 (1986).
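Schuster’s periodogram is easy to illustrate. The sketch below is my own (not taken from any of the works cited here); the function name and the toy “sunspot-like” series are assumptions for illustration only.

```python
import numpy as np

def schuster_periodogram(x, periods):
    """Schuster's periodogram intensity at each trial period p:
    I(p) = (2/N) * [ (sum x_t cos(2*pi*t/p))^2 + (sum x_t sin(2*pi*t/p))^2 ].
    Large values of I(p) suggest a hidden periodicity of period p."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()              # remove the mean so it does not dominate
    t = np.arange(len(x))
    out = []
    for p in periods:
        w = 2 * np.pi / p
        c = np.sum(x * np.cos(w * t))
        s = np.sum(x * np.sin(w * t))
        out.append((2.0 / len(x)) * (c * c + s * s))
    return np.array(out)

# A toy series with an 11-step cycle plus noise.
rng = np.random.default_rng(0)
t = np.arange(220)
series = 50 + 20 * np.sin(2 * np.pi * t / 11) + rng.normal(0, 2, t.size)
periods = np.arange(5, 20)
intensity = schuster_periodogram(series, periods)
best = periods[np.argmax(intensity)]
```

The intensity peaks at the true period; Schuster used essentially this calculation to assess the reality of suspected periodicities.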
Lambert Adolphe Jacques Quetelet (1796-1874)
Astronomer and statistician. MacTutor
SC, LP, ESM. Adolphe Quetelet studied at the University of Ghent
but his career was centred on Brussels.
His great energies were first concentrated on establishing an observatory
there. In 1824 he went to Paris
for three months to learn about such things and met the great French
scientists; Quetelet seems to have learnt about probability from Fourier. He returned an enthusiast for probability and
proposed improvements in census taking.
His book introducing “social
physics”, Sur l'homme et le
développement de ses facultés, essai d'une physique sociale (1835),
introduced the “average man” (l'homme moyen). His very widely read Letters
on Probability (1846) publicised the use of the normal distribution, not
as an error law, but for describing the distribution of measurements.
Quetelet was a tireless scientific entrepreneur both at home and
internationally, e.g. he played an important part in establishing the London
Statistical Society (above). His work was not well
received by the social scientists of the C19, although his initiative was
admired by the economist Stanley Jevons,
and Galton continued his work. In the early C20 J. M.
Keynes (Treatise on Probability) wrote of Quetelet, “There is scarcely
any permanent, accurate contribution to knowledge, which can be associated
with his name. But suggestions, projects, far-reaching ideas he could both
conceive and express, and he has a very fair claim, I think, to be regarded
as the parent of modern statistical method.” See Life &
Work. Coven’s A History of
Statistics in the Social Sciences is a study of Quetelet. See Stigler
(1986): Chapter 5, Quetelet’s Two Attempts. Hald (1998) 26.3. Quetelet
on the Average Man, 1835, and on the Variation around the Average, 1846.
Important applied fields opened up in this period. Probability found a
major new application in physical science, to the theory of gases, which
developed into statistical mechanics. Problems in statistical mechanics were behind many
of the probability advances of the early C20. The statistical study of heredity developed into
biometry and many of the advances in statistical theory
were associated with this subject. There were important geographical
changes, as important work in probability
started to come from Russia
and important statistical work from England.
In 1860 James
Clerk Maxwell used the error curve (normal distribution) in the
theory of gases; he seems to have been influenced by Quetelet via
Herschel’s review of the Letters on Probability. Boltzmann and Gibbs
developed the theory of gases into statistical mechanics. Galton
inaugurated the statistical study of heredity,
work continued well into the C20 by Pearson and Fisher.
Correlation was the most distinctive contribution of this “English” school. See
Stigler (1986): Part III, A Breakthrough in Studies of Heredity.
By contrast, the so-called
“continental direction” investigated the appropriateness of simple urn models
for treating birth and death rates by considering the stability of the series
of rates over time. The key work was Wilhelm Lexis’s Theorie der Massenerscheinungen
in der menschlichen Gesellschaft (1877). Bortkiewicz, Markov, Chuprov and Anderson all worked in this tradition. See Stigler
(1986): Chapter 6, Attempts to revive the Binomial, C. C. Heyde & E.
Seneta I. J. Bienaymé: Statistical Theory Anticipated, 1977 and Sheynin ch. 15.1.
‘Higher’ statistics entered
psychology and economics.
For psychology see Fechner. In economics W. Stanley Jevons
(SC) saw himself as continuing the work of the political
arithmeticians of 1650+. In the intervening two centuries
much had been done and Jevons’s work on index numbers was
inspired by the theory of errors, while his research on economic time series
was inspired by the work of meteorologists on seasonal variation and of
physicists on the solar cycle and its terrestrial correlates (see above). Jevons also tried to link his mathematical economic
theory (see utility)
to statistical analysis—a project revived in the econometrics of the C20.
See Stigler (1986) and T. M.
Porter The Rise of Statistical Thinking 1820-1900 (1986).
Ludwig Boltzmann (1844-1906)
LP. Boltzmann, with Gibbs,
was responsible for transforming Maxwell’s
probabilistic theory of gases into statistical mechanics. Boltzmann was
awarded a doctorate from the University
of Vienna in 1866 for a
thesis on the kinetic theory of gases. He had appointments at the
universities of Graz, Leipzig
as well as Vienna.
Statistical mechanics required solutions to problems in distribution theory
and also generated conceptual problems. Boltzmann gave the χ2
distribution for 2 and 3 degrees of freedom (1878) and for n (1881). Ernst
Abbe (LP), a physicist working in the theory of errors,
had already obtained the distribution in 1862, and Pearson
was to obtain it again in 1900. One of the conceptual innovations was entropy. Zermelo
argued that the irreversibility in Boltzmann’s thermodynamics
contradicted the recurrence properties of dynamic systems. Boltzmann’s
student Ehrenfest devised the Ehrenfest urn model to
show that the contradiction was only apparent. One of Ehrenfest’s students
was Uhlenbeck, whose 1930 paper (with Ornstein) on Brownian motion became part of the probability
literature. Boltzmann thought of
the proper average values to identify with macroscopic features as being
averages over time of quantities calculable from microscopic states. He
wished to identify the phase averages with such time averages. In the 1930s Birkhoff
and von Neumann produced ergodic
theorems that addressed the problem. The Stanford Encyclopedia has 2
relevant articles: J. Uffink “Boltzmann's
Work in Statistical Physics” and L. Sklar “Philosophy
of Statistical Mechanics.”
See also von Plato passim.
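The Ehrenfest urn is simple enough to simulate directly. This sketch is illustrative only (the function name and parameters are my own): the count in one urn drifts to equilibrium, apparently irreversibly, even though every state of the chain is recurrent.

```python
import random

def ehrenfest(n_balls=100, steps=20000, seed=1):
    """Simulate the Ehrenfest urn: at each step pick one of the n_balls
    uniformly at random and move it to the other urn.  Starting with all
    balls in urn A, the count in A drifts toward n_balls/2, yet the chain
    is recurrent, reconciling irreversible-looking behaviour with
    Poincare-style recurrence."""
    rng = random.Random(seed)
    in_a = n_balls            # start with all balls in urn A
    history = []
    for _ in range(steps):
        if rng.random() < in_a / n_balls:
            in_a -= 1         # the chosen ball was in A: move it to B
        else:
            in_a += 1         # the chosen ball was in B: move it to A
        history.append(in_a)
    return history

hist = ehrenfest()
late = hist[len(hist) // 2:]     # discard the transient
avg = sum(late) / len(late)      # close to the equilibrium value 50
```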
Gustav Theodor Fechner (1801-1887) Wikipedia Physicist and psychologist. SC. Fechner went to the University of Leipzig
to study medicine but did not qualify as a doctor. He developed a
strong interest in the relationship between mind and matter and he switched
to physics; his impressive experimental work earned him a chair in physics in
1834. In the mid 50s he started the experiments that formed the basis of his Elemente
der Psychophysik (Elements of Psychophysics) (1860). In his psychophysics Fechner
emphasised the Weber-Fechner
law relating sensation to stimulus. Stigler emphasises the
book’s contribution to experimental method and assesses the treatment of
experimental design as the “most comprehensive” before Fisher’s
Design of Experiments (1935). One of the techniques Fechner introduced
was an ancestor of probit
analysis. In all this work Fechner used probability ideas that he
knew from his work in physics. His second major work was the
posthumously published Kollektivmasslehre (1897). The latter
anticipates the ideas of von Mises on collectives. Hermann Ebbinghaus
was inspired by Fechner’s Psychophysics to start his own experimental
work on memory; these researches were published in his Über das Gedächtnis (1885). See Life
& Work (for both Fechner and Ebbinghaus) and Stigler
(1986): Chapter 5, Psychophysics as a Counterpoint. See also O.
Sheynin (2004) Fechner as a Statistician, British
Journal of Mathematical and Statistical Psychology, 57, 53-72.
Francis Galton (1822-1911) Man of science MacTutor
SC, LP, ESM. After studying mathematics at Cambridge University and
medicine in Birmingham and London, Galton spent
some years exploring Africa; his eminence as
an African explorer and geographer led to his election to the Royal
Society in 1860. Galton became interested in the phenomena
of heredity in the 1860s and most of his contributions to statistics arose
out of that study. Apart from his books Hereditary
Genius (1869) and Natural Inheritance (1889)
he wrote many articles. His cousin Charles Darwin,
whom he advised on statistical matters, was an important influence, as was Quetelet, although he disagreed with him on many points. (Like
Darwin, Galton was rich enough to be a gentleman scholar.) The normal distribution
played an important part in Galton’s work. He is most remembered for
introducing the methods of correlation and regression.
He often involved other, better, mathematicians, including George
Darwin, Pearson, Edgeworth and Sheppard
(of Sheppard's corrections),
in his problems. His work on the lognormal
distribution and branching
processes came about from his posing problems to mathematician
friends. Galton contributed a large number of terms to statistics, including
many of those used in elementary statistics, e.g. ogive, percentile and inter-quartile range. Apart from his influence on biometry, Galton had a
strong influence on the development of psychology,
especially in Britain,
both because he wrote about psychology
and because the statistical tools he developed were taken up by psychologists. See Life
& Work. Much of Galton’s vast output is available on Gavan
Tredoux’s Francis Galton.
There are two recent biographies: N. W. Gillham A Life of Sir Francis
Galton: From African Exploration to the Birth of Eugenics (2002) and M.
G. Bulmer Francis Galton: Pioneer of Heredity and Biometry (2003). See
Stigler (1986): Chapter 8, The English Breakthrough: Galton.
1880-1900 In this period the English statistical school
took shape. Pearson was the
dominant force until Fisher displaced
him in the 1920s. The school dominated statistics until the Second World War.
T. Schweder’s Early Statistics in the Nordic Countries considers
why this did not happen in Scandinavia.
Galton introduced correlation and a
theory was rapidly developed by Pearson, Edgeworth, Sheppard and Yule. Correlation was a major departure from the statistical
work of Laplace and Gauss, both as a technique and because of the applications
it made possible. It became widely used in biology, psychology and social
science. In economics Edgeworth developed
some of Jevons’s ideas, most
notably on index numbers. However, economic statistics in Britain was more closely tied to
official statistics or financial journalism, and Newmarch (1820-82) and Giffen (1837-1910) were more representative than Jevons or
Edgeworth. In Italy
Vilfredo Pareto discovered a statistical
regularity in the distribution of income; see the Pareto distribution.
F. Y. Edgeworth (1845-1926) Economist and statistician. MacTutor
SC, LP, ESM. Edgeworth studied classics at
Trinity College Dublin and Balliol College Oxford. From around 1880 he
followed dual careers in economics and in statistics. Edgeworth seems to have
been self-taught in mathematics and he made a thorough study of the subject
and remained very well read. He began in statistics by subjecting the casual
statistical methods of Jevons to rigorous examination and started what turned
out to be a long involvement with index numbers (see
“Money” in his Papers
relating to Political Economy, vol. 2). However, most of his
extensive publications in statistical theory were not motivated by
economic applications, or direct applications of any kind. In 1892
Edgeworth, prompted by Galton, examined correlation and
methods of estimating correlation coefficients. Another concern, which led to
a stream of papers, was with generalisations of the normal distribution, as
in e.g. his 1905 paper “The law of error”. The Edgeworth expansions
that came from this research are now associated with distributions of
estimators and test statistics, but Edgeworth originally envisaged these
expansions being used for data distributions, as an alternative to the Pearson curves.
Edgeworth’s starting point was Laplace. Much of his work
was not followed up, like his 1908/9 papers “On the probable errors of
frequency-constants” which anticipated some of Fisher’s large sample theory for maximum likelihood.
Unlike his contemporaries, Pearson in statistics and Alfred Marshall in
economics, Edgeworth founded no school. From 1891 he was professor of
political economy at Oxford.
He had no students in Statistics and his only follower was Arthur Bowley,
whose reputation rests on work in economic statistics and social surveys. See Life
& Work and Francis
Ysidro Edgeworth. See Stigler (1986):
Chapter 9, The Next Generation: Edgeworth. His statistics papers are
collected in the 3-volume set F. Y. Edgeworth, Writings in Probability,
Statistics, and Economics edited by Charles Robert McCann, Jr.
Karl Pearson (1857-1936)
Biometrician, statistician & applied mathematician. MacTutor
SC, LP, ESM. Karl Pearson read mathematics at Cambridge but made his career at University
College London. Pearson was an established applied mathematician when
he joined the zoologist W. F. R.
Weldon and launched what became known as biometry; this found
institutional expression in 1901 with the journal Biometrika.
Weldon had come to the view that “the problem of animal evolution is
essentially a statistical problem” and was applying Galton’s statistical methods. Pearson’s contribution consisted
of new techniques and eventually a new theory of statistics based on the Pearson curves, correlation, the method of
moments and the chi square test. Pearson
was eager that his statistical approach be adopted in other fields and
amongst his followers was the medical statistician Major Greenwood.
Pearson created a very powerful school and for decades his department was the
only place to learn statistics. Yule, Irwin, Wishart
and F. N. David were among the distinguished statisticians who started their
careers working for Pearson. Among those who attended his lectures were the
biologist Raymond Pearl, the economist H. L. Moore,
the medical statistician Austin Bradford
Hill and Jerzy Neyman;
in the 1930s Wilks
was a visitor to the department. In France Lucien March
was a follower. Pearson’s influence extended to Russia where Slutsky and others
(see minimum chi-squared method)
were interested in his work. Pearson had a great influence on the language
and notation of statistics and his name often appears on the Words
pages and Symbols
pages—see e.g. population,
histogram and standard deviation. When
Pearson retired, his son E.
S. Pearson inherited the statistics part of his father’s empire—the
eugenics part went to R. A. Fisher. Under ESP (who retired
in 1961) and his successors the department continued to be a major centre for
statistics in Britain.
M. S. Bartlett went
there as a lecturer after graduating from Cambridge in 1933 (his teacher was Wishart)
and again as a professor when ESP retired. For more on KP see Karl Pearson: A Reader’s Guide. See Stigler
(1986): Chapter 10, Pearson and Yule.
In the years before the Great War of 1914-18 probability and statistics were
expanding in all directions. During the war research in statistics and
probability almost stopped as people went into the armed services or did other
kinds of war work. Pearson, Lévy and Wiener worked in ballistics, Jeffreys in meteorology and Yule in
administration. For the mathematicians’ traditional role in war, see The Geometry of War.
Hilbert proposed a set of problems
for the C20. The 6th was, “to treat … by means of axioms,
those physical sciences in which mathematics plays an important part; in the
first rank are the theory of probabilities and mechanics.” Measure
theory, which would have a key
role in the axiomatisation of probability, was being created by Borel, Lebesgue
and others—see below.
From different subjects
came contributions that eventually found a place in the theory of stochastic
processes. In physics Einstein (see
History of Noise) worked on Brownian motion. Bachelier
(see Taqqu) developed a similar model applied to financial
speculation—that application was a sleeper until the 1970s.
The actuary Lundberg developed a theory of collective risk. Malaria and the migration of mosquitoes were
behind Pearson’s interest in the random walk problem.
Mathematical models of epidemics were developed
by Ronald Ross
and A. G. McKendrick
without reference to the earlier work of Daniel Bernoulli.
Mendel did not use probability in his work on genetics
(published 1866) but his ideas were
probabilised as Pearson, Yule and Fisher
investigated how far his principles could rationalise the findings of the
biometricians. Correlation began to be
important in psychology, largely through Charles Spearman
(1863-1945). Amongst his contributions to statistics were rank correlation and factor analysis. Godfrey Thomson
was a severe critic of Spearman’s factor analysis of intelligence. In the 1930s
Thurstone developed a multiple factor analysis.
In economics, especially in the United
States, quantitative methods became more
prominent. The most important figures were Warren Persons, Irving Fisher,
Wesley Mitchell and H. L. Moore.
Most of their work would now be classified as time series analysis.
Applications of probability to engineering begin with Erlang’s
work on congestion in telephone systems, the ancestor of modern queuing theory.
Institutional developments include the creation in 1911 of the Department of Applied Statistics at UCL
headed by Pearson. Also in London,
at the London School of Economics, Bowley
became the first (full-time) Professor of Statistics in Britain. At Cambridge a University Lectureship in Statistics was
created in 1912. Yule, who got the job, might be
called the first modern statistician—his expertise was in statistics
(rather than in mathematics more broadly or in science like astronomy or
biology) and he applied this expertise to anything that interested him.
See Hald (1998, Part IV)
and von Plato (ch. 3) “Probabilities in Statistical Physics.” For
developments in economics see M. S. Morgan’s A History of Econometric Ideas.
G. Udny Yule (1871-1951) Statistician. MacTutor
Wikipedia SC, LP, ESM. After
training as an engineer at University College London and studying in Germany,
Yule returned to work for Karl Pearson in 1893. He was
soon contributing to the theory of correlation and regression and after
1900 he developed a parallel theory of association for
attributes. Yule applied Pearson’s statistical techniques to social problems—see e.g. the entry on Pearson curves—and,
with Edgeworth and Bowley,
he was one of the few members of the Royal Statistical Society
interested in mathematical statistics. Yule had broad interests and
his collaborators included the agricultural meteorologist R. H. Hooker,
the medical statistician Major Greenwood
(see negative binomial
distribution for their study of the incidence of disease and
accidents) and the agriculturalist (Sir) Frank Engledow. Yule’s sympathy
towards the newly rediscovered Mendelian theory of genetics led to several
papers; one involved developing the minimum chi-squared method
of estimation. In the 1920s he wrote
important papers on time
series analysis: “On the time-correlation problem” (1921) was a
critique of the variate
difference method; “Why Do We Sometimes Get Nonsense Correlations
between Time-series?” (1926) investigated a form of spurious correlation;
“On a Method of Investigating Periodicities in Disturbed Series, with Special
Reference to Wolfer's Sunspot Numbers” (1927) used an autoregressive model
in place of the usual periodic trend of harmonic analysis; see periodogram
analysis and Yule-Walker
equations (above). Yule taught at Cambridge
for nearly 20 years yet he seems to have had little influence on the students
there. His successor, John
Wishart, had more impact. Although Cambridge
was the outstanding centre in Britain
for training mathematicians, statistics was slow in becoming established
there: see Cambridge
history. Yule’s Introduction to the Theory of Statistics
(1910) was influential and widely-used, especially after it was updated by Maurice
Kendall in 1937. See Stigler
(1986): Chapter 10, Pearson and Yule.
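Yule’s 1927 idea can be sketched in a few lines. This is an illustration of the method, not a reconstruction of his computation: fit a second-order autoregression by solving the Yule-Walker equations, here on simulated data with coefficients close to those Yule reported for the sunspot numbers (function name and simulation are my own assumptions).

```python
import numpy as np

def fit_ar2_yule_walker(x):
    """Fit x_t = a1*x_{t-1} + a2*x_{t-2} + e_t via the Yule-Walker equations
        r1 = a1 + a2*r1
        r2 = a1*r1 + a2
    where r1, r2 are the sample autocorrelations at lags 1 and 2."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    def acf(k):
        return np.dot(x[:-k], x[k:]) / np.dot(x, x)
    r1, r2 = acf(1), acf(2)
    R = np.array([[1.0, r1], [r1, 1.0]])
    a1, a2 = np.linalg.solve(R, np.array([r1, r2]))
    return a1, a2

# Simulate a stationary AR(2) with known coefficients and recover them.
rng = np.random.default_rng(42)
a1_true, a2_true = 1.34, -0.65   # roughly the values Yule found for sunspots
x = np.zeros(5000)
for t in range(2, x.size):
    x[t] = a1_true * x[t-1] + a2_true * x[t-2] + rng.normal()
a1, a2 = fit_ar2_yule_walker(x)
```

A series generated this way oscillates with a disturbed, roughly periodic appearance, which is exactly the behaviour Yule proposed as an alternative to a strict harmonic trend.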
A. A. Markov (1856-1922) Mathematician. MacTutor
SC, LP. Markov spent his working life at
the University of St. Petersburg.
Markov was, with Lyapunov,
the most distinguished of Chebyshev’s students in probability.
Markov contributed to established topics such as the central limit theorem
and the law of large numbers.
It was the extension of the latter to dependent variables that led him to
introduce the Markov chain.
He showed how Chebyshev’s
inequality could be applied to the case of dependent random
variables. In statistics he analysed the alternation of vowels and consonants
as a two-state Markov chain and did work in dispersion theory. Markov had a
low opinion of the contemporary work of Pearson,
an opinion not shared by his younger compatriots Chuprov
and Slutsky. Markov’s Theory of Probability was an influential textbook. Markov
influenced later figures in the Russian tradition including Bernstein
and Neyman. The latter indirectly paid tribute to Markov’s
textbook when he coined the term Markoff theorem for the result Gauss had obtained in 1821; it is now known as the Gauss-Markov theorem.
J. V. Uspensky’s Introduction to Mathematical Probability (1937) put
Markov’s ideas to an American audience. See Life
& Work. There is an interesting volume of letters, The
Correspondence between A.A. Markov and A.A. Chuprov on the Theory of
Probability and Mathematical Statistics ed. Kh. O. Ondar (1981, Springer). See also Sheynin ch. 14 and G. P. Basharin
et al. The
Life and Work of A. A. Markov.
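Markov’s vowel-consonant analysis is easy to imitate. This sketch uses English text rather than Markov’s data from Pushkin’s Eugene Onegin, and the function name is my own: classify each letter as vowel or consonant and estimate the transition probabilities of the two-state chain.

```python
from collections import Counter

def transition_matrix(text):
    """Estimate the two-state (vowel/consonant) Markov chain transition
    probabilities from a stretch of text, in the spirit of Markov's
    analysis of Eugene Onegin (here applied to English text)."""
    vowels = set("aeiou")
    states = [("v" if c in vowels else "c")
              for c in text.lower() if c.isalpha()]
    counts = Counter(zip(states, states[1:]))   # count adjacent pairs
    probs = {}
    for s in ("v", "c"):
        total = sum(counts[(s, t)] for t in ("v", "c"))
        for t in ("v", "c"):
            probs[(s, t)] = counts[(s, t)] / total if total else 0.0
    return probs

sample = ("it was the extension of the law of large numbers to dependent "
          "variables that led markov to introduce the markov chain")
p = transition_matrix(sample)
# Each row of the estimated matrix sums to one.
```

Markov’s point was that the letters are dependent: the estimated chain assigns very different probabilities to a vowel depending on whether the preceding letter was a vowel or a consonant.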
‘Student’ = William Sealy Gosset (1876-1937)
Chemist, brewer and statistician. MacTutor
SC, LP. Gosset
was an Oxford-educated chemist who worked not in a university but for
Guinness, the Dublin
brewer. Gosset taught himself the theory of errors from the textbooks by Mansfield Merriman
and Airy. His career as a publishing statistician began after he studied for
a year with Karl Pearson. In his first published paper
‘Student’ (as he called himself) rediscovered the Poisson distribution.
In 1908 he published two papers on small sample distributions, one on the
normal mean (see Student's t
distribution and Studentization)
and one on normal correlation (see Fisher’s z-transformation).
Although Gosset’s fame rests on the normal mean work, he wrote on other
topics, e.g. he proposed the variate
difference method to deal with spurious correlation.
His work for Guinness and the farms that supplied it led to work on
agricultural experiments. When his friend Fisher made randomization central
to the design of experiments Gosset disagreed—see his review
of Fisher’s Statistical Methods. Gosset was not very interested in biometry
and the biometricians were not very interested in what he did; the normal
mean problem belonged to the theory of errors and was more closely related to
Gauss and to Helmert
than to Pearson. Gosset was a marginal figure until Fisher
built on his small-sample work and transformed him into a major figure in C20
statistics; for his relations with Fisher, see Fisher
Guide. Another admirer was E.
S. Pearson. For a sample of Gosset’s humour see the entry kurtosis. See Life & Work.
1920-1930 Many of the people who would dominate probability and
statistics over the following decades first made an impact. Of them, the
individual who had the greatest hold over his subject was Fisher
in statistics. The ascendancy of Fisher was also the ascendancy of the
English language. While German was the international scientific language of the
time—and the language of probability—Fisher and his followers rarely referred
to literature in German, believing that that literature ended with Gauss, although Helmert
was later added to the canon. Thus standard works, like those by Czuber,
did not cross the Channel.
In probability, advances included refinements of the central limit theorem
(to which Lindeberg made an important contribution) and the strong law of large numbers
(which went back to Borel
in 1909 and Cantelli
in 1917) and new results including the law of the iterated logarithm.
There were contributions from most countries of Continental Europe, e.g. Mazurkiewicz
from Poland, Onicescu from
Romania and Hostinsky
from Czechoslovakia. In England,
however, there was much less interest in probability: see England and
Continental Probability in the Inter-War Years and the remarkable case of Alan
Turing, who repeated Lindeberg’s work, being unaware of it.
Nevertheless the examiners, Fisher and Abram
Besicovitch, who had studied with Markov, were
impressed by his work and Turing was elected a fellow of his college.
The foundations of probability received much
attention and certain positions found classic expression: the logical
interpretation of probability (degree of reasonable belief) was propounded by
the Cambridge philosophers, W.
E. Johnson, J.
M. Keynes and C.
D. Broad, and presented to a scientific audience by Jeffreys; Keynes was influenced by the German
physiologist/philosopher J. von Kries.
The frequency view was developed by von Mises.
The Modern (Evolutionary)
Synthesis of Mendelian genetics
and Darwinian natural selection required the solution of problems involving
stochastic processes, e.g. branching
processes. However, the work
did not have as much influence on the development of probability theory as
similar work in physics; see Fokker-Planck
equation. The principal contributors to the modern synthesis,
Fisher, J. B. S. Haldane and Sewall Wright (see path
analysis), all contributed to statistics, but Fisher was in a class of his own.
R. A. Fisher generated many new ideas on estimation and hypothesis testing and
his work on the design of
experiments moved that topic from the fringes of statistics to the centre. His Statistical Methods for
Research Workers (1925)
was the most influential statistics book of the century.
W. A. Shewhart ASQ Wikipedia
pioneered quality control, which
became a major industrial application of statistics.
See Hald (1998, ch. 27 and passim) and von Plato (ch. 4-6).
R. A. Fisher
(1890-1962). Statistician and geneticist. MacTutor
SC, LP, ESM. Fisher was the most influential
statistician of the C20. Like Pearson, Fisher studied
mathematics at Cambridge
University. He first
made an impact when he derived the exact distribution of the correlation
coefficient (see Fisher’s
z-transformation). Although the correlation coefficient was a
cornerstone of Pearsonian biometry, Fisher
worked to synthesise biometry and Mendelian genetics; for Fisher’s many disagreements with Pearson, see
Pearson in A
Guide to R. A. Fisher. In 1919 Fisher joined Rothamsted
Experimental Station and made it the world centre for statistical
research. His subsequent more prestigious appointments in genetics at UCL and Cambridge
proved less satisfying. The estimation
theory Fisher developed from 1920 emphasised maximum likelihood and
was founded on likelihood.
He rejected Bayesian
methods as based on the unacceptable principle of indifference.
In the 1930s Fisher developed a conditional inference approach to estimation
based on the concept of ancillarity.
His most widely read work Statistical
Methods for Research Workers (1925 + later editions) was largely concerned with tests of significance: see Student's t distribution,
chi square, z and z-distribution
and p-value. The
book also publicised the analysis of variance.
The Design of Experiments (1935 + later editions) put that
subject at the heart of statistics (see randomization, replication and blocking). The fiducial argument,
which Fisher produced in 1930, generated much controversy and did not survive
the death of its creator. Fisher created many terms in everyday use, e.g. statistic and sampling distribution
and so there are many references to his work on the Words pages. See Symbols in Statistics
for his contributions to notation. Fisher influenced statisticians mainly
through his writing—see the experience of Bose and Youden.
Among those who worked with him at Rothamsted were Irwin and Wishart
(colleagues) and Hotelling
(‘voluntary worker’); see Speed’s
Hotelling lecture and MGP.
Fisher made several visits to the US where Hotelling and Snedecor
were important contacts. In London
and Cambridge Fisher was not in a Statistics department and Rao
was his only PhD student in Statistics. For more information see A
Guide to R. A. Fisher. See Hald (1998, ch. 28 Fisher’s
Theory of Estimation 1912-1935 and his Immediate Precursors).
Richard von Mises (1883-1953) Applied mathematician.
SC, LP. Mises was educated at the
Technische Hochschule in Vienna.
From 1919 he was director of the Institute
of Applied Mathematics at the University of Berlin. He used probability in his
work in physics, e.g. von
Mises distribution, and wrote on mainstream probability topics,
e.g. central limit theorem,
but he is most famous for his work on the foundations of probability. In 1919
he published his “Grundlagen
der Wahrscheinlichkeitsrechnung,” (p. 52) which expounded his frequentist
interpretation of probability, based on the notion of a collective. The paper contained
other innovations, including the “label space” (sample space) and the distribution function.
At the time the Mathematische
Zeitschrift, which published these early papers, was the most
important German outlet for work in probability. Von Mises published two books on
probability, the widely read Probability, Statistics and Truth (1928)
and a comprehensive textbook (1931). His position on statistical inference
was—surprisingly—Bayesian. In 1933 he left Germany
for Turkey and, in 1938,
moved again to the United
States, where he became Professor of
Aerodynamics and Applied Mathematics at Harvard. Mises influenced many
writers on probability in the 20s and 30s, including Kolmogorov.
Among those who worked to make the collective rigorous in the 1930s were Wald and the logician Alonzo
Church. Geiringer has written a history of probability from a
Misean standpoint; see Probability: Objective Theory. See also
von Plato (ch. 6) Von Mises’ frequentist probabilities.
Harold Jeffreys (1891-1989) Applied mathematician and physicist. MacTutor
SC, LP. Jeffreys has a good claim to be
considered the first Bayesian statistician in that he used only Bayesian methods. Jeffreys
arrived to study mathematics at Cambridge University
a year after Fisher and he spent his life there
working on astronomy and geophysics. Unlike von Mises,
Jeffreys was not primarily interested in probability as a means of modelling
physical processes but in probability in relation to scientific inference.
With Dorothy Wrinch, he produced a series of papers between 1919 and 1923. (See the entries prior probability
and posterior probability.)
They were influenced by the approach to probability taken by the Cambridge philosophers W.
E. Johnson, J.
M. Keynes and C.
D. Broad; all would now be described as Bayesians. From his
earliest work Jeffreys had used least squares (which he had learnt from the theory of errors)
in his empirical work but around 1930 he started to devise new methods and to
reconstruct the old in accordance with his theory of probability. He did
extensive empirical work, the best known being in collaboration with K E
Bullen on earthquake travel times. Jeffreys also studied Fisher’s statistical work and
adopted some of his concepts and terminology, e.g. likelihood. Jeffreys’s big
book Theory of Probability (1939) combined a philosophy of probability
with a reworking of the “modern statistics” of Fisher and Pearson—all
founded on the principles of inverse
probability. In 1946 Jeffreys completed his system by providing a
rule for choosing priors—see Jeffreys
prior. Statisticians (including those who attended his lectures as
students!) showed little interest in Jeffreys’s work until the Bayesian revival of the
1960s. The physicist E. T.
Jaynes (1922-98) was strongly influenced by him. See Symbols in Probability
for Jeffreys’s contribution to notation. See Life &
Work. For more information see
Jeffreys as Statistician.
Norbert Wiener (1894-1964) Mathematician.
SC, LP. Wiener’s working life was spent
at MIT. He was well-travelled, having studied at Harvard, Cambridge University
and Göttingen. He studied mathematical logic at Cambridge
but he was to become closer to the Cambridge
analysts especially Hardy
with whom he collaborated. Among contemporary probabilists his strongest
links were with Lévy. Wiener’s earliest work on probability
treated Brownian motion
(see also Wiener process)
where he used the new Daniell
integral; see here
for a detailed account of his relations with Daniell. In 1930 Wiener
presented a generalized harmonic
analysis, which had a mathematical model of the spectrum, and
developed the periodogram
analysis introduced by the physicist Arthur Schuster at the end of the
C19. (above) Much of Wiener’s work ran parallel to that of
the Russian probabilists Khinchin and Kolmogorov
but the relationship between their work and his did not emerge until later.
Wiener worked with engineers, in particular with Y. W. Lee; see The
Lee-Wiener Legacy. In the Second World War Wiener
developed a theory of prediction to be used in fire control systems. His
wartime report was published as Extrapolation, Interpolation and Smoothing
of Stationary Time Series (1949) (see filter and autocorrelation).
Wiener devised the subject of cybernetics
as an umbrella to cover his various interests. P. R.
Masani’s biography Norbert Wiener
(1990) is reviewed in Mathematical
Reviews. Wiener’s papers are at MIT.
Aleksandr Yakovlevich Khinchin (1894-1959)
Khinchin was a student at Moscow State University
and spent almost all his working life there. Khinchin, like Lévy
and Doob, started in analysis. The university had a very
strong analysis group and Khinchin’s supervisor was Luzin.
There was no tradition of work in probability until, that is, Khinchin and Kolmogorov created one. There do not seem to have been any
personal links with the Chebyshev/Markov
tradition at St. Petersburg.
Khinchin was drawn into probability through an interest in the theory of
numbers. The law of the
iterated logarithm (1924) had an existence in number theory, as
did the strong law of large
numbers. While in the 20s Khinchin worked on sequences of
independent random variables, in the 30s he developed the theory of stochastic processes
and, in particular, that of stationary
processes. In the 1940s he applied his probabilistic techniques
to the theory of statistical mechanics in a book Mathematical Principles
of Statistical Mechanics (1943). In the 50s Shannon's
information theory (1948) was generating much more interest in probability and statistics
circles than had earlier work on communication. Khinchin contributed to the absorption of
these concepts with his Mathematical Foundations of Information Theory.
1930-1940 Against a calamitous economic and political background there
were important developments in probability, statistical theory and applications.
In the Soviet Union mathematicians fared better than economists or geneticists
and in the early years they could travel abroad and publish in foreign
journals; thus Kolmogorov and Khinchin
published in the main German periodical, Mathematische Annalen.
In Germany Jews were barred from academic jobs from 1934.
In probability the main developments were Kolmogorov’s axiomatisation of probability and the
development of a general theory of stochastic
processes by him and Khinchin. This work is
usually seen as marking the beginning of modern probability. See von Plato (ch. 7) “Kolmogorov’s measure
theoretic probabilities.” Most of the activity was in the Soviet Union and France but the United States began to play a
bigger role in the course of the decade.
In the foundations of probability Bruno de
Finetti and Frank Ramsey’s (1903-1930) (Sahlin) work on subjective probability appeared. Ramsey started by
criticising the Cambridge
logical school (see Jeffreys), in particular Keynes. A
statistical superstructure came only later. Jeffreys gave a complete treatment of statistics founded on his
logical notion of probability but otherwise the prevailing approach was classical.
In Britain and the United States statistics was redefined. The Royal Statistical
Society (above) broadened its political arithmetic and
‘state-istics’ agenda and welcomed work on agriculture and industry and on mathematical statistics. There were similar
changes in the American Statistical Association (above). Biometrika
stopped publishing biological research and focussed on theoretical statistics.
The Institute of Mathematical
Statistics was founded in 1935 and its journal The Annals of Mathematical
Statistics had first appeared in 1930. This became a major journal for
both mathematical statistics and probability. The first statistics laboratory
in the US was created at Iowa State
in 1933 with George Snedecor as director; Snedecor was strongly influenced by Fisher. In France Georges Darmois
and Daniel Dugué came under Fisher’s influence. Georg Rasch took
Fisher’s ideas to Denmark.
In statistical inference the main development was the Neyman-Pearson theory of hypothesis testing from
1933 onwards. Multivariate analysis
became an identifiable subject, formed out of such contributions as the Wishart distribution
(1928), principal components (1933), canonical
correlation (1936) and Fisher’s discriminant analysis (1936).
Applications of mathematics
and statistics to economics came together in the econometric
movement. This could look back to the C17 political arithmetic and the C19 work
on index numbers
and on Pareto's law
but econometric modelling, which involved the application of regression methods to
economic data, was a C20 development. Among the leaders in the 1930s were Jan
Tinbergen and Ragnar
Frisch. Econometricians who have followed them as Nobel laureates
in economics include Engle, Granger, Haavelmo, Heckman, Klein, McFadden. Equally important were developments in the
collection of economic information. In the United States the outstanding
economic statistician was Simon
Kuznets. See M. S. Morgan A History of Econometric Ideas, Cambridge 1990.
Jerzy Neyman (1894-1981) Statistician. MacTutor
SC, LP, ESM. Neyman was educated in the tradition of Russian
probability theory and had a strong interest in pure mathematics. His
probability teacher at Kharkov
University was S.
N. Bernstein. Like many, Neyman went into
statistics to get a job, finding one at the National Institute for
Agriculture in Warsaw.
He appeared on the British statistical scene in 1925 when he
went on a fellowship to Pearson’s laboratory. He began to
collaborate with Pearson’s son Egon
Pearson and they developed an approach to hypothesis testing,
which became the standard classical approach. Their first work was on the likelihood ratio test
(1928) but from 1933 they presented a general theory of testing, featuring such
characteristic concepts as size,
power, Type I error, critical region and,
of course, the Neyman-Pearson
lemma. More of a solo project was estimation, in
particular, the theory of confidence
intervals. In Poland Neyman worked on agricultural experiments
and he also contributed to sample survey theory (see stratified sampling
and Neyman allocation).
At first Neyman had good relations with Fisher but their
relations began to deteriorate in 1935; see Neyman in A
Guide to R. A. Fisher. From the late
1930s Neyman emphasised his commitment to the classical approach to
statistical inference. Neyman had moved from Poland
to Egon Pearson’s department at UCL in 1934 but in 1938 he moved to the University
of California, Berkeley. There he built a very strong group which
included such notable figures as David
Blackwell, J. L.
Hodges, Erich Lehmann,
Lucien Le Cam, Henry Scheffé and Elizabeth Scott. Berkeley in those years
is nicely evoked in Lehmann’s Reminiscences
of a Statistician Amazon.
Harald Cramér (1893-1985) Mathematician,
statistician & actuary. MacTutor
Photos SC, LP. Personal recollections in Statistical
Science 1986. Cramér
studied at the University
of Stockholm and spent
his working life there. His career spanned the applied mathematics of
insurance and the pure mathematics of number theory. In Sweden
probability, statistics and actuarial science were more closely related than
elsewhere; see Cramér’s talk Actuaries
and Actuarial Science. Filip
Lundberg was a symbol of
the link between insurance and probability and the Skandinarvisk Aktuarietidskrift was the main statistics journal. From the mid-20s probability became increasingly prominent in
Cramér’s research. In 1929 a chair in “Actuarial Mathematics and Mathematical
Statistics” was created for him. Cramér’s Random Variables and
Probability Distributions (1937) has been called “the first modern book
on probability in English”; he was encouraged to write it by G.
H. Hardy the British number theorist and analyst. Cramér’s early work was on the central limit theorem,
and treated the expansions associated with Edgeworth (Edgeworth series), Gram
and the astronomer Charlier.
Cramér was an important synthesiser of
subjects and of national traditions. His student Herman Wold
brought together the individual processes, studied by Yule
in the English statistical literature, and the theory of stationary stochastic processes,
studied by Khinchin in the Russian mathematical
literature. Cramér’s own Mathematical
Methods of Statistics (1945) brought together English statistical
theory and Continental probability. Amongst the new results it contained was
the Cramér-Rao inequality. Cramér’s main work from 1940 onwards was on stochastic
processes, where he extended the
theories of Kolmogorov and Khinchin. Cramér made an important contribution
to the anglicising of probability language and so his name often
appears on the Words pages. See also Symbols in Probability.
Bruno de Finetti (1906-85) Mathematician, actuary & statistician. MacTutor
LP. De Finetti studied mathematics in Milan. He was very precocious and
very prolific, publishing the first of his 300 works while still a student.
Although de Finetti became the best known of the Italian probabilists, there
was already an Italian presence on the international scene. Castelnuovo’s
textbook, Calcolo della probabilità (1919), was comparable to Markov’s and Cantelli’s
(LP) work on the strong law of large numbers made
him a pioneer of modern probability. The journal which Cantelli founded, Giornale
dell'Istituto Italiano degli Attuari, published important probability
contributions in the 1930s; for Cantelli see Regazzini. Corrado Gini (LP) (SC)
founded the Italian Central Statistical Institute, and also the journal Metron. De Finetti worked for a time at the
Statistical Institute and, as well as being associated with the universities
of Trieste and Rome, he did actuarial work. While de
Finetti made important contributions to probability theory, he is best known
for his subjective theory of probability, based on the Dutch book argument for coherence. In the English-speaking
world de Finetti’s work only became known in the 1950s when Savage drew attention to it. De Finetti’s work has attracted
a lot of attention from philosophers. See for example Richard Jeffrey and his book Subjective Probability.
See von Plato ch. 8: “De Finetti’s subjective probabilities.”
See the website Bruno de Finetti.
William Feller (1906-70) Mathematician. MacTutor
About half of Feller’s papers were in probability, the rest were in
calculus, functional analysis and geometry. After a first degree at the University of Zagreb
Feller went to the University of Göttingen,
where the world’s leading mathematics department was presided over by David
Hilbert, Feller’s ideal mathematician. Feller’s supervisor was Richard
Courant. Feller was awarded his Ph.D. in 1926, aged 20. In 1933 he
left Germany first for Denmark and then for Sweden, joining Cramér at the University of Stockholm.
He moved to the USA in
1939, first to Brown and then to Cornell and Princeton.
Feller’s first contribution to probability was a 1935 paper on the central limit theorem; he
obtained similar results to Lévy. Feller was the main
architect of renewal theory.
In the 50s he worked on a theory of diffusion, which brought together
functional analysis, differential equations and probability. Feller had
numerous PhD students who became influential probabilists; one unofficial
student was Frank
Spitzer. The publication in 1950 of volume 1 of Feller’s Introduction
to Probability Theory and its Applications was a major event. Gian-Carlo
Rota wrote, “Together with Weber’s Algebra and Artin’s Geometric
Algebra this is the finest text book in mathematics in this century.”
Besides giving new results and new forms to old results, the book drew
attention to a vast body of applied probability work that had not been
noticed in the theoretical literature. Feller’s frequent appearance on the Symbols in Probability
and Words pages (e.g. sample
space and experiment)
testify to the influence of the book. See J. L. Doob “William Feller and
Twentieth Century Probability” in AMS History
of Mathematics, Volume 3 and von Plato (ch. 7)
“Kolmogorov’s measure theoretic probabilities.” There is an excellent
website William Feller (1906-1970) by Darko Zubrinic.
J. L. Doob
(1910-2004) Probabilist. MacTutor
Interview. Because Wiener was not part
of the probability community, Doob was the first “modern probabilist” from
the United States
or even from the English-speaking world. When Doob came on the scene the only
American probability textbook was the 1925 book by J.
L. Coolidge, which could almost have been written in 1885.
Harvard, where Doob studied, had a strong group of mathematicians but nobody
worked on probability. Statistics was much livelier and Doob came into
contact with probability and its European literature when he got a job with
the statistician Harold
Hotelling. Some of Doob’s early work (1934-6) was devoted to
making statistical theory more rigorous. His first idea for a topic for his
PhD student Paul
Halmos was that he should make some of Fisher’s
ideas rigorous. Halmos switched to a more realistic topic. Doob’s career was
almost entirely spent at the University
of Illinois. Although
Doob did not begin in probability, almost all of his work was in this field.
His main work was on stochastic
processes; he was responsible for making martingales so
prominent. His book Stochastic Processes (1954) was an important work of
synthesis. His Classical Potential Theory and its Probabilistic
Counterpart (1984) brought together Doob’s probabilistic and
non-probabilistic interests. Doob made an important contribution to
anglicising the language of probability theory and he appears often in this
capacity on the Words pages—see e.g. probability measure
and Markov process. Apart from Halmos, his best-known student was David
Blackwell, who became part of Neyman’s group at Berkeley. For a
retrospective on Stochastic Processes see N. H. Bingham (2005) Doob: a
half-century on, Journal of Applied
Probability, 42, 257–266. D. Burkholder and P. Protter have
some personal reminiscences here. An issue of the JEHPS is devoted to “the splendours
and miseries of martingales.”
1940-1950 Among the millions who died in the Second World War
were mathematicians and statisticians. Doeblin
is only the best known of those killed; one of Neyman’s books
is dedicated to 10 lost colleagues and friends. Yet
this war, unlike the First World War, promoted the study of statistics and
probability. At the end of the war there were many more people working in
statistics, there were new applications and the importance of the subject to
society was more widely recognised.
The Nazi persecutions and
the Second World War drove many statisticians and mathematicians to the USA. There was
already a pattern of migrants seeking better opportunities. Many important
figures in post-war US probability and statistics, including Feller,
Wald, G. E. P. Box,
W. G. Cochran (ASA),
H. O. Hartley (MGP),
F. J. Anscombe (Obit.
p. 17) (MGP) and
Z. W. Birnbaum (MGP)
were from Europe. From around 1950 Indian
statisticians, following the example of R. C. Bose, began migrating to the US. See Rao.
The war brought many people
into statistics and probability. Savage and Tukey
are examples from the US while in Britain the recruits included Barnard
and D. G. Kendall (MGP).
The recruits were often better trained in abstract mathematics than earlier
statisticians. This contributed to closing the gap between the English
statistical and the Continental probability traditions.
The war generated research
problems out of which came Wiener’s
work on prediction and Wald’s on sequential analysis and
the new subject of operations
research. Governments’ need for information led to great expansion
in the production of official statistics. In Britain the leading figure in
economic statistics was Richard
Stone ET interview.
Between 1943 and 1946 three
advanced treatises on statistics appeared, by Cramér, M.
G. Kendall and Wilks.
These works did much to consolidate the subject and thereby professionalise it.
Nonparametric tests began to be systematically studied, using tools from the theory of statistical
inference; E. J. G.
Pitman was an important pioneer. The tests often came originally
from non-statisticians, like Spearman (rank correlation) or Wilcoxon.
The existing repertoire of sign
test and Kolmogorov-Smirnov
test was soon expanded.
Modern time series analysis
came from the union of the theory of stochastic processes (see Khinchin
and Cramér), the
theory of prediction (Wiener and Kolmogorov)
and the theory of statistical inference (Fisher and Neyman) with harmonic analysis and correlation among the
grandparents. (above) One of the
main pioneers of the 40s was M. S. Bartlett.
In the 50s Tukey was a leading contributor, in the 60s Kalman (Kalman filter) and
systems engineers made important contributions and in the 70s the methods of G. E. P. Box Interview
and G. M. Jenkins
were adopted in economics and business.
Abraham Wald (1902-1950)
LP. Wald studied at the University of Vienna with Karl
Menger writing a thesis and several articles on geometry. To make
a living Wald worked in economics, publishing on both mathematical economics
and on economic statistics. His first work in probability was on the
collective of von Mises.
When Wald moved to the USA
in 1938 he went to Columbia
University where Harold
Hotelling was the senior statistician.
Wald started working on statistics in 1939 and soon developed his first ideas
on decision theory,
which he saw as an extension of Neyman-Pearson testing
theory. During the war Wald worked on statistical problems for the
military—one result was the development of sequential analysis.
The Statistical Research Group at Columbia
University was one of
the most important wartime groups. After the war Wald developed his decision
theory ideas further in the book Statistical Decision Functions
(1950). In his short career Wald made contributions to most branches of statistical theory; among them the eponymous Wald test. He played
an important part in the statistical developments in econometrics, which led
to a Nobel prize for Trygve Haavelmo.
Wald’s co-authors include Jacob
Wolfowitz and H. B. Mann.
Among his PhD students were Herman Chernoff, Charles Stein and Milton Sobel;
Jack Kiefer had started his PhD when Wald was killed in an air crash.
C. R. Rao (b. 1920) Statistician. MacTutor References
photo. Rao is the most distinguished member of the Indian statistical
school founded by P.
C. Mahalanobis and centred on the Indian Statistical Institute
and the journal Sankhya.
Rao’s first statistics teacher at the University of Calcutta
was R. C. Bose.
In 1941 Rao went to the ISI on a one-year training programme, beginning an
association that would last for over 30 years. (Other ISI notables were S. N. Roy in the
generation before Rao and D. Basu one
of Rao’s students.) Mahalanobis was a friend of Fisher and much of the early research at ISI was
closely related to Fisher’s work. Rao was sent to Cambridge to work as PhD student with
Fisher, although his main task seems to have been to look after Fisher’s
laboratory animals! In a remarkable paper written before he went to Cambridge, “Information and the accuracy attainable in
the estimation of statistical parameters,” Bull. Calcutta Math. Soc. (1945) 37,
81-91 Rao published the results now known as the Cramér-Rao inequality
and the Rao-Blackwell
theorem. A very influential contribution from his Cambridge period was the
score test (or Lagrange multiplier test),
which he proposed in 1948. Rao was influenced by Fisher but he was perhaps
influenced as much by others, including Neyman. Rao has
been a prolific contributor to many branches of statistics as well as to the
branches of mathematics associated with statistics. He has written 14 books
and around 350 papers. Rao has been a very international statistician. He
worked with the Soviet mathematicians A. M. Kagan and Yu.
V. Linnik (LP) and since 1979 he has worked in the United States, first at the University of Pittsburgh
and then at Pennsylvania State University.
He was elected to the UK Royal
Society in 1967; in 2001 he received the Indian Padma Vibhushan
and he received the US National Medal of Science in 2002. See ET Interview: C.
R. Rao, ISI
interview and profile.
For a general account of Statistics in India, see B. L. S. Prakasa Rao’s
Statistics as a Discipline in India.
1950-1980 Expansion continued—more
fields, more people, more departments, more books, more journals! Computers began to have an impact—see below for more.
Existing departments were expanded:
e.g. in 1949 a second chair of statistics was created at LSE filled by M.
G. Kendall. Bowley (above) only ever had a staff
of 2, E. C. Rhodes
and R. G. D. Allen. New institutions were created, e.g. the
Statistical Laboratory at Cambridge in 1947 and the Statistics Department
at Harvard in 1958.
The scope of probability theory increased with the emergence of new
sub-fields such as queueing
theory and renewal
theory. Feller’s Introduction to
Probability Theory made a very strong impact in the English-speaking
world; it promoted the study of the subject and made advanced topics, like Markov chains,
accessible.
In 1950 the philosopher Rudolf Carnap published a major work, Logical Foundations of
Probability which advanced a dual interpretation of probability, as degree
of confirmation, which looked back to Cambridge (see Jeffreys),
and as relative frequency, which looked back to von Mises.
Probability was an important topic for other philosophers of science, including
Popper. More recently philosophers have been attracted to the
monism of de Finetti’s subjectivism. Alan Hajek’s Interpretations
of Probability in the Stanford Encyclopedia of
Philosophy reviews the modern scene.
From the 1950s there was a Bayesian revival. In Britain I.
J. Good—Probability and the
Weighing of Evidence (1950)—was influenced (positively and negatively) by
logicians (see Jeffreys). The American version, Bayesian decision
theory, reflected more the influence of Wald’s
classical decision theory.
The most important early contributions were Savage’s Foundations
of Statistics (1954) and Howard Raiffa and Robert Schlaifer’s
Applied Statistical Decision Theory (1961).
W. Edwards Deming (ASQ)
(LP) continued Shewhart’s work on quality control and was
very effective in getting industry to adopt these methods.
There was a great expansion
in medical statistics and epidemiology. Austin Bradford
Hill was an important contributor to both fields: he pioneered randomised clinical trials and, in work
with Richard Doll,
demonstrated the connection between cigarette smoking and lung cancer.
Laplace and Quetelet saw the work of the census as a possible
application of probability but the use of statistical theory by official data
gatherers only became institutionalised through the activities of Morris Hansen
at the US Census Bureau.
Since the 50s finance
has been an important area for applied decision theory: the 1990 Nobel Prize in
economics was awarded to Markowitz
for work that was influenced by Savage although the idea of
expected utility goes back to Daniel Bernoulli. Since the 70s
finance has been an important area for applied stochastic processes. Ito had developed his stochastic calculus in
the 40s but it was applied in an unexpected way in the Black-Scholes model for
pricing derivatives. Merton and Scholes
received the 1997 Nobel Prize in economics for their contribution (see Black-Scholes formula).
The intellectual ancestor of stochastic finance was Bachelier (above).
In 1973 the Annals of
Mathematical Statistics (see above) split into the
Annals of Probability and the Annals of Statistics. This represented
increasing specialisation—there weren’t many new Cramérs—as well as the need to expand journal pages.
Jimmie Savage (1917-71) Statistician. MacTutor
LP. After training
as a pure mathematician and obtaining a PhD from the University
of Michigan, Savage worked briefly with John von
Neumann in Princeton. Savage
became a statistician in the war, joining the Statistical Research Group at Columbia University, which included Wald. Savage’s earliest
publications were in the style of Wald’s
decision theory. The change came with the Foundations
of Statistics (1954). This work, written while Savage was at the University of Chicago, continued the decision theme
but provided the basis for a Bayesian
approach to probability in the spirit of Ramsey and de Finetti.
Savage also drew on the utility
theory von Neumann and Morgenstern developed for use in the theory of games. The
maxim of maximising expected utility went back to Daniel
Bernoulli. However the book’s main take on statistics was still classical and Savage criticised
the foundational work of Jeffreys
who had provided the only set of Bayesian methods that reflected C20
statistics. Schlaifer was quicker to develop practical Bayesian methods.
However, Savage became a strong advocate of Bayesian methods in
statistical research and a champion of such non-classical principles as the likelihood principle.
This principle was treated axiomatically by Allan Birnbaum
but it had been discussed earlier by Fisher and Barnard.
Jimmie’s younger brother, I. Richard Savage (1926-2004), was also a
distinguished statistician. (Obit. p. 8)
Instead of describing
people from the very recent past, I describe the effect the computer has had on statistics from its advent around 1950, and the changes in the writing of the history
of probability and statistics in recent decades.
The effect of the computer. The changes following the introduction of the computer have been much
more radical than those following the increased use of mechanical calculating
machines at the end of the C19. Such machines provided the material
basis for Pearson and Fisher’s research
and for the construction of their statistical
tables in the period 1900-50. The machines were not in general use
and Fisher assumed that most of the users of the tables and the
“research workers” who read his book would use logarithm tables or a slide
rule. For general background see A Brief
History of Computing.
With the availability of
computers old activities took less time and new activities became possible.
Statistical tables and
tables of random numbers first became much easier to produce and then they
disappeared as their function was subsumed into statistical packages.
Much bigger data sets could
be assembled and analysed.
Exhaustive data-mining became possible.
Much more complex models
and methods could be used. Methods have been designed with computer
implementation in mind—a good example is the family of Generalized
linear models linked to the program GLIM; see John Nelder FRS.
In the early C20 when Student (1908) wrote about the normal mean and Yule
(1926) about nonsense correlations they used sampling experiments and in the
1920s it became worthwhile to produce tables of random numbers. With the
introduction of computer-based methods for generating pseudo random numbers
much more ambitious Monte-Carlo
investigations (introduced by von
Neumann and Ulam)
became possible. The Monte-Carlo experiment became a standard way of
investigating the finite sample behaviour of statistical procedures.
Since around 1980 Monte Carlo methods have been used directly in data
analysis. In classical statistical inference the bootstrap has been very
prominent; Statistical Science’s silver
anniversary issue (Euclid)
includes an interview with Efron, the creator. In Bayesian analysis Markov Chain Monte-Carlo methods
have been used extensively; previously conjugate priors and noninformative priors
had been used because of computational limitations.
Writing history. In recent
decades there has been a flood of works—books and articles—on the history of
probability and statistics from statisticians, philosophers and historians. I
will mention a few titles in each category to indicate the range of activity.
50 years ago the standard
general works were Todhunter for the history of probability, Walker
for statistics with an emphasis on psychology and education and Westergaard
for statistics with an emphasis on economic and vital statistics.
Walker (1929) Studies in the History of Statistical Method, Baltimore: Williams & Wilkins.
Westergaard (1932) Contributions to the History of Statistics, London: King.
E. S. Pearson got history moving in Britain. In 1955 Biometrika published the first of its Studies in the History of Statistics and Probability; as editor, ESP wrote “It is hoped to
publish articles by a number of different authors under this general heading.”
So far, around 50 articles have appeared. ESP wrote about the people he
knew—his father, Fisher and Student—but
the other pioneers, F. N. David and M. G. Kendall, chose more remote topics. Two collections have appeared, Studies in the History of Statistics and Probability (1970) and Studies
in the History of Statistics and Probability, Volume II, (1977). See here
for a list of the articles. In addition, David wrote a book on the early
history of statistics and Pearson edited and published his father, Karl’s,
lectures; both works are very different from Todhunter.
F. N. David (1962) Games, Gods and Gambling:
the Origins and History of Probability and Statistical Ideas from the Earliest Times
to the Newtonian Era. Griffin, London.
E. S. Pearson
(ed) (1978) The History of Statistics in the 17th and 18th Centuries against
the Changing Background of Intellectual, Scientific and Religious Thought:
Lectures by Karl Pearson given at University College, 1921-1933. London: Griffin.
Oscar Sheynin, Anders
Hald and Stephen Stigler
(see above) have been the leading contributors to the
technical literature. Sheynin has published many articles, mainly in the
Archive for History of Exact Sciences. There is a list of Hald’s
history writings here.
Some of Stigler’s articles are reprinted in
Stigler (1999) Statistics on the Table: The History of Statistical Concepts
and Methods, Cambridge, MA:
Harvard University Press.
Much neglected work has been rediscovered. Bienaymé’s
(LP SC) name is now linked to branching processes and Chebyshev's inequality
and his total contribution is recognised. There
are cases of individuals, famous for other work, e.g. Abbe
(LP) or Einstein (Brillinger
Time Series) and of individuals who may be known for one
topic but who produced a large body of work, e.g. Thiele
(LP SC) who had been identified with semi-invariants. Hald’s book
is a major addition to the literature of the
overlooked, for as well as revealing a largely forgotten Laplace,
it describes many continental European developments that were overlooked in the
Anglo-centric statistical literature of the C20. (see above)
C. C. Heyde
& E. Seneta (1977) I. J. Bienaymé: Statistical Theory
Anticipated, New York: Springer.
S. Lauritzen (2002) Thiele: Pioneer in Statistics, Oxford:
Oxford University Press.
A. Hald (1998)
A History of Mathematical Statistics from
1750 to 1930, New York: Wiley.
Among the philosophers who have written on probability are Hacking and von Plato; as well as the books referred to above, there is Hacking
(1990) The Taming of Chance, Cambridge: Cambridge University Press.
In recent decades the history and sociology of science have flourished. T. S.
Kuhn’s Structure of Scientific Revolutions (1962) had a
strong influence on both fields. Among the works on probability and statistics
by historians and sociologists are
T. M. Porter (1986) The Rise of Statistical Thinking 1820-1900, Princeton: Princeton University Press. (contents)
L. J. Daston (1988) Classical Probability in the Enlightenment. Princeton: Princeton University Press. Amazon
The Probabilistic Revolution, volume 1 edited by L. Krüger, L. J. Daston and M. Heidelberger,
volume 2 edited by L. Krüger, G. Gigerenzer and M. S. Morgan Cambridge, Mass.:
MIT Press (1987) contents
D. A. MacKenzie (1981) Statistics in Britain
1865-1930: The Social Construction of Scientific Knowledge. Edinburgh: Edinburgh University Press.
There is a review essay of Daston
and Krüger et al. by MacKenzie in Isis, 80,
(1989), pp. 116-124.
Much effort has gone into making important texts available:
S. M. Stigler and I. M. Cohen American
Contributions to Mathematical Statistics in the Nineteenth Century, contents
(includes work by Adrain,
H. A. David & A. W. F.
Edwards (eds.) (2001) Annotated Readings in
the History of Statistics, New
York: Springer. Amazon
(Its Appendix A lists English translations of works of interest to historians.)
More recent developments have not attracted the attention of historians yet. Some classic modern
contributions are reprinted (with commentary) in S. Kotz & N. L. Johnson
(Editors) (1993/7) Breakthroughs in
Statistics: Volumes I-III, New York: Springer. Amazon.
Statistical Science has been publishing interviews for the past 20 years and
these are a form of living history—there must be 100 by now; the post-1995
issues are available through Euclid.
also publishes interviews and articles on history; among the statisticians
interviewed are T. W. Anderson
Probability and statistics now appear as topics in textbooks on the history of mathematics
and the history of disciplines that use probability and statistics. See for example
(1993) A History of Mathematics, New
(2001) Statistics in Psychology: An Historical Perspective, London: Erlbaum
Articles on the history of probability and statistics appear in several journals,
including the Archive for History of Exact Sciences, Biometrika, the British Journal for the
History of Science, Historia Mathematica, the International Statistical Review, Isis, the Journal
of the History of the Behavioral Sciences and Statistical Science.
In 2005 a specialist online journal was
founded, the Electronic Journal for History of Probability and Statistics/Journal Electronique
d'Histoire des Probabilités et de la Statistique.
The JEHPS lists Publications in the History of Probability and
Statistics. The current lists cover 2005-9.
Stigler’s History of Statistics (see above) has suggestions for further reading and an extensive bibliography.
Hald’s History of Mathematical Statistics from 1750 to 1930 also has a
very valuable bibliography.
There are several online bibliographies. Two of the bibliographies were compiled in
the mid-90s but are still useful: a brief one by Joyce, restricted to secondary
sources, and an extensive one by Lee.
Oscar Sheynin References
Giovanni Favero Storia della Probabilità e della Statistica
Joyce History of Probability and Statistics
Peter M Lee The History of Statistics: A Select Bibliography
Through the web many of
the important original texts are now easily accessible. The following open
access sites are very useful.
Digital Mathematics Library retrodigitized Mathematics Journals and Monographs
Life and Work of Statisticians
SAO/NASA ADS Astronomy
Gallica particularly good for French literature