In Isaac Asimov’s novel Foundation (1951), the mathematician Hari Seldon forecasts the collapse of the Galactic Empire using psychohistory: a calculus of the patterns that occur in the reaction of the mass of humanity to social and economic events. Initially put on trial for treason, on the grounds that his prediction encourages said collapse, Seldon is permitted to set up a research group on a secluded planet. There, he investigates how to minimise the destruction and reduce the subsequent period of anarchy from 30,000 years to a mere 1,000.
Asimov knew that predicting large-scale political events over periods of millennia is not really plausible. But we all suspend disbelief when reading fiction. No Jane Austen fan gets upset to be told that Elizabeth Bennet and Mr Darcy didn’t actually exist. Asimov was smart enough to know that such forecasting, however accurate it might be, is vulnerable to any large disturbance that hasn’t been anticipated, and perhaps couldn’t have been, even in principle. He also understood that readers who happily swallowed psychohistory would realise the same thing. In the second volume of the series, just such a ‘black swan’ event derails Seldon’s plans. However, Seldon has a contingency plan, one that, as the series later reveals, also brings some surprises.
Asimov’s Foundation series is notable for concentrating on the political machinations of the key groups, instead of churning out page upon page of space battles between vast fleets armed to the teeth. The protagonists receive regular reports of such battles, but the description is far from a Hollywood treatment. The plot, as Asimov himself stated, is modelled on Edward Gibbon’s book The History of the Decline and Fall of the Roman Empire (1776-89), and it is a masterclass in planning for uncertainty on an epic scale. Every senior minister and civil servant should be obliged to read it.
Psychohistory, a fictional method for predicting humanity’s future, takes a hypothetical mathematical technique to extremes, for dramatic effect. But, for less ambitious tasks, we use the basic idea every day; for example, when a supermarket manager estimates how many bags of flour to put on the shelves, or an architect assesses the likely size of a meeting room when designing a building. The character of Seldon was to some extent inspired by Adolphe Quételet, one of the first to apply mathematics to human behaviour. Quételet was born in 1796 in Ghent in the Low Countries, now Belgium. Today’s obsessions with the promises and dangers of ‘big data’ and artificial intelligence are direct descendants of Quételet’s brainchild. He didn’t call it psychohistory, of course. He called it social physics.
The basic tools and techniques of statistics were born in the physical sciences, especially astronomy. They originated in a systematic method to extract information from observations subject to unavoidable errors. As the understanding of probability theory grew, a few pioneers extended the method beyond its original boundaries. Statistics became indispensable in biology, medicine, government, the humanities, even sometimes the arts. So it’s fitting that the person who lit the fuse was a pure mathematician turned astronomer, one who succumbed to the siren song of the social sciences.
Quételet bequeathed to posterity the realisation that, despite all the vagaries of free will and circumstance, the behaviour of humanity in bulk is far more predictable than we like to imagine. Not perfectly, by any means, but, as they say, ‘good enough for government work’. He also left us two specific ideas: l’homme moyen, the ‘average man’, and the ubiquity of the normal probability distribution, better known as the bell curve. Both are useful tools that opened up new ways of thinking, and both have serious flaws if taken too literally or applied too widely.
Quételet gained the first doctorate awarded by the newly founded University of Ghent. His thesis was on conic sections, a topic that also fascinated Ancient Greek geometers, who constructed important curves – ellipse, parabola, hyperbola – by slicing a cone with a plane. For a time, he taught mathematics, until his election to the Royal Academy of Brussels in 1820 propelled him into a 50-year career in the scholarly stratosphere as the central figure of Belgian science.
Around that time, Quételet joined a movement to found a new observatory. He didn’t know much astronomy, but he was a born entrepreneur and he knew his way around the labyrinths of government. His first step was to secure a promise of government funding. Then he took measures to remedy his ignorance of the subject that the observatory was to study. In 1823, at government expense, he headed for Paris to study with leading astronomers, meteorologists and mathematicians. He learned astronomy and meteorology from François Arago and Alexis Bouvard, and probability theory from Joseph Fourier.
At that time, astronomers were pioneering the use of probability theory to improve measurements of planetary orbits despite inevitable observational errors. Learning these techniques from the experts sparked a lifelong obsession with the application of probability to statistical data. By 1826, Quételet was a regional correspondent for the statistical bureau of the Kingdom of the Low Countries.
One basic number has a strong effect on everything that happens, and will happen, in a country: its population. If you don’t know how many people you’ve got, it’s difficult to plan. You can guesstimate, but you might well end up wasting a lot of money on unnecessary infrastructure, or underestimating demand and causing a crisis. This is a problem that every nation still grapples with.
The natural way to find out how many people live in your country is to count them. Making a census isn’t as easy as it might seem, however. People move around, and they hide themselves away to avoid being convicted of crimes or to avoid paying tax. In 1829, the Belgian government was planning a new census and Quételet, who had been working on historical population figures, joined the project. ‘The data that we have at present can only be considered provisional, and are in need of correction,’ he wrote. A full census is expensive, so it makes sense to estimate population changes between censuses. However, you can’t get away with estimates for long, and a census every 10 years is common. Quételet urged the government to carry out a new census, to get an accurate baseline for future estimates. However, he’d come back from Paris with an interesting idea, one he’d got from the great French mathematician Pierre-Simon de Laplace. If it worked, it would save a lot of money.
Laplace had calculated the population of France by multiplying together two numbers. The first was the number of births in the past year. It could be found from the registers of births, which were pretty accurate. The other number was the ratio of the total population to the annual number of births – the reciprocal of the birth rate. Multiplying the number of births by this ratio gives the total population. The catch is that, to find the ratio, it looks as though you already need to know the total population. Laplace’s idea was to sample: you could get a reasonable estimate of the ratio using sound sampling methods. Select a few reasonably typical areas, perform a full census in those, and compare the result with the number of births in those areas. Laplace calculated that about 30 areas would be adequate to estimate the population of the whole of France.
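The arithmetic behind Laplace’s method can be sketched in a few lines of Python. This is a minimal illustration, not a reconstruction of his actual calculation: the function name `estimate_population` and every figure below are invented for the example.

```python
def estimate_population(national_births, sample_areas):
    """Ratio estimate in the spirit of Laplace's method.

    national_births: births registered in the whole country in one year.
    sample_areas: (counted_population, registered_births) pairs from the
        few areas where a full local census was carried out.
    """
    sample_population = sum(population for population, _ in sample_areas)
    sample_births = sum(births for _, births in sample_areas)
    # People per annual birth, i.e. the reciprocal of the birth rate.
    ratio = sample_population / sample_births
    return national_births * ratio


# Hypothetical figures, chosen only to keep the arithmetic easy to follow.
sample_areas = [(28_000, 1_000), (56_500, 2_000), (27_500, 1_000)]
estimate = estimate_population(1_000_000, sample_areas)
print(round(estimate))  # 28000000
```

Here the three sampled areas hold 112,000 people and 4,000 births, giving 28 people per birth. Only the national birth register and the small local counts are needed; nobody has to count the whole country.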
The Belgian government, however, eschewed sampling and carried out a full census. Quételet seems to have been talked out of sampling by an intelligent, informed but misguided methodological criticism from Baron de Keverberg, an advisor to the state. Observing that birth rates in different regions depend on a bewildering variety of factors, the Baron concluded that it would be impossible to create a representative sample. Errors would accumulate, making the results useless. But he made two mistakes. One was to seek a representative sample, rather than settling for a random one. The other was to assume the worst case (sampling errors accumulate) rather than the typical case (errors mostly cancel each other out through random variation). Notably, Laplace had also assumed that the best way to sample a population was to select, in advance, regions considered to be in some sense representative of the whole, with a similar mix of rich and poor, educated and uneducated, male and female. Today, opinion polls are often still designed this way, in an effort to get good results from small samples. However, statisticians eventually discovered that a big-enough random sample is just as effective as one specially selected to be representative, and much simpler to obtain. But all this was in the future, and Belgium duly tried to count every single person.
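The gap between the Baron’s worst case and the typical case is easy to see in a small simulation. The sketch below rests on a toy assumption that is not in the historical record: each region’s error is an independent random number between -1 and 1. The function name `typical_total_error` is likewise made up for the illustration.

```python
import random


def typical_total_error(n_regions, trials=1000, seed=0):
    """Root-mean-square size of the summed error over many trials.

    Toy model (an assumption): each region contributes an independent
    error drawn uniformly from [-1, 1].
    """
    rng = random.Random(seed)
    squared_totals = []
    for _ in range(trials):
        total = sum(rng.uniform(-1.0, 1.0) for _ in range(n_regions))
        squared_totals.append(total * total)
    return (sum(squared_totals) / trials) ** 0.5


for n in (10, 100, 1000):
    worst_case = n  # every region errs by the maximum, in the same direction
    print(n, worst_case, round(typical_total_error(n), 1))
```

Under this model the worst-case total grows in proportion to the number of regions, whereas theory says the typical total grows only like the square root of that number (about the square root of n/3 here). The more regions you combine, the more the errors cancel relative to the total.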
Baron de Keverberg’s criticism of Quételet’s plans for the 1829 Belgian census had one useful effect: it encouraged Quételet to collect vast amounts of data and analyse them to death. Quételet soon branched out from counting people to measuring people. For eight years, he collected data on birth rates, death rates, marriage, date of conception, height, weight, strength, growth rate, drunkenness, insanity, suicide, crime. He enquired into their variation with age, sex, profession, location, time of year, being in prison, being in hospital. He always compared just two factors at a time, which let him draw graphs to illustrate relationships. He published his conclusions in Sur l’homme et le développement de ses facultés: ou, Essai de physique sociale (1835), which appeared in English as Treatise on Man and the Development of His Faculties (1842).
Whenever Quételet referred to the book, he used the subtitle ‘physique sociale’. In 1869, for a new edition, he swapped his former title and subtitle, making Social Physics the main title. He knew what he had created: a mathematical analysis of what it is to be human. Or, at least, those features of humanity that can be quantified.
The concept in the book that caught the public imagination, and never let go, is that of the average man. Inasmuch as his concept made any sense at all, Quételet was well aware that it was also necessary to consider the average woman, average child and, for various populations, different instances of all of these. He noticed early on that his data for attributes such as height or weight (duly restricted to a single gender and age group) tended to cluster around a single value. If we draw the data as a bar chart, the tallest bar is in the middle, while the others slope away from it on either side. It presents the characteristic shape of a bell curve, already known to mathematicians, as Quételet acknowledged. The whole shape is roughly symmetrical, so the central peak – representing the commonest value – is also the average value. Many types of data show this pattern, and it was Quételet who realised its importance in social science.
Tables and graphs are all very well, but Quételet wanted a snappy summary, one that conveyed the main point in a vivid, memorable manner. So instead of saying ‘the average value of the bell curve for the heights of some class of human males over 20 years of age is 1.74 metres’, he preferred: ‘the average man (in that class) is 1.74 metres tall’. He could then compare average men across different populations. How does the average Belgian infantryman stack up to the average French farmer? Is ‘he’ shorter, taller, lighter, heavier, or much the same? How does ‘he’ compare with the average German military officer? How does the average man in Brussels compare with his counterpart in London? What about the average woman? Average child? Which country’s average man is more likely to be a murderer or a victim? Or be a doctor, devoted to saving lives, rather than a suicide, intent on ending his own? A different average man (or woman or child) is needed for each attribute. As Stephen Stigler put it in The History of Statistics (1986), Quételet considered that ‘the average man was a device for smoothing away the random variations of society and revealing the regularities that were to be the laws of his “social physics”.’
After 1880, the social sciences began to make extensive use of statistics, especially the bell curve. Francis Galton was a pioneer of data analysis in weather forecasting, and discovered the existence of anticyclones. Galton produced the first weather map, published in The Times in 1875, and he was fascinated by real-world numerical data and the mathematical patterns hidden within them. When Charles Darwin published On the Origin of Species (1859), Galton began a study of human heredity. How does the height of a child relate to that of the parents? What about weight, or intellectual ability? Galton adopted Quételet’s bell curve, using it to separate distinct populations. If data showed two peaks, rather than the single peak of the bell curve, Galton argued that the population concerned must be composed of two distinct sub-populations, each following its own bell curve.
Galton grew convinced that desirable human traits are hereditary, a deduction from evolutionary theory but one that Darwin repudiated. For Galton, Quételet’s average man was a social imperative, and one to be avoided. His book Hereditary Genius (1869) invoked statistics to study the inheritance of genius and greatness, with what today appears a curious mixture of egalitarian aims (‘every lad [should have] a chance of showing his abilities, and, if highly gifted, enabled to achieve a first-class education and entrance into professional life’) and the encouragement of ‘the pride of race’. In his Inquiries into Human Faculty and its Development (1883), Galton coined the term ‘eugenics’, advocating financial rewards to encourage marriage between families of high rank or intellect. He wanted to breed people with allegedly superior abilities. Eugenics had its day in the 1920s and ’30s, but rapidly fell from grace because of widespread abuses, such as the forced sterilisation of mental patients and the Nazi delusion of a master race. Today, eugenics is considered racist. It contravenes the United Nations Convention on the Prevention and Punishment of the Crime of Genocide and the European Union’s Charter of Fundamental Rights. However, the idea has never completely gone away.
Whatever we might think of Galton’s character, his contribution to statistics is undeniable. By 1877, he had invented regression analysis, which calculates the most probable relationship between different quantities. This led to another central concept in statistics: correlation, which assesses the degree of relationship between sets of data – for example, level of smoking and incidence of lung cancer. Galton discussed examples, such as the relation between forearm length and height, in 1888. The English mathematician and biostatistician Karl Pearson then turned the idea into a mathematical formula, the correlation coefficient. As often pointed out, correlation is not causality – but it’s often a useful indicator of potential causality.
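For readers who want to see what Pearson’s formula does, here is a minimal sketch of the correlation coefficient in its standard modern form: the sum of products of deviations from the two means, divided by the square root of the product of the two sums of squared deviations. The function name `pearson_r` and the forearm and height measurements are invented for illustration; they are not Galton’s data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length data sets."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two data sets of equal length, at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How the two quantities vary together...
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # ...scaled by how much each varies on its own.
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    # Note: this divides by zero if either data set is constant.
    return covariation / sqrt(spread_x * spread_y)


# Hypothetical measurements in centimetres.
forearm = [24, 25, 26, 27, 28]
height = [165, 170, 168, 176, 180]
r = pearson_r(forearm, height)
print(round(r, 2))
```

The result always lies between -1 and 1. Values near 1 mean that the two quantities rise together almost in a straight line, values near -1 that one falls as the other rises, and values near 0 that there is little linear relationship.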
In 1824, the Harrisburg Pennsylvanian conducted a straw poll to get some idea whether Andrew Jackson or John Quincy Adams would be elected president of the United States. The poll showed 335 votes for Jackson and 169 for Adams. Jackson went on to win the most popular and electoral votes, but he fell short of an electoral majority, and the House of Representatives chose Adams as president. Ever since, elections have attracted opinion pollsters. For practical reasons, polls sample only a small section of voters. The obvious mathematical question is: how big should the sample be to give accurate results? The same question is important in census-taking, medical trials of a new drug and many other endeavours.
Until recently, pollsters mostly used random samples. The Law of Large Numbers, discovered by Jacob Bernoulli around 1684 and published in his epic Ars Conjectandi (1713), or ‘The Art of Conjecture’, tells us that, if the sample is large enough, the average value of that sample is ‘almost surely’ as close as we wish to the true average. But this doesn’t tell us how big ‘large enough’ should be. A more sophisticated result, the Central Limit Theorem, uses a bell curve to relate the sample mean to the actual mean, and to calculate the smallest sample size that should work.
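As a rough illustration of how the Central Limit Theorem turns into a sample size, consider a poll that estimates the share of voters backing one candidate. The sketch below uses the common textbook approximation n = z² p(1 − p) / E², where E is the acceptable margin of error, z is the bell-curve multiplier for the chosen confidence level, and p is the assumed true share. The function name `sample_size` and the default values are choices made for this example, and the formula assumes a simple random sample from a large population.

```python
from math import ceil
from statistics import NormalDist


def sample_size(margin, confidence=0.95, share=0.5):
    """Smallest simple random sample giving the requested margin of error.

    margin: acceptable error in the estimated share, e.g. 0.03 for 3 points.
    confidence: probability that the estimate lands within the margin.
    share: assumed true share; 0.5 is the most demanding case.
    """
    # Bell-curve multiplier: about 1.96 for 95 per cent confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return ceil(z * z * share * (1 - share) / (margin * margin))


print(sample_size(0.03))  # 1068
print(sample_size(0.01))  # 9604
```

This is why so many polls question roughly 1,000 people. The required size depends on the precision wanted, not on the size of the electorate, and cutting the margin of error to a third needs about nine times as many respondents.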
Social media have changed the way that many polls are carried out. Well-designed internet polls have reverted to Laplace’s approach. They use a representative panel of carefully selected individuals. But many polls just let anyone who wants to vote do so – neither random nor representative. These polls are poorly designed, because people with strong views are more likely to vote, many people don’t even know about the poll, and some might not have an internet connection. Telephone polls are also likely to be biased because many people don’t answer cold-callers, or refuse to respond to a pollster when asked for an opinion. In this scam-ridden era, they might not even be sure that the call is a genuine poll. Some people don’t have a phone. Some don’t tell the pollster their true intentions – for example, they might not be willing to tell a stranger that they plan to vote for an extremist party. Even the way a question is worded can affect how people respond.
Polling organisations use a variety of methods to try to minimise these sources of error. Many of these methods are mathematical, but psychological and other factors also come into consideration. Most of us know of stories where polls have confidently indicated the wrong result, and it seems to be happening more often. Special factors are sometimes invoked to ‘explain’ why, such as a sudden late swing in opinion, or people deliberately lying to make the opposition think it’s going to win and become complacent. Nevertheless, when performed competently, polling has a fairly good track record overall. It provides a useful tool for reducing uncertainty. Exit polls, where people are asked whom they voted for soon after they cast their vote, are often very accurate, giving the correct result long before the official vote count reveals it. And, because the votes have already been cast, they can’t influence the result.
Today, the term ‘social physics’ has acquired a less metaphorical meaning. Rapid progress in information technology has led to the ‘big data’ revolution, in which gigantic quantities of information can be obtained and processed. Patterns of human behaviour can be extracted from records of credit-card purchases, telephone calls and emails. Words suddenly becoming more common on social media, such as ‘demagogue’ during the 2016 US presidential election, can be clues to hot political issues.
The mathematical challenge is to find effective ways to extract meaningful patterns from masses of unstructured information, and many new methods are being brought to bear – including some that originated in physics itself. For example, theories of how gas molecules bounce off each other have been adapted to predict how crowds of people move in large buildings or complexes such as an Olympic park. The social and political challenges are to ensure that such methods are not abused. With the growing introduction of powerful new methods, social physics has come a long way since Quételet first wondered how to find out how many people lived in Belgium, without actually counting them.
This is an adapted excerpt from the book ‘Do Dice Play God? The Mathematics of Uncertainty’ (2019) by Ian Stewart, published by Basic Books in September 2019.