Why things happen

Either cause and effect are the very glue of the cosmos, or they are a naive illusion due to insufficient math. But which?

by Mathias Frisch

Photo by Gallery Stock

Do early childhood vaccinations cause autism, as the American model Jenny McCarthy maintains? Are human carbon emissions at the root of global warming? Come to that, if I flick this switch, will it make the light on the porch come on? Presumably I don’t need to persuade you that these would be incredibly useful things to know.

Since anthropogenic greenhouse gas emissions do cause climate change, cutting our emissions would make a difference to future warming. By contrast, autism cannot be prevented by leaving children unvaccinated. Now, there’s a subtlety here. For our judgments to be of much use to us, we have to distinguish between causal relations and mere correlations. From 1999 to 2009, the number of people in the US who fell into a swimming pool and drowned varied with the number of films in which Nicolas Cage appeared – but it seems unlikely that we could reduce the number of pool drownings by keeping Cage off the screen, desirable as the remedy might be for other reasons.

In short, a working knowledge of the way in which causes and effects relate to one another seems indispensable to our ability to make our way in the world. Yet there is a long and venerable tradition in philosophy, dating back at least to David Hume in the 18th century, that finds the notions of causality to be dubious. And that might be putting it kindly.

Hume argued that when we seek causal relations, we can never discover the real power: the metaphysical glue, as it were, that binds events together. All we are able to see are regularities – the ‘constant conjunction’ of certain sorts of observation. He concluded from this that any talk of causal powers is illegitimate. Which is not to say that he was ignorant of the central importance of causal reasoning; indeed, he said that it was only by means of such inferences that we can ‘go beyond the evidence of our memory and senses’. Causal reasoning was somehow both indispensable and illegitimate. We appear to have a dilemma.

Hume’s remedy for such metaphysical quandaries was arguably quite sensible, as far as it went: have a good meal, play backgammon with friends, and try to put it out of your mind. But in the late 19th and 20th centuries, his causal anxieties were reinforced by another problem, arguably harder to ignore. According to this new line of thought, causal notions seemed peculiarly out of place in our most fundamental science – physics.

There were two reasons for this. First, causes seemed too vague for a mathematically precise science. If you can’t observe them, how can you measure them? If you can’t measure them, how can you put them in your equations? Second, causality has a definite direction in time: causes have to happen before their effects. Yet the basic laws of physics (as distinct from such higher-level statistical generalisations as the laws of thermodynamics) appear to be time-symmetric: if a certain process is allowed under the basic laws of physics, a video of the same process played backwards will also depict a process that is allowed by the laws.

The 20th-century English philosopher Bertrand Russell concluded from these considerations that, since cause and effect play no fundamental role in physics, they should be removed from the philosophical vocabulary altogether. ‘The law of causality,’ he said with a flourish, ‘like much that passes muster among philosophers, is a relic of a bygone age, surviving, like the monarchy, only because it is erroneously supposed not to do harm.’

Neo-Russellians in the 21st century express their rejection of causes with no less rhetorical vigour. The philosopher of science John Earman of the University of Pittsburgh maintains that the woolliness of causal notions makes them inappropriate for physics: ‘A putative fundamental law of physics must be stated as a mathematical relation without the use of escape clauses or words that require a PhD in philosophy to apply (and two other PhDs to referee the application, and a third referee to break the tie of the inevitable disagreement of the first two).’

This is all very puzzling. Is it OK to think in terms of causes or not? If so, why, given the apparent hostility to causes in the underlying laws? And if not, why does it seem to work so well?

A clearer look at the physics might help us to find our way. Even though (most of) the basic laws are symmetrical in time, there are many arguably non-thermodynamic physical phenomena that can happen only one way. Imagine a stone thrown into a still pond: after the stone breaks the surface, waves spread concentrically from the point of impact. A common enough sight.

Now, imagine a video clip of the spreading waves played backwards. What we would see are concentrically converging waves. For some reason this second process, which is the time-reverse of the first, does not seem to occur in nature. The process of waves spreading from a source looks irreversible. And yet the underlying physical law describing the behaviour of waves – the wave equation – is as time-symmetric as any law in physics. It allows for both diverging and converging waves. So, given that the physical laws equally allow phenomena of both types, why do we frequently observe organised waves diverging from a source but never coherently converging waves?
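The time-symmetry can be read directly off the equation. What follows is a minimal sketch for the simplest case, a scalar wave ψ travelling at constant speed v; the symbols ψ, v, f and g are illustrative choices, not anything fixed by the argument above.

```latex
% Scalar wave equation: time enters only through a second derivative.
\[
  \frac{\partial^2 \psi}{\partial t^2} = v^2 \nabla^2 \psi
\]
% Replacing t by -t leaves the second time derivative unchanged, so if
% \psi(\mathbf{x}, t) is a solution, then \psi(\mathbf{x}, -t) is one too.
% Spherically symmetric solutions therefore come in matched pairs:
\[
  \psi_{\text{out}}(r, t) = \frac{f(r - vt)}{r},
  \qquad
  \psi_{\text{in}}(r, t) = \frac{g(r + vt)}{r}
\]
```

The first expression describes a wave diverging from the centre, the second a wave converging on it, and the equation admits both on equal terms. So the puzzle stands: nothing in the law itself singles out the diverging wave.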

Physicists and philosophers disagree on the correct answer to this question – which might be fine if it applied only to stones in ponds. But the problem also crops up with electromagnetic waves and the emission of light or radio waves: anywhere, in fact, that we find radiating waves. What to say about it?

On the one hand, many physicists (and some philosophers) invoke a causal principle to explain the asymmetry. Consider an antenna transmitting a radio signal. Since the source causes the signal, and since causes precede their effects, the radio waves diverge from the antenna after it is switched on simply because they are the repercussions of an initial disturbance, namely the switching on of the antenna. Imagine the time-reverse process: a radio wave steadily collapses into an antenna before the latter has been turned on. On the face of it, this conflicts with the idea of causality, because the wave would be present before its cause (the antenna) had done anything. David Griffiths, Emeritus Professor of Physics at Reed College in Oregon and the author of a widely used textbook on classical electrodynamics, favours this explanation, going so far as to call a time-asymmetric principle of causality ‘the most sacred tenet in all of physics’.

On the other hand, some physicists (and many philosophers) reject appeals to causal notions and maintain that the asymmetry ought to be explained statistically. The reason why we find coherently diverging waves but never coherently converging ones, they maintain, is not that wave sources cause waves, but that a converging wave would require the co‑ordinated behaviour of ‘wavelets’ coming in from multiple different directions of space – delicately co‑ordinated behaviour so improbable that it would strike us as nearly miraculous.

It so happens that this wave controversy has quite a distinguished history. In 1909, a few years before Russell’s pointed criticism of the notion of cause, Albert Einstein took part in a published debate concerning the radiation asymmetry. His opponent was the Swiss physicist Walther Ritz, a name you might not recognise.

It is in fact rather tragic that Ritz did not make larger waves in his own career, because his early reputation surpassed Einstein’s. The physicist Hermann Minkowski, who taught both Ritz and Einstein in Zurich, called Einstein a ‘lazy dog’ but had high praise for Ritz. When the University of Zurich was looking to appoint its first professor of theoretical physics in 1909, Ritz was the top candidate for the position. According to one member of the hiring committee, he possessed ‘an exceptional talent, bordering on genius’. But he suffered from tuberculosis, and so, due to his failing health, he was passed over for the position, which went to Einstein instead. Ritz died that very year at age 31.

Months before his death, however, Ritz published a joint letter with Einstein summarising their disagreement. While Einstein thought that the irreversibility of radiation processes could be explained probabilistically, Ritz proposed what amounted to a causal explanation. He maintained that the reason for the asymmetry is that an elementary source of radiation has an influence on other sources in the future and not in the past.

This joint letter is something of a classic text, widely cited in the literature. What is less well-known is that, in the very same year, Einstein demonstrated a striking reversibility of his own. In a second published letter, he appears to take a position very close to Ritz’s – the very view he had dismissed just months earlier. According to the wave theory of light, Einstein now asserted, a wave source ‘produces a spherical wave that propagates outward. The inverse process does not exist as elementary process’. The only way in which converging waves can be produced, Einstein claimed, was by combining a very large number of coherently operating sources. He appears to have changed his mind.

Given Einstein’s titanic reputation, you might think that such a momentous shift would occasion a few ripples in the history of science. But I know of only one significant reference to his later statement: a letter from the philosopher Karl Popper to the journal Nature in 1956. In this letter, Popper describes the wave asymmetry in terms very similar to Einstein’s. And he also makes one particularly interesting remark, one that might help us to unpick the riddle. Coherently converging waves, Popper insisted, ‘would demand a vast number of distant coherent generators of waves the co‑ordination of which, to be explicable, would have to be shown as originating from the centre’ (my italics).

This is, in fact, a particular instance of a much broader phenomenon. Consider two events that are spatially distant yet correlated with one another. If they are not related as cause and effect, they tend to be joint effects of a common cause. If, for example, two lamps in a room go out suddenly, it is unlikely that both bulbs just happened to burn out simultaneously. So we look for a common cause – perhaps a circuit breaker that tripped.
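The lamp example can be put into numbers. The following is a minimal sketch in Python, not a model of any real wiring: the function names and all the probabilities (a 1 per cent chance that the breaker trips, a 5 per cent chance that a bulb burns out on its own) are assumptions chosen purely for illustration.

```python
import random


def simulate_lamps(trials=200_000, p_trip=0.01, p_burnout=0.05, seed=0):
    """Simulate two lamps on one circuit breaker (the common cause).

    Each lamp can also burn out by itself, independently of the other.
    All probabilities are illustrative assumptions.
    """
    rng = random.Random(seed)
    records = []
    for _ in range(trials):
        tripped = rng.random() < p_trip
        lamp_a_out = tripped or rng.random() < p_burnout
        lamp_b_out = tripped or rng.random() < p_burnout
        records.append((tripped, lamp_a_out, lamp_b_out))
    return records


def frequency(records, event):
    """Fraction of records for which `event` holds."""
    return sum(1 for r in records if event(r)) / len(records)


records = simulate_lamps()

# Unconditional frequencies: how often each lamp, and both lamps, go out.
p_a = frequency(records, lambda r: r[1])
p_b = frequency(records, lambda r: r[2])
p_both = frequency(records, lambda r: r[1] and r[2])

# The same frequencies, restricted to cases where the breaker held.
held = [r for r in records if not r[0]]
p_a_held = frequency(held, lambda r: r[1])
p_b_held = frequency(held, lambda r: r[2])
p_both_held = frequency(held, lambda r: r[1] and r[2])

print("both out:", p_both, "vs product if independent:", p_a * p_b)
print("breaker held, both out:", p_both_held,
      "vs product:", p_a_held * p_b_held)
```

In this sketch, the two lamps fail together noticeably more often than independent burnouts would predict. Once we look only at the cases in which the breaker held, the excess disappears, which is what we would expect if the breaker is the common cause.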

Common-cause inferences are so pervasive that it is difficult to imagine what we could know about the world beyond our immediate surroundings without them. Hume was right: judgments about causality are absolutely essential in going ‘beyond the evidence of the senses’. In his book The Direction of Time (1956), the philosopher Hans Reichenbach formulated a principle underlying such inferences: ‘If an improbable coincidence has occurred, there must exist a common cause.’ To the extent that we are bound to apply Reichenbach’s rule, we are all like the hard-boiled detective who doesn’t believe in coincidences.

This gives us a hint at the power of causal inferences: they require only very limited, local knowledge of the world as input. Nevertheless, causal skeptics have argued that such inferences are superfluous in physics, which is supposed to proceed in a very different way. In this rather majestic vision of scientific inference, we simply feed the laws a description of the complete state of a system at one time, and then they ‘spit out’ the state of the system at any other time. The laws are a kind of smoothly humming engine, generating inferences from one time to another – and given this magnificent machine, the skeptics claim, causal principles are practically irrelevant.

It’s an appealing idea. However, a moment’s reflection tells us that very few investigations could actually proceed in this manner. For one thing, we rarely (if ever) have access to the complete initial data required for the laws to deliver an unequivocal answer. Suppose we wanted to calculate the state of the world just one second from now. If the laws are relativistic – that is, if they stipulate that no influence can travel faster than light – our initial state description would need to cover a radius of 300,000 km. Only then could we account for any possible influences that might reach our location within one second. For all practical purposes this is, of course, impossible. And so we find that, even in physics, we need inferences that require much less than complete states as input.

Astronomical observations provide a particularly stark example. How do we know that the points of light in the night sky are stars? The approach using laws and initial (or, in this case, final) conditions to calculate backward in time to the existence of the star would require data on the surface of an enormous sphere of possibly many light years in diameter. Stuck here on Earth as we are, that just isn’t going to happen. So what do we do? Well, we can make use of the fact that we observe points of light at the same celestial latitude and longitude at different moments in time, or at different spatial locations, and that these light points are highly correlated with one another. (These correlations can, for example, be exploited in stellar interferometry.) From these correlations we can infer the existence of the star as common cause of our observations. Causal inference may be superfluous in some idealised, superhuman version of physics, but if you actually want to find out how the Universe works, it is vital.

It can sometimes seem as if the debate over wave asymmetry hasn’t advanced much since 1909. And yet, doesn’t the comparison with other common-cause inferences show that Ritz and then later Einstein were right and the earlier Einstein was wrong? Indeed, if we take Popper’s remark seriously, it seems as if the probabilistic explanation itself relies on implicit causal assumptions. Let’s think again about a wave coherently diverging from a source compared with a wave coherently converging into a source.

Both scenarios involve ‘delicately set up’ correlations among different parts of the wave; after all, each of the two processes is simply the other one run backwards in time. But then, contrast our familiar experience with that of the narrator in Martin Amis’ time-flipped novel Time’s Arrow (1991), who takes a boat journey across the Atlantic:

John is invariably to be found on the stern, looking at where we’re headed. The ship’s route is clearly delineated on the surface of the water and is violently consumed by our advance. Thus we leave no mark on the ocean, as if we are successfully covering our tracks.

That the ship’s wake pattern should be laid out before the ship, so that it is made to disappear as the ship advances, seems miraculous and all but impossible. And yet the correlations are the very same ones that exist between a ship and its familiar wake-pattern in the real world. Why on earth should that be? Why does a wave coherently converging into a source strike us as miraculous, while a wave coherently diverging from a source is completely ordinary?

The answer must be that, in the case of the diverging wave, there is an obvious explanation for the ‘delicate’ correlations: the source acts as common cause. This is in sharp contrast with a converging wave, for which the correlations cannot be explained by appealing to the source into which the wave converges. Since the two processes are the time-reverse of each other, the only possible difference between the two cases, it seems, concerns their different causal structures. I think this answer is essentially correct. And so, as far as it goes, perhaps we can declare victory for Ritz.

However, victory might prove rather hollow. Formal advances in causal modelling in the past two decades suggest that the difference between the two explanatory strategies – causal and probabilistic – is much smaller than it first appears. As the computer scientist Judea Pearl at the University of California, Los Angeles and others have shown, causal structures can in fact be represented with mathematical precision. This answers Earman’s vagueness worry: one PhD is more than enough to be able to apply them coherently, and it might even help if the degree is not in philosophy.

More importantly, it turns out that the causal asymmetry of common-cause structures and the assumption of probabilistic independence are really two sides of the same coin. More precisely, common-cause inferences need the initial inputs to the system to be probabilistically independent of one another. This makes intuitive sense: if the inputs to your model are correlated, downstream relationships between variables could be due to matches that were present from the beginning, rather than due to anything that happened inside the model. So common-cause inferences depend on an assumption of independence. And from this perspective it might seem that the early Einstein was correct after all: probability comes first.
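The role of the independence assumption can be illustrated with a toy model. This is a sketch under stated assumptions: the model, the function names `pearson` and `run_model`, the noise level and the input correlation of 0.9 are all hypothetical, chosen only to make the point visible. Inside the model, the two outputs never influence each other and share no cause; each is simply its own input plus fresh noise.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def run_model(input_correlation, trials=50_000, seed=1):
    """Return the correlation between the model's two outputs.

    `input_correlation` is how strongly the two inputs are already
    matched before the model does anything (an illustrative knob).
    """
    rng = random.Random(seed)
    rho = input_correlation
    outputs_u, outputs_v = [], []
    for _ in range(trials):
        # Draw the two inputs with correlation rho between them.
        x = rng.gauss(0, 1)
        y = rho * x + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        # Inside the model: no shared cause, no interaction, only noise.
        outputs_u.append(x + rng.gauss(0, 0.5))
        outputs_v.append(y + rng.gauss(0, 0.5))
    return pearson(outputs_u, outputs_v)


r_independent_inputs = run_model(input_correlation=0.0)
r_correlated_inputs = run_model(input_correlation=0.9)

print("outputs, independent inputs:", r_independent_inputs)
print("outputs, correlated inputs: ", r_correlated_inputs)
```

With independent inputs the outputs come out essentially uncorrelated. With matched inputs they are strongly correlated, even though nothing inside the model connects them. On this reading, it is the independence of the inputs that licenses the inference to a common cause.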

But not so fast! As we saw above, the explanatory direction can be reversed so that the assumption of probabilistic independence is taken to reflect a causal assumption about the system. And this, again, seems to indicate that Ritz was right. We face a chicken-and-egg dilemma. In fact there might not be a uniquely correct answer to the question of which of the two assumptions is logically prior.

This opens up a third interpretive option. Why not see both the probabilistic independence assumption and the common-cause principle as mutually dependent aspects of causal structures? We can accept that these structures have an important role to play in physics, just as they do in other sciences and in common sense, without having to commit to the metaphysical priority of either.

This third view is reminiscent of the late US physicist Richard Feynman’s view about physical laws. Feynman argued that the laws of physics do not exhibit a unique, logical structure, such that one set of statements is more fundamental than another. Instead of a hierarchical ‘Euclidean conception’ of theories, Feynman argued that physics follows what he calls the ‘Babylonian tradition’, according to which the principles of physics provide us with an interconnected structure with no unique, context-independent starting point for our derivations. Given such structures, Feynman said: ‘I am never quite sure of where I am supposed to begin or where I am supposed to end.’

I want to suggest that we should think of causal structures in physics in the very same way. Contrary to Russellian skeptics, causal structures play as indispensable a role in physics as in other sciences. And yet we do not need to take sides in the debate between Einstein and Ritz. Derivation doesn’t have to start anywhere in particular. Rather, we can understand the probabilistic independence assumption and the causal asymmetry as two interrelated aspects of causal structures.

This final view, however, presupposes that we agree with Hume that our use of causal reasoning is not underwritten by any kind of metaphysical ‘glue’. As Hume taught us, causal representations are incredibly useful – we couldn’t get very far in the world without them, in physics or elsewhere. But this doesn’t mean we must believe in a richly metaphysical idea of causal powers, ‘producing’ or ‘bringing about’ causal regularities like muscular enforcers of the laws of nature. We still see only the patterns, the constant conjunctions of different sorts of event.

It’s a vaguely unsettling thought, isn’t it? But that might be enough.

23 June 2015
