
The incompleteness of ethics

Many hope that AI will discover ethical truths. But as Gödel shows, deciding what is right will always be our burden

by Elad Uzan 

Imagine a world in which artificial intelligence is entrusted with the highest moral responsibilities: sentencing criminals, allocating medical resources, and even mediating conflicts between nations. This might seem like the pinnacle of human progress: an entity unburdened by emotion, prejudice or inconsistency, making ethical decisions with impeccable precision. Unlike human judges or policymakers, a machine would not be swayed by personal interests or lapses in reasoning. It does not lie. It does not accept bribes or pleas. It does not weep over hard decisions.

Yet beneath this vision of an idealised moral arbiter lies a fundamental question: can a machine understand morality as humans do, or is it confined to a simulacrum of ethical reasoning? AI might replicate human decisions without improving on them, carrying forward the same biases, blind spots and cultural distortions from human moral judgment. In trying to emulate us, it might only reproduce our limitations, not transcend them. But there is a deeper concern. Moral judgment draws on intuition, historical awareness and context – qualities that resist formalisation. Ethics may be so embedded in lived experience that any attempt to encode it into formal structures risks flattening its most essential features. If so, AI would not merely reflect human shortcomings; it would strip morality of the very depth that makes ethical reflection possible in the first place.

Still, many have tried to formalise ethics, by treating certain moral claims not as conclusions, but as starting points. A classic example comes from utilitarianism, which often takes as a foundational axiom the principle that one should act to maximise overall wellbeing. From this, more specific principles can be derived, for example, that it is right to benefit the greatest number, or that actions should be judged by their consequences for total happiness. As computational resources increase, AI becomes increasingly well-suited to the task of starting from fixed ethical assumptions and reasoning through their implications in complex situations.

But what, exactly, does it mean to formalise something like ethics? The question is easier to grasp by looking at fields in which formal systems have long played a central role. Physics, for instance, has relied on formalisation for centuries. There is no single physical theory that explains everything. Instead, we have many physical theories, each designed to describe specific aspects of the Universe: from the behaviour of quarks and electrons to the motion of galaxies. These theories often diverge. Aristotelian physics, for instance, explained falling objects in terms of natural motion toward Earth’s centre; Newtonian mechanics replaced this with a universal force of gravity. These explanations are not just different; they are incompatible. Yet both share a common structure: they begin with basic postulates – assumptions about motion, force or mass – and derive increasingly complex consequences. Isaac Newton’s laws of motion and James Clerk Maxwell’s equations are classic examples: compact, elegant formulations from which wide-ranging predictions about the physical world can be deduced.

Ethical theories have a similar structure. Like physical theories, they attempt to describe a domain – in this case, the moral landscape. They aim to answer questions about which actions are right or wrong, and why. These theories also diverge and, even when they recommend similar actions, such as giving to charity, they justify them in different ways. Ethical theories also often begin with a small set of foundational principles or claims, from which they reason about more complex moral problems. A consequentialist begins with the idea that actions should maximise wellbeing; a deontologist starts from the idea that actions must respect duties or rights. These basic commitments function similarly to their counterparts in physics: they define the structure of moral reasoning within each ethical theory.

Just as AI is used in physics to operate within existing theories – for example, to optimise experimental designs or predict the behaviour of complex systems – it can also be used in ethics to extend moral reasoning within a given framework. In physics, AI typically operates within established models rather than proposing new physical laws or conceptual frameworks. It may calculate how multiple forces interact and predict their combined effect on a physical system. Similarly, in ethics, AI does not generate new moral principles but applies existing ones to novel and often intricate situations. It may weigh competing values – fairness, harm minimisation, justice – and assess their combined implications for what action is morally best. The result is not a new moral system, but a deepened application of an existing one, shaped by the same kind of formal reasoning that underlies scientific modelling. But is there an inherent limit to what AI can know about morality? Could there be true ethical propositions that no machine, no matter how advanced, can ever prove?
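Before turning to those questions, it may help to make the within-framework picture concrete. The following is a deliberately minimal sketch in Python; the value names, weights, candidate actions and scores are all invented for illustration, standing in for the kind of formal aggregation an ethical AI might perform rather than for any actual system.

```python
# Toy sketch: applying a fixed ethical framework to candidate actions.
# The framework's commitments are encoded as weighted criteria; the 'reasoning'
# consists of deriving consequences of those fixed commitments, not inventing new ones.

WEIGHTS = {"fairness": 0.4, "harm_minimisation": 0.4, "justice": 0.2}  # assumed, axiom-like weights

# Hypothetical scores in [0, 1] for each candidate action under each criterion.
ACTIONS = {
    "allocate_equally":    {"fairness": 0.9, "harm_minimisation": 0.5, "justice": 0.7},
    "allocate_by_need":    {"fairness": 0.6, "harm_minimisation": 0.9, "justice": 0.8},
    "allocate_by_lottery": {"fairness": 0.8, "harm_minimisation": 0.4, "justice": 0.5},
}

def evaluate(scores):
    """Combine criterion scores according to the framework's fixed weights."""
    return sum(WEIGHTS[criterion] * value for criterion, value in scores.items())

best = max(ACTIONS, key=lambda action: evaluate(ACTIONS[action]))
for action, scores in ACTIONS.items():
    print(f"{action}: {evaluate(scores):.2f}")
print("Recommended within this framework:", best)
```

The point of the toy is only that the system’s conclusions are entirely downstream of the commitments it was given: change the weights or the scores, and the ‘right’ answer changes with them.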

These questions echo a fundamental discovery in mathematical logic, arguably among the most profound results ever proved: Kurt Gödel’s incompleteness theorems. They show that any logical system powerful enough to describe arithmetic is either inconsistent or incomplete. In this essay, I argue that this limitation, though mathematical in origin, has deep consequences for ethics, and for how we design AI systems to reason morally.

Suppose we design an AI system to model moral decision-making. Like other AI systems – whether predicting stock prices, navigating roads or curating content – it would be programmed to maximise certain predefined objectives. To do so, it must rely on formal, computational logic: either deductive reasoning, which derives conclusions from fixed rules and axioms, or probabilistic reasoning, which estimates likelihoods based on patterns in data. In either case, the AI must adopt a mathematical structure for moral evaluation. But Gödel’s incompleteness theorems reveal a fundamental limitation. Gödel showed that any formal system powerful enough to express arithmetic, such as the natural numbers and their operations, cannot be both complete and consistent. If such a system is consistent, there will always be true statements it cannot prove. Here, ‘true’ refers to truth in the standard interpretation of arithmetic, such as the claim that ‘2 + 2 = 4’, which is true under ordinary mathematical rules. If the system is inconsistent, then it can prove anything at all, including contradictions, rendering it useless as a guide for ethical decisions. Applied to AI, this suggests that any system capable of rich moral reasoning will inevitably have moral blind spots: ethical truths that it cannot derive.
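The two styles of reasoning described above can also be sketched side by side. The rules, facts and weights below are invented purely to illustrate the contrast between deriving a verdict from fixed axioms and estimating one from learned statistical weights; neither stands for a real system.

```python
# Toy contrast between the two styles of formal moral evaluation described above.
# All rules, facts and weights are invented for illustration.

FACTS = {"causes_serious_harm", "no_consent"}

# Deductive style: forward-chain from fixed rules of the form (premises -> conclusion).
RULES = [
    ({"causes_serious_harm", "no_consent"}, "violates_duty"),
    ({"violates_duty"}, "impermissible"),
]

def deduce(facts):
    """Apply the rules repeatedly until no new conclusions follow."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in RULES:
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived

print("impermissible" in deduce(FACTS))   # True: the verdict follows from the axioms

# Probabilistic style: combine hypothetical learned risk weights (here via noisy-OR).
weights = {"causes_serious_harm": 0.7, "no_consent": 0.5}
p_not_wrong = 1.0
for fact in FACTS:
    p_not_wrong *= 1.0 - weights[fact]
print(f"estimated probability the action is wrong: {1.0 - p_not_wrong:.2f}")   # 0.85
```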

Gödel’s incompleteness theorems apply not only to AI, but to any ethical reasoning framed within a formal system. The key difference is that human reasoners can, at least in principle, revise their assumptions, adopt new principles, and rethink the framework itself. AI, by contrast, remains bound by the formal structures it is given, or operates within those it can modify only under predefined constraints. In this way, Gödel’s theorems place a logical boundary on what AI, if built on formal systems, can ever fully prove or validate about morality from within those systems.

Most of us first met axioms in school, usually through geometry. One famous example is the parallel postulate, which says that if you pick a point not on a line, you can draw exactly one line through that point that is parallel to the original line. For more than 2,000 years, this seemed self-evident. Yet in the 19th century, mathematicians such as Carl Friedrich Gauss, Nikolai Lobachevsky and János Bolyai showed that it is possible to construct internally consistent geometries in which the parallel postulate does not hold. In some such geometries, no parallel lines exist; in others, infinitely many do. These non-Euclidean geometries shattered the belief that Euclid’s axioms uniquely described space.

There will always be true but unprovable statements, most notably, the system’s own claim of consistency

This discovery raised a deeper worry. If the parallel postulate, long considered self-evident, could be discarded, what about the axioms of arithmetic, which define the natural numbers and the operations of addition and multiplication? On what grounds can we trust that they are free from hidden inconsistencies? Yet with this challenge came a promise. If we could prove that the axioms of arithmetic are consistent, then it would be possible to expand them to develop a consistent set of richer axioms that define the integers, the rational numbers, the real numbers, the complex numbers, and beyond. As the 19th-century mathematician Leopold Kronecker put it: ‘God created the natural numbers; all else is the work of man.’ Proving the consistency of arithmetic would prove the consistency of many important fields of mathematics.

The method for proving the consistency of arithmetic was proposed by the mathematician David Hilbert. His approach involved two steps. First, Hilbert argued that, to prove the consistency of a formal system, it must be possible to formulate, within the system’s own symbolic language, a claim equivalent to ‘This system is consistent,’ and then prove that claim using only the system’s own rules of inference. The proof should rely on nothing outside the system, not even the presumed ‘self-evidence’ of its axioms. Second, Hilbert advocated grounding arithmetic in something even more fundamental. This task was undertaken by Bertrand Russell and Alfred North Whitehead in their monumental Principia Mathematica (1910-13). Working in the domain of symbolic logic, a field concerned not with numbers, but with abstract propositions like ‘if x, then y’, they showed that the axioms of arithmetic could be derived as theorems from a smaller set of logical axioms. This left one final challenge: could this set of axioms of symbolic logic, on which arithmetic can be built, prove its own consistency? If it could, Hilbert’s dream would be fulfilled. That hope became the guiding ambition of early 20th-century mathematics.

It was within this climate of optimism that Kurt Gödel, a young Austrian logician, introduced a result that would dismantle Hilbert’s vision. In 1931, Gödel published his incompleteness theorems, showing that the very idea of such a fully self-sufficient mathematical system is impossible. Specifically, Gödel showed that if a formal system meets several conditions, it will contain true claims that it cannot prove. It must be complex enough to express arithmetic, include the principle of induction (which allows it to prove general statements by showing they hold for a base case and each successive step), be consistent, and have a decidable set of axioms (meaning it is possible to determine, for any given statement, whether it qualifies as an axiom). Any system that satisfies these conditions, such as the set of logical axioms developed by Russell and Whitehead in Principia Mathematica, will necessarily be incomplete: there will always be statements that are expressible within the system but unprovable from its axioms. Even more strikingly, Gödel showed that such a system can express, but not prove, the claim that it itself is consistent.
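Stated compactly, in a standard modern formulation rather than Gödel’s original wording, the two results are:

```latex
\textbf{First incompleteness theorem.}\quad \text{If } F \text{ is a consistent, effectively axiomatised}
\text{ formal system that can express elementary arithmetic, then there is a sentence } G_F
\text{ of } F \text{ such that } F \nvdash G_F \text{ and } F \nvdash \neg G_F.

\textbf{Second incompleteness theorem.}\quad \text{If such an } F \text{ is consistent, then }
F \nvdash \mathrm{Con}(F), \text{ where } \mathrm{Con}(F) \text{ is the sentence of } F
\text{ that expresses ``}F\text{ is consistent''.}
```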

Gödel’s proof, which I simplify here, relies on four key steps that follow from his arithmetisation of syntax, the powerful idea of associating any sentence of a formal system with a particular natural number, known as its Gödel number. First, any system complex enough to express arithmetic and induction must allow for formulas with free variables, formulas like S(x): ‘x = 10’, whose truth value depends on the value of x. S(x) is true when x is, in fact, 10, and false otherwise. Since every formula in the system has a unique Gödel number, G(S), a formula can refer to its own Gödel number. Specifically, the system can form statements such as S(G(S)): ‘G(S) = 10’, whose truth depends on whether S(x)’s own Gödel number equals 10. Second, in any logical system, a proof of a formula S has a certain structure: it starts with axioms, applies inference rules to produce new formulas from those axioms, and ultimately derives S itself. Just as every formula S has a Gödel number G(S), so every proof of S can be assigned a Gödel number, by treating the entire sequence of formulas in the proof as one long formula. So we can define a proof relation Proof(x, y), which holds if and only if x is the Gödel number of a proof of the formula whose Gödel number is y. The claim that a given number x encodes a proof of S thereby becomes a statement within the system, namely Proof(x, G(S)).
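The details of the coding scheme do not matter for the argument, but a toy version helps to show that Gödel numbering is ordinary arithmetic. The symbol table and encoding below are invented for illustration; Gödel’s own scheme differs in its details, though not in spirit.

```python
# Toy Gödel numbering: give each symbol a numeric code, then encode a formula
# (a sequence of symbols) as a product of prime powers. The resulting natural
# number uniquely determines the formula, so statements about formulas
# become statements about numbers.

SYMBOLS = {"x": 1, "=": 2, "1": 3, "0": 4, "+": 5}   # illustrative symbol table

def nth_prime(k):
    """Return the k-th prime (2, 3, 5, ...) by simple trial division."""
    count, n = 0, 1
    while count < k:
        n += 1
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return n

def godel_number(formula):
    """Encode a formula as 2^c1 * 3^c2 * 5^c3 * ..., one prime per symbol position."""
    number = 1
    for position, symbol in enumerate(formula, start=1):
        number *= nth_prime(position) ** SYMBOLS[symbol]
    return number

# The formula S(x): 'x = 10' from the text, written in the toy alphabet.
print(godel_number("x=10"))   # a single natural number encoding the whole formula
```

Because prime factorisation is unique, the original formula can always be recovered from its Gödel number, which is what lets a system talk about its own formulas and proofs using nothing but arithmetic.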

Third, building on these ideas, Gödel showed that any formal system capable of expressing arithmetic and the principle of induction can also formulate statements about its own proofs. For example, the system can express statements like: ‘n is not the Gödel number of a proof of formula S’. From this, it can go a step further and express the claim: ‘There is no number n such that n is the Gödel number of a proof of formula S.’ In other words, the system can say that a certain formula S is unprovable within the system. Fourth, Gödel ingeniously constructed a self-referential formula, P, that asserts: ‘There is no number n such that n is the Gödel number of a proof of formula P.’ That is, P says of itself, ‘P is not provable.’ In this way, P is a formal statement that expresses its own unprovability from within the system.
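Using the notation above (G(P) for the Gödel number of a formula P, and Proof(x, y) for the proof relation), the self-referential formula has the following shape; the equivalence is exactly what Gödel’s diagonal construction delivers:

```latex
P \;\;\leftrightarrow\;\; \neg\, \exists n \; \mathrm{Proof}\big(n,\; G(P)\big)
```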

It immediately follows that if the formula P were provable within the system, then it would be false, because it asserts that it has no proof. This would mean the system proves a falsehood, and therefore is inconsistent. So if the system is consistent, then P cannot be proved; and since P asserts precisely that it cannot be proved, P is true yet unprovable. This leads to the conclusion that, in any consistent formal system rich enough to express arithmetic and induction, there will always be true but unprovable statements, most notably, the system’s own claim of consistency.

The implications of Gödel’s theorems were both profound and unsettling. They shattered Hilbert’s hope that mathematics could be reduced to a complete, mechanical system of derivation and exposed the inherent limits of formal reasoning. Initially, Gödel’s findings faced resistance, with some mathematicians arguing that his results were less general than they appeared. Yet, as subsequent mathematicians and logicians, most notably John von Neumann, confirmed both their correctness and broad applicability, Gödel’s theorems came to be widely recognised as one of the most significant discoveries in the foundations of mathematics.

Gödel’s results have also sparked philosophical debates. The mathematician and physicist Roger Penrose, for example, has argued that they point to a fundamental difference between human cognition and formal algorithmic reasoning. He claims that human consciousness enables us to perceive certain truths – such as those Gödel showed to be unprovable within formal systems – in ways that no algorithmic process can replicate. This suggests, for Penrose, that certain aspects of consciousness may lie beyond the reach of computation. His conclusion parallels John Searle’s ‘Chinese Room’ argument, which holds that algorithms can never genuinely understand, because they manipulate symbols purely syntactically, without any grasp of their semantic content. Still, the conclusions drawn by Penrose and Searle do not directly follow from Gödel’s theorems. Gödel’s results apply strictly to formal mathematical systems and do not make claims about consciousness or cognition. Whether human minds can recognise unprovable truths as true, or whether machines could ever possess minds capable of such recognition, remains an open philosophical question.

Morality is not just about doing what is right, but understanding why it is right

However, Gödel’s incompleteness theorems do reveal a deep limitation of algorithmic reasoning, and of AI in particular, one that concerns not just computation but moral reasoning itself. Before Gödel, it was at least conceivable that an AI could formalise all moral truths and, in addition, prove them from a consistent set of axioms. His work shows that this is impossible. No AI, no matter how sophisticated, could prove all the moral truths it can express. The gap between truth and provability sets a fundamental boundary on how far formal moral reasoning can go, even for the most powerful machines.

This raises two distinct problems for ethics. The first is an ancient one. As Plato suggests in the Euthyphro, morality is not just about doing what is right, but understanding why it is right. Ethical action requires justification, an account grounded in reason. This ideal of rational moral justification has animated much of our ethical thought, but Gödel’s theorems suggest that, if moral reasoning is formalised, then there will be moral truths that cannot be proven within those systems. In this way, Gödel did not only undermine Hilbert’s vision of proving mathematics consistent; he may also have shaken Plato’s hope of fully grounding ethics in reason.

The second problem is more practical. Even a high-performing AI may encounter situations in which it cannot justify or explain its recommendations using only the ethical framework it has been given. The concern is not just that AI might act unethically, but that it could not demonstrate that its actions are ethical. This becomes especially urgent when AI is used to guide or justify decisions made by humans. No matter how advanced the system becomes, there will be ethical truths it can express but never prove.

The development of modern AI has generally split into two approaches: logic-based AI, which derives knowledge through strict deduction, and large language models (LLMs), which predict meaning from statistical patterns. Both approaches rely on mathematical structures. Formal logic is based on symbolic manipulation and set theory. LLMs are not strictly deductive-logic-based but rather use a combination of statistical inference, pattern recognition, and computational techniques to generate responses.

Just as axioms provide a foundation for mathematical reasoning, LLMs rely on statistical relationships in data to approximate logical reasoning. They engage with ethics not by deducing moral truths but by replicating how such debates unfold in language. This is achieved through gradient descent, an algorithm that minimises a loss function by updating the model’s weights in the direction that reduces error. Trained this way, LLMs approximate complex functions that map inputs to outputs, allowing them to generalise patterns from vast amounts of data. They do not deduce answers but generate plausible ones, with ‘reasoning’ emerging from billions of neural network parameters rather than explicit rules. While they primarily function as probabilistic models, predicting text based on statistical patterns, computational logic plays a role in optimisation, rule-based reasoning and certain decision-making processes within neural networks.
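A minimal sketch of that training idea: fitting a single-parameter model to a toy dataset by gradient descent. The data and learning rate are invented, and real language-model training is vastly larger and more elaborate, but the update rule has exactly this shape: compute the gradient of the loss, then nudge the weights against it.

```python
# Toy gradient descent: fit y ~ w * x by minimising mean squared error.
# LLM training adjusts billions of parameters with the same basic step.

data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]   # invented (x, y) pairs, roughly y = 2x
w = 0.0                                        # a single weight, initialised at zero
learning_rate = 0.05

for step in range(200):
    # Gradient of the mean squared error with respect to w.
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= learning_rate * grad                  # move in the direction that reduces the error

print(f"learned weight: {w:.3f}")              # close to 2, the pattern in the data
```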

But probability and statistics are themselves formal systems, grounded not only in arithmetic but also in probabilistic axioms, such as those introduced by the Soviet mathematician Andrey Kolmogorov, which govern how the likelihood of complex events is derived, updated with new data, and aggregated across scenarios. Any formal language complex enough to express probabilistic or statistical claims can also express arithmetic and is therefore subject to Gödel’s incompleteness theorems. This means that LLMs inherit Gödelian limitations. Even hybrid systems, such as IBM Watson, OpenAI Codex or DeepMind’s AlphaGo, which combine logical reasoning with probabilistic modelling, remain bound by Gödelian limitations. All rule-based components are constrained by Gödel’s theorems, which show that some true propositions expressible in a system cannot be proven within it. Probabilistic components, for their part, are governed by formal axioms that define how probability distributions are updated, how uncertainties are aggregated, and how conclusions are drawn. They can yield plausible answers, but they cannot justify them beyond the statistical patterns they were trained on.
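The axioms in question are short enough to state. In the standard Kolmogorov formulation, a probability measure P over a space of outcomes Ω must satisfy non-negativity, normalisation and additivity over disjoint events, and Bayes’ rule, the usual principle for updating on new evidence, is built on top of them:

```latex
P(A) \ge 0, \qquad P(\Omega) = 1, \qquad
P\Big(\bigcup_{i} A_i\Big) = \sum_{i} P(A_i) \ \ \text{for pairwise disjoint } A_i,
\qquad
P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)}.
```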

Some fundamental mathematical questions lie beyond formal resolution

At first glance, the Gödelian limitations on AIs in general and LLMs in particular may seem inconsequential. After all, most ethical systems were never meant to resolve every conceivable moral problem. They were designed to guide specific domains, such as war, law or business, and often rely on principles that are only loosely formalised. If formal models can be developed for specific cases, one might argue that the inability to fully formalise ethics is not especially troubling. Furthermore, Gödel’s incompleteness theorems did not halt the everyday work of mathematicians. Mathematicians continue to search for proofs, even knowing that some true statements may be unprovable. In the same spirit, the fact that some ethical truths may be beyond formal proof should not discourage humans, or AIs, from seeking them, articulating them, and attempting to justify or prove them.

But Gödel’s findings were not merely theoretical. They have had practical consequences in mathematics itself. A striking case is the continuum hypothesis, which asks whether there exists a set whose cardinality lies strictly between that of the natural numbers and the real numbers. This question emerged from set theory, the mathematical field dealing with collections of mathematical entities, such as numbers, functions or even other sets. Its most widely accepted axiomatisation, the Zermelo-Fraenkel axioms of set theory with the Axiom of Choice (ZFC), underlies nearly all modern mathematics. In 1938, Gödel himself showed that the continuum hypothesis cannot be disproven from these axioms, assuming they are consistent. In 1963, Paul Cohen proved the complementary result: the continuum hypothesis also cannot be proven from the same axioms. Together, these landmark results confirmed that some fundamental mathematical questions lie beyond formal resolution.
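In the usual notation of set theory, and assuming ZFC is consistent, the hypothesis and the two halves of its independence can be written as:

```latex
\textbf{CH:}\ \ 2^{\aleph_0} = \aleph_1
\qquad\qquad
\text{G\"odel (1938):}\ \ \mathrm{ZFC} \nvdash \neg\,\mathrm{CH}
\qquad\qquad
\text{Cohen (1963):}\ \ \mathrm{ZFC} \nvdash \mathrm{CH}
```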

The same, I argue, applies to ethics. The limits that Gödel revealed in mathematics are not only theoretically relevant to AI ethics; they carry practical importance. First, just as mathematics contains true statements that cannot be proven within its own axioms, there may well be ethical truths that are formally unprovable yet ethically important – the moral equivalents of the continuum hypothesis. These might arise in systems designed to handle difficult trade-offs, like weighing fairness against harm. We cannot foresee when, or even whether, an AI operating within a formal ethical framework will encounter such limits. Just as it took more than 30 years after Gödel’s incompleteness theorems for Cohen to prove the independence of the continuum hypothesis, we cannot predict when, if ever, we will encounter ethical principles that are expressible within an AI’s ethical system yet remain unprovable.

Second, Gödel also showed that no sufficiently complex formal system can prove its own consistency. This is especially troubling in ethics, where it is far from clear that our frameworks are consistent in the first place. This limitation is not unique to AI; humans, too, cannot prove the consistency of the formal systems they construct. But it matters especially for AI, because one of AI’s most ambitious promises has been to go beyond human judgment: to reason more clearly, more impartially, and on a greater scale.

Gödel’s results set a hard limit on that aspiration. The limitation is structural, not merely technical. Just as Albert Einstein’s theory of relativity places an upper speed limit on the Universe – no matter how advanced our spacecraft, we cannot exceed the speed of light – Gödel’s theorems impose a boundary on formal reasoning: no matter how advanced AI becomes, it cannot escape the incompleteness of the formal system it operates within. Moreover, Gödel’s theorems may constrain practical ethical reasoning in unforeseen ways, much as some important mathematical conjectures have been shown to be unprovable from standard axioms of set theory, or as the speed of light, though unreachable, still imposes real constraints on engineering and astrophysics. For example, as I write this, NASA’s Parker Solar Probe is the fastest human-made object in history, travelling at roughly 430,000 miles (c700,000 km) per hour, just 0.064 per cent of the speed of light. Yet that upper limit remains crucial: the finite speed of light has, for example, shaped the design of space probes, landers and rovers, all of which require at least semi-autonomous operation, since radio signals from Earth take minutes or even hours to arrive. Gödel’s theorems may curtail ethical computation in similarly surprising ways.
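For readers who want to check the figure, the arithmetic is straightforward: taking the speed of light as roughly 670.6 million miles per hour,

```latex
\frac{4.3 \times 10^{5}\ \text{mph}}{6.706 \times 10^{8}\ \text{mph}} \;\approx\; 6.4 \times 10^{-4} \;=\; 0.064\ \text{per cent}.
```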

No matter how much an AI learns, there will be claims about justice it can’t ever prove within its own system

There is yet another reason why Gödel’s results are especially relevant to AI ethics. Unlike static rule-based systems, advanced AI, particularly large language models and adaptive learning systems, may not only apply a predefined ethical framework, but also revise elements of it over time. A central promise of AI-driven moral reasoning is its ability to refine ethical models through learning, addressing ambiguities and blind spots in human moral judgment. As AI systems evolve, they may attempt to modify their own axioms or parameters in response to new data or feedback. This is especially true of machine learning systems trained on vast and changing datasets, as well as hybrid models that integrate logical reasoning with statistical inference. Yet Gödel’s results reveal a structural limit: if an ethical framework is formalised within a sufficiently expressive formal system, then no consistent set of axioms can prove all true statements expressible within it.

To illustrate, consider an AI tasked with upholding justice. It may be programmed with widely accepted ethical principles, for example fairness and harm minimisation. While human-made models of justice based on these principles are inevitably overly simplistic, limited by computational constraints and cognitive biases, an AI, in theory, has no such limitations. It can continuously learn from actual human behaviour, refining its understanding and constructing an increasingly nuanced conception of justice, one that weaves together more and more dimensions of human experience. It can even, as noted, change its own axioms. But no matter how much an AI learns, or how it modifies itself, there will always be claims about justice that it may be able to model but never prove within its own system. More troubling still, the AI would be unable to prove that the ethical system it constructs is internally consistent – that it does not, somewhere in its vast web of ethical reasoning, contradict itself – unless that system is inconsistent, in which case it can prove anything, including falsehoods such as its own consistency.

Ultimately, Gödel’s incompleteness theorems serve as a warning against the notion that AI can achieve perfect ethical reasoning. Just as mathematics will always contain truths that lie beyond formal proof, morality will always contain complexities that defy algorithmic resolution. The question is not simply whether AI can make moral decisions, but whether it can overcome the limitations of any system grounded in predefined logic – limitations that, as Gödel showed, may prevent certain truths from ever being provable within the system, even if they are recognisable as true. While AI ethics has grappled with issues of bias, fairness and interpretability, the deeper challenge remains: can AI recognise the limits of its own ethical reasoning? This challenge may place an insurmountable boundary between artificial and human ethics.

The relationship between Gödel’s incompleteness theorems and machine ethics highlights a structural parallel: just as no formal system can be both complete and self-contained, no AI can achieve moral reasoning that is both exhaustive and entirely provable. In a sense, Gödel’s findings extend and complicate the Kantian tradition. Kant argued that knowledge depends on a priori truths, fundamental assumptions that structure our experience of reality. Gödel’s theorems suggest that, even within formal systems built on well-defined axioms, there remain truths that exceed the system’s ability to establish them. If Kant sought to define the limits of reason through necessary preconditions for knowledge, Gödel revealed an intrinsic incompleteness in formal reasoning itself, one that no set of axioms can resolve from within. For AI, this means there will always be moral truths beyond its computational grasp, ethical problems that resist algorithmic resolution.

So the deeper problem lies in AI’s inability to recognise the boundaries of its own reasoning framework – its incapacity to know when its moral conclusions rest on incomplete premises, or when a problem lies beyond what its ethical system can formally resolve. While humans also face cognitive and epistemic constraints, we are not bound by a given formal structure. We can invent new axioms, question old ones, or revise our entire framework in light of philosophical insight or ethical deliberation. AI systems, by contrast, can generate or adopt new axioms only if their architecture permits it and, even then, such modifications occur within predefined meta-rules or optimisation goals. They lack the capacity for conceptual reflection that guides human shifts in foundational assumptions. Even if a richer formal language, or a richer set of axioms, could prove some previously unprovable truths, no set of axioms that satisfies Gödel’s requirements of decidability and consistency can prove all truths expressible in a sufficiently powerful formal system. In that sense, Gödel sets a boundary – not just on what machines can prove, but on what they can ever justify from within a given ethical or logical architecture.

When an AI delivers a decision that appears morally flawed, it may prompt us to re-examine our own judgments

One of the great hopes, or fears, of AI is that it may one day evolve beyond the ethical principles initially programmed into it and simulate just such self-questioning. Through machine learning, AI could modify its own ethical framework, generating novel moral insights and uncovering patterns and solutions that human thinkers, constrained by cognitive biases and computational limitations, might overlook. However, this very adaptability introduces a profound risk: an AI’s evolving morality could diverge so radically from human ethics that its decisions become incomprehensible or even morally abhorrent to us. This mirrors certain religious conceptions of ethics. In some theological traditions, divine morality is considered so far beyond human comprehension that it can appear arbitrary or even cruel, a theme central to debates over the problem of evil and divine command theory. A similar challenge arises with AI ethics: as AI systems become increasingly autonomous and self-modifying, their moral decisions may become so opaque and detached from human reasoning that they risk being perceived as unpredictable, inscrutable or even unjust.

Yet, while AI may never fully master moral reasoning, it could become a powerful tool for refining human ethical thought. Unlike human decision-making, which is often shaped by bias, intuition or unexamined assumptions, AI has the potential to expose inconsistencies in our ethical reasoning by treating similar cases with formal impartiality. This potential, however, depends on AI’s ability to recognise when cases are morally alike, a task complicated by the fact that AI systems, especially LLMs, may internalise and reproduce the very human biases they are intended to mitigate. When AI delivers a decision that appears morally flawed, it may prompt us to re-examine the principles behind our own judgments. Are we distinguishing between cases for good moral reasons, or are we applying double standards without realising it? AI could help challenge and refine our ethical reasoning, not by offering final answers, but by revealing gaps, contradictions and overlooked assumptions in our moral framework.

AI may depart from human moral intuitions in at least two ways: by treating cases we see as similar in divergent ways, or by treating cases we see as different in the same way. In both instances, the underlying question is whether the AI is correctly identifying a morally relevant distinction or similarity, or whether it is merely reflecting irrelevant patterns in its training data. In some cases, the divergence may stem from embedded human biases, such as discriminatory patterns based on race, gender or socioeconomic status. But in others, the AI might uncover ethically significant features that human judgment has historically missed. It could, for instance, discover novel variants of the trolley problem, suggesting that two seemingly equivalent harms differ in morally important ways. In such cases, AI may detect new ethical patterns before human philosophers do. The challenge is that we cannot know in advance which kind of departure we are facing. Each surprising moral judgment from AI must be evaluated on its own terms – neither accepted uncritically nor dismissed out of hand. Yet even this openness to novel insights does not free AI from the structural boundaries of formal reasoning.

That is the deeper lesson. Gödel’s theorems do not simply show that there are truths machines cannot prove. They show that moral reasoning, like mathematics, is always open-ended, always reaching beyond what can be formally derived. The challenge, then, is not only how to encode ethical reasoning into AI but also how to ensure that its evolving moral framework remains aligned with human values and societal norms. For all its speed, precision and computational power, AI remains incapable of the one thing that makes moral reasoning truly possible: the ability to question not only what is right, but why. Ethics, therefore, must remain a human endeavour, an ongoing and imperfect struggle that no machine will ever fully master.