In the movie Arrival (2016), a seven-limbed alien species lands on Earth with a language that no human can understand. The aliens – dubbed Heptapods – are obliging enough to provide room in their spaceship for linguistic exchanges, but the team charged with translation is baffled. The creatures write in sentences that look like circular smoky inkblots, unlike anything on our planet.
The movie’s drama – based on a story by Ted Chiang – rests on the sheer strangeness of the Heptapod language, but it’s actually not as alien as it could be. Apart from the sci-fi twist that learning it imparts special abilities, the Heptapod language is not very different from ordinary human languages. The symbols are strange and circular, sure, but they still stand for words belonging to familiar grammatical categories like nouns and verbs, and can be translated into English. In fact, a major plot element in the movie is the mistranslation of a Heptapod noun meaning ‘tool’ as ‘weapon’.
The situation is similar with several other nonhuman languages in fiction. Consider Klingon from Star Trek, now spoken by several Earthlings. Klingon’s claim to alienness is that it contains a peculiar set of sounds and an unusual sentence structure. But, like human languages, it still contains nouns and verbs, and the same structural elements, like subject and object. The same is true of other fictional languages like Dothraki (Game of Thrones), Na’vi (Avatar) and Quenya (The Lord of the Rings).
Even outside fiction, imaginations are rather impoverished. The development of constructed languages (referred to as ‘conlangs’) for fictional and other purposes draws primarily from linguistics. But, as a science, linguistics generally focuses on discovering the general rules governing actual, observable human languages – their sounds, symbols or gestures, their grammar, the elements and structure of their sentences, the meanings of their expressions, etc. And while conlangs may have unique vocabularies or flout one or more rules of human languages, the formula for creating one essentially involves adapting familiar elements from how Earthlings communicate.
As a philosopher of language, I find this unsatisfying. The space of possible languages is vast, and full of exotic languages that are much weirder and stranger than any we have yet imagined. We should explore what those might be – and for more than intellectual curiosity alone. If we one day encounter aliens through first contact or a signal sent across the galaxy, their language might be nothing like ours. After all, humans have evolved with certain cognitive abilities and limitations. Expecting intelligent beings with alternative origins to use languages like ours betrays an anthropocentric view of the cosmos. If we want to move beyond exchanging prime number sequences to figuring out what the extraterrestrials are actually saying, we need to be prepared.
To approach the space of possible alien languages, we must first consider the building blocks from which a language is constructed – and how these can differ. A language can be thought to contain four levels, corresponding to four features of human languages: sign, structure, semantics and pragmatics. While an alien language might not share all these features, it’s helpful to know what they are before we venture into extraterrestrial strangeness.
At the first level are the signs: things that we produce, observe or exchange. The sounds we make using our vocal apparatus are signs; so is the alphabet you are looking at right now. Pictograms like an emoji or a ‘No smoking’ symbol also belong to this level, as do logograms like some Chinese characters or Egyptian hieroglyphs, as well as the gestures of a sign language. Depending on the physical constitution of a creature, its language may employ a wider variety of signs than humans. While the language of nonhuman animals isn’t complex, its signs may include smells or body movements. Even machines can employ signs, such as the high-pitched sounds of Gibberlink, a language that some artificial intelligences use to interact.
The second level is structure, and concerns word structure, grammar and syntax. A word is one structural element of a language, and so is a sentence. Further, words themselves can belong to grammatical types (eg, nouns, verbs, pronouns) and so can sentences (eg, questions, commands, declarations). So, for example, if you were analysing the words in the sentence Peacocks eat insects, you could identify the nouns and the verb, and see that it’s a declaration, not a question. Individual words also have an internal structure (eg, they can have suffixes, or other ways of marking case, tense, number, gender, etc) and so do sentences (eg, English sentences typically have a subject-verb-object structure, while those of Sanskrit have subject-object-verb). If you pluralised insect using ‘-s’ as a prefix not a suffix, it would violate a rule of English word structure, and if you said Eat insects peacocks, it would violate a rule of English sentence structure.
The third level is semantics, which concerns meaning. Among the most enigmatic things about language is that its elements can reach out and be ‘about’ things other than themselves, such as objects of the world or abstract concepts. The word mammoth is more than a collection of its letters: it refers to an elephant-like creature with tusks, which once roamed Earth. And the sentence Mammoths are extinct is more than a collection of words: it manages to say something true about the world.
The fourth level is pragmatics, which concerns how language users can say things that go beyond the literal meanings of the words they produce. When people use the idiom I could eat a horse, they are expressing hunger, not their attitude towards horses. And if someone in an action movie says We need to call Washington, the use of ‘Washington’ is metonymic: it refers to a government, not the city. Sometimes, pragmatic phenomena allow us to convey meaning without breaking social norms. For example, if someone asks you for a dance, it can be more polite to say I am here with my partner than simply saying no. Such ‘implicatures’ – along with metaphors and metonyms – belong to this level.
One recipe for creating an alien language could be to pick an arbitrary human language and make modifications to its rules concerning signs, structure, semantics or pragmatics. Perhaps the easiest is to make changes at the first level: signs. One could cook up an alien language by adopting an arbitrary set of phonemes that no spoken human language has. For a written language, one could choose an entirely new set of symbols.
If you were feeling particularly imaginative, this language could include signs from a completely new modality of articulation, like movements of the body (think bee dances) or electrical impulses, which were presumably used by the robots in the Hollywood movie AI Artificial Intelligence (2001) who communicated by touching each other. In principle, any familiar Earth language can be used as a ‘base’ to construct such a language. The resulting alien language will seem very different from the base language, even when the alien language is identical to the familiar language at the level of structure, semantics and pragmatics.
A more elaborate recipe would be to construct an alien language with a structure that is different from that of familiar human languages. Word and sentence structure differs across human languages: some use prepositions while others use postpositions for the same purpose; some languages have a dedicated word to indicate definiteness (eg, the English definite article ‘the’) while others like Swedish use suffixes to achieve the same effect (eg, Vetenskapsrådet, the word for the Swedish Research Council, indicates definiteness using the suffix ‘-et’). The word order in a sentence can vary across languages too. A majority of languages have sentences beginning with a subject – a referring expression like a name or a noun phrase like ‘the tiger’ – but a tiny minority have them beginning with a verb.
One could cook up an alien language by picking and choosing grammatical rules from different languages in a systematic manner. In fact, the typical formula for constructing a fictional conlang involves putting together a potpourri of grammar rules, along with an exotic system of signs and a lexicon.
But there is scope to stretch our imagination further. Why must the structural elements of an alien language be the same as human languages? An alien language may lack words of some grammatical type: it may, for instance, lack nouns. Such a language may ‘nominalise’ verbs, adjectives or other parts of speech in a way analogous to the gerunds of English. For example, the subject noun in Misgendering is a sensitive topic is a nominalisation of the verb ‘to misgender’.
Or it may have a single category corresponding to two or more grammatical types typically found in human languages. It has been claimed that even some human languages are ‘lumpers’, meaning they lump grammatical categories together. For instance, Salishan languages of Northwestern North America are claimed to lack a noun/verb distinction, and Quechua, a language spoken in the Peruvian Andes, is claimed to lack a noun/adjective distinction. These claims are still controversial, but they give us all the more reason to drop the simplistic assumption that the words of an alien language should belong to the same grammatical types that words of human languages do.
We can also imagine languages all of whose words are of the same type, and thus cannot be categorised into distinct grammatical types. Philosophers have been toying with such languages for some time. In his Tractatus Logico-Philosophicus (1921), Ludwig Wittgenstein proposed a ‘logically perfect’ language whose elementary sentences contain only simple signs, analogous to names like Xi Jinping, Colosseum or Bogota that refer to individuals, objects and places. Wittgenstein thought that the sentences of the language are true when the configuration of names that they contain correctly ‘pictures’ or represents the world.
But the world is more than a mere collection of objects. The objects have properties, relate to each other in different ways, and are themselves arranged in complex patterns in space and time. For example, the Eiffel Tower is tall, Socrates was the teacher of Plato, and Trump was seen sitting between Zelenskyy and Vance. How can a mere configuration of names represent such facts, which include patterns, properties or relations? Wittgenstein himself thought that the world – including properties and relations – was itself an enormous collection of objects arranged in different configurations. But even if we reject Wittgenstein’s quixotic view about what the world is like, we can imagine that the logical language can represent facts using, again, sequences of names. Alternatively, the logical language could represent them using different distances between names, or different spatial arrangements of names in a sentence. However one decides, such ‘Wittgenstein languages’ are plausible candidate languages whose words belong to a single grammatical type: a noun. An alien language could behave similarly.
But for a language to exhibit true alienness at the structural level, it would need to have words belonging to entirely new grammatical types that no familiar language has. For instance, it may not have words and sentences, and, if it does, they may belong to grammatical types not found in any human language.
Non-linguistic systems of representation may hint at what these alien grammatical types could be like. For instance, an alien language may be less like English, Swahili or Cantonese, but more like maps. It is difficult to identify words or sentences on a map. One may think that the icon for a church is a sign that corresponds to a noun phrase, such as ‘the church’. However, there are so many differences between the properties of map elements and linguistic elements that they cannot be considered the same. For example, map elements ‘update automatically’: changing the location of the church icon on a map automatically updates its distance from all other elements on the map. But there is no corresponding phenomenon in natural language. Why couldn’t an alien civilisation have a ‘map-like’ language whose words belong to grammatical types that no human language has?
Languages that differ at the level of sign or structure will appear quite alien. But they may still be translatable into a human language like English. Such alien languages may have elements that refer to the same objects, or convey the same meanings that some English words and sentences do. The sentence Socrates Plato Aristotle in a Wittgenstein language may have the same meaning as the English sentence ‘Plato is the teacher of Aristotle, and Socrates of Plato’ (if, say, the space between names represents the relation ‘is the teacher of’). An icon of a church in the middle of a green patch on a map may translate into English as ‘The church is located in a park.’ It may be possible to match elements of two languages that differ only at the first and second level as translations of one another.
But languages that are different from human languages at the third, semantic level will raise problems concerning translatability. One problem could be that an expression from one language has a meaning that no single expression in another language has. To some extent even different human languages exhibit untranslatability of this kind. A human language may have a noun or verb with a meaning that no noun or verb of another human language does. English speakers would struggle to find a direct translation for the German Fernweh – a melancholic ache to be in faraway places – just as I feel at a loss in my native language of Hindi when seeking a word for serendipity.
We can also imagine alien languages that raise this sort of problem. Some elements of an alien language may be untranslatable into human languages because they refer to an alien sentiment or some astronomical object that the aliens have discovered but we have not. However, this problem is, in principle, solvable. The object or phenomenon described by a noun or verb from one language can be described using a collection of words (if not a single word) from another language. And if we have not yet discovered the object or phenomenon described by some elements of an alien language, we may make further discoveries and expand our knowledge of the world, and coin new words or phrases that describe them. These new words in our language will then serve as translations of the alien elements.
But a second, radical problem of untranslatability would arise when one or more expressions of a language have a kind of meaning that no expression of another language does. As humans with a certain set of evolved cognitive abilities, we perceive the world as structured in a certain way. For instance, we perceive it as containing objects, actions, properties and processes. The kind of meanings that the elements of our languages have reflects our way of structuring the world. For instance, proper nouns have objects as their meaning, verbs refer to actions, and whole sentences represent facts.
But what if an alien species, which has evolved differently, perceives the world totally differently? The language of this species would reflect its own way of classifying the world with categories that we do not have the cognitive capacities to grasp. Unless we know what element of the world a fragment of an alien language corresponds to, coining a new word in our language would not help. The elements of such an alien language are radically untranslatable, not because we cannot know what they mean, but because we would not even know what kind of meaning they have. In his book Alien Structure (2024), the philosopher Matti Eklund argues that it is indeed possible for such radically untranslatable languages to exist, and does so even without appealing to differences in cognitive apparatus.
But not all hope is lost. Some fragments of alien languages that differ from human languages at the third level of semantics may still be translatable. For example, describing the world, specifying when a description is true or false, or issuing commands, are all behaviours that any sophisticated society of intelligent creatures would plausibly need to engage in. Even if their language is very different from ours, the aliens may need concepts like ‘truth’, ‘falsity’, ‘description’, ‘question’, etc.
If so, these concepts can serve as points of contact between mutually untranslatable human and alien languages, and can be used to match up elements that serve the same purpose. If we identify a set of elements in an alien language that serve as descriptions or commands, we can match them up with our indicative or imperative sentences. And if we can figure out when the aliens take a description to be ‘true’ or ‘false’, we may even be able to narrow it down to a set of sentences that correspond to the alien description (even when we do not know the exact meaning of the words in the description).
We may also try to find cognate concepts in our language that most closely match up with alien concepts. An interesting analogy here is an analysis of the concept of ‘mass’ proposed by the physicist and philosopher Thomas Kuhn. He argued that Albert Einstein changed the concept of ‘mass’ so much from Newtonian mass that the two are incommensurable and untranslatable (the amount of Newtonian mass in the Universe is constant, but Einsteinian mass is convertible with energy). However, the fact that we can understand both Einsteinian and Newtonian physics should give us hope of being able to understand the concepts expressed by an alien language, especially if the differences are not too radical and too pervasive.
In principle, differences at one level do not necessitate differences at another. Recall that Klingon employs different signs and a somewhat different structure, but mirrors human languages at the other levels. But, in practice, we should expect a real alien language to differ at multiple levels. Aliens who perceive the world to be structured differently would also have a language with grammatical and syntactic elements different from familiar languages. So, it is quite unlikely that aliens have a language that is different only at the fourth, pragmatic level, while being just like English at the other levels.
But, for argument’s sake, imagine creatures whose language is just like English at the first three levels, but who have settled on a different set of metaphors, metonyms, or have communicative norms that are different from ours. These creatures can ‘eat a horse’ if they are happy, and one is ‘a chicken’ if they are prone to run in a zigzag manner. Further, human languages have implicatures, which communicate things that go beyond what is literally expressed (like the sentence we saw earlier: I’m here with my partner, which serves as a polite refusal of a dance invitation). The norms governing alien communication may be different, resulting in implicatures of a different kind. Unless we know the relevant conventions, we will not understand fragments of this language.
But if pragmatic differences come on top of semantic differences, things can get a lot more interesting. An alien species may have alien semantic conceptions corresponding to concepts like ‘communication’, ‘metaphors’ and ‘norms’, which play a central role in our understanding of pragmatics. If so, alien pragmatics can be very different, and much weirder.
A noteworthy example here is the language of the Tamarians, an alien species from Star Trek: The Next Generation. In the episode ‘Darmok’ (1991), Captain Picard’s Enterprise makes contact with a Tamarian spaceship, but the crew struggle to communicate. Starfleet’s ‘universal translator’ can provide only literal translations of Tamarian expressions. One of the aliens keeps repeating the sentence Darmok, on the ocean, but Picard cannot understand its meaning. It turns out the Tamarians do not consider the literal meanings of their words important – in fact they do not even attend to them. Instead, they attend to allegorical associations between expressions and concepts rooted in the myths and history of their culture. Thus, the expression Darmok, on the ocean communicates something in the vicinity of loneliness on a quest, by reminding them of a story with a certain Darmok and their ordeals at sea. Someone unfamiliar with the story, or the fact that the story has this meaning – instead of, say, drowning – would not be able to communicate with the Tamarians.
It seems highly unlikely to me, if not outright impossible, that an alien species can ignore the literal meanings of their expressions, and focus only on the allegorically conveyed concepts. The Tamarians do use the literal meanings of Darmok and ocean to figure out which story is referenced. So, it would seem terribly wasteful not to use the literal meanings to communicate.
But the example does suggest an interesting possibility: that an alien language may lack a level that human languages have. The language of an alien species who can communicate telepathically, for example, will not have the first level of signs. Aliens who have the cognitive capacity to remember an infinite number of signs (eg, a name) – each standing for a distinct meaning – would have no use for the second level of structure. Human languages, by contrast, have a structure because, despite our limited memory and cognitive abilities, it helps us create infinitely many sentences using finite elements.
An extraterrestrial language that lacks the third, semantic level would be particularly alien: one whose elements are not ‘about’ anything. Its words do not refer to objects nor are its sentences true or false descriptions of the world. Creatures that use such a language would be causal mechanisms that hook up with the world by way of environmental inputs – eg, smell, temperature or radiation – to produce resultant outputs. ‘Communication’ between such creatures may be a series of causal transactions: a stimulus from one causing a response in another, much like how hormones work in our bodies.
This should remind us of the interaction between machines, which is also causal in nature: the computer on which you are reading this article interacted causally with the Aeon server to bring this article to your screen. But does this interaction amount to linguistic communication? Is a language that lacks semantics even a language? It is difficult to give a simple answer to this. But these are the sort of questions that we should expect to encounter when meditating on alien modes of communication. We expect an encounter with aliens to challenge our conception of what it is to have a body, what it is to be conscious, what it is to be a living creature, and what it means to behave intelligently. So, in thinking about alien modes of communication, shouldn’t we be exploring possible languages that are very different from our own – so much so that they challenge our very conception of what a language is?
Alien modes of communication may also have additional levels, ones that we cannot yet foresee. Perhaps there is an affective level that can encode how exactly one feels – say, the nature and intensity of one’s pain. Or a phenomenal level that can encode qualitative experiences, such as an apple’s redness. Growing out of our anthropocentric bubble to explore how aliens might communicate will equip us better for a potential first contact scenario. But it will also make us more reflective about, and potentially improve, one of the greatest assets that our species possesses: language.