Machine writing is closer to literature’s history than you know

Since its inception in 2015, the research laboratory OpenAI – an Elon Musk-backed initiative that seeks to build human-friendly artificial intelligence – has developed a series of powerful ‘language models’, the latest being GPT-3 (third-generation Generative Pre-trained Transformer). A language model is a computer program that simulates human language. Like other simulations, it hovers between the reductive (reality is messier and more unpredictable) and the visionary (to model is to create a parallel world that can sometimes make accurate predictions about the real world). Such language models lie behind the predictive suggestions for emails and text messages. Gmail’s Smart Compose can complete ‘I hope this …’ with ‘… email finds you well’. Instances of automated journalism (sports news and financial reports, for example) are on the rise, while explanations of the benefits from insurance companies and marketing copy likewise rely on machine-writing technology. We can imagine a near future where machines play an even larger part in highly conventional kinds of writing, but also a more creative role in imaginative genres (novels, poems, plays), even computer code itself.

Today’s language models are given enormous amounts of existing writing to digest and learn from. GPT-3 was trained on about 500 billion words. These models use AI to learn what words tend to follow a given set of words, and, along the way, pick up information about meaning and syntax. They estimate likelihoods (the probability of a word appearing after a prior passage of words) for every word in their vocabulary, which is why they’re sometimes also called probabilistic language models. Given the prompt ‘The soup was …’ the language model knows that ‘… delicious’, ‘… brought out’ or ‘… the best thing on the menu’ are all more likely to ensue than ‘… moody’ or ‘… a blouse’.
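To make the idea of 'likelihoods for every word' concrete, here is a minimal sketch of a far cruder relative of GPT-3: a bigram model that counts which word follows which. The toy corpus and the function names are invented for illustration. GPT-3 itself is a large neural network that conditions on a long preceding passage, not a table of counts keyed on a single previous word.

```python
from collections import Counter, defaultdict

# Toy corpus, invented for illustration; real models train on billions of words.
CORPUS = [
    "the soup was delicious",
    "the soup was delicious",
    "the soup was brought out",
    "the soup was cold",
    "the blouse was moody",
]


def train_bigram_counts(sentences):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def next_word_probabilities(counts, prev):
    """Turn raw follow-counts into relative frequencies for one context word."""
    followers = counts[prev]
    total = sum(followers.values())
    return {word: n / total for word, n in followers.items()} if total else {}


counts = train_bigram_counts(CORPUS)
probs = next_word_probabilities(counts, "was")

# Print the candidate continuations of 'was', most likely first.
for word, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"was -> {word}: {p:.2f}")
```

In this tiny corpus 'delicious' follows 'was' more often than 'moody' does, so it receives the higher estimate. That is the same principle, at a vastly smaller scale, as the soup example above.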

Behind recent speculation, such as Stephen Marche’s piece in The New Yorker, that student essays will never be the same after GPT-3, or that the program will once and for all cure writer’s block, is a humbler fact. The main technical innovation of GPT-3 isn’t that it’s a great writer: it is that the model is adaptable across a range of specific tasks. OpenAI published the results of GPT-3’s performance on a battery of tests to see how it fared against other language models customised for tasks, such as correcting grammar, answering trivia questions or translation. As an all-purpose model, GPT-3 does as well as, or better than, other models, especially when prompted by a few instructive examples. If ‘sea otter’ is ‘loutre de mer’ and ‘peppermint’ is ‘menthe poivrée’ then ‘cheese’ in French is …? Being versatile helps GPT-3 produce writing that looks plausible. It takes in its stride even made-up words it hasn’t seen before. When told that ‘to screeg’ something is ‘to swing a sword at it’ and asked to use that word in a sentence, the computer composes a respectable answer: ‘We screeghed at each other for several minutes and then we went outside and ate ice-cream.’
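Prompting 'by a few instructive examples' amounts to little more than arranging text. The sketch below assembles such a prompt as a plain string. The function name and the `=>` separator are assumptions made for illustration, not OpenAI's required format, and no model is actually called here.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble an instruction, worked examples and a final unanswered query."""
    lines = [instruction]
    for source, target in examples:
        lines.append(f"{source} => {target}")
    # The last line is left open for the model to complete.
    lines.append(f"{query} =>")
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Translate English to French:",
    [("sea otter", "loutre de mer"), ("peppermint", "menthe poivrée")],
    "cheese",
)
print(prompt)
```

A language model given this text would be expected to continue the final line, since the most probable continuation of the pattern is the French word for 'cheese'.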

Reactions to GPT-3 have been circulating in tech circles and the mainstream media alike. There are concerns about the datasets on which the model was trained. Most of the half-trillion words lending GPT-3 its proficiency are internet writing ‘crawled’ and archived by bots, with a relatively small portion coming from books. The result is that machines can output hateful language, replicating the kind of writing from which they’ve learned. Another line of criticism points out that such models have no actual knowledge of the world, which is why a language model, in possession of a great deal of information about word sequence likelihoods, can write illogical sentences. It might suggest that if you’re a man needing to appear in court, but your suit is dirty, you should go in your bathing suit instead. Or, while it knows that cheese in French is fromage, it doesn’t know that cheeses don’t typically melt in the fridge. This sort of knowledge, on which language models are tested, is called ‘common-sense physics’. This sounds oxymoronic, but is appropriate when you consider that the inner workings of these deep learning-based models, though basic, also seem utterly mysterious.

Then there’s the Turing Test. To what extent can machine writing now sound like human writing in general, or mimic a particular author or a kind of writing, as if enlisted to play the party game Ex Libris? Quite well, it turns out. Human raters can’t guess whether some of the computer’s fabricated news articles are the work of a person or a machine much more than 50 per cent of the time – that is, not much better than chance. And what sounds more human than swinging a sword at someone – to ‘screeg’ at them real good – before thinking better of it and making up over ice-cream, with the looming possibility that the swords will be picked up again by sticky hands, and truce suspended?

As you read this, you’re probably thinking: surely human language is more complicated than probabilistic predictions about word order? You’ll wonder at the way great novels are structured, at the vast histories of languages and literature, the words that make us wince. But writing is also inescapably sequential. Zadie Smith once described her writing process like this: ‘I start at the first sentence of a novel and I finish at the last.’ Words follow upon words, and a model that gets word sequences right can get a lot else right about language.

But what about style and, for that matter, literary writing? If so much of what we read and write is predictable, don’t we especially appreciate those moments, whether in conversation or in writing, when phrases are unexpected, surprising and perfectly apropos? In Elif Batuman’s novel The Idiot (2017), her main character Selin observes her fellow native English speakers correcting a native Hungarian speaker who had just mused: ‘If you eat slowly, you can feel the food.’ It’s more idiomatic to say that you can ‘taste’, ‘enjoy’, ‘relish’ or ‘savour’ the food, Selin’s friends point out. But Selin, a budding writer, appreciates the aptness of the improbable phrasing, thinking to herself: ‘I would have never corrected somebody who said “You can feel the food.”’

Language models make our lives easier when we use autocomplete or spellcheck, but their underlying principle can offend our literary sensibilities. Language that is too probable – let’s call it probable language – can’t and shouldn’t be entirely avoided. But too much of it makes for bad style, as we know from the writerly tradition of lamenting clichés. In his essay ‘Politics and the English Language’ (1946), George Orwell complains about writing that sounds like it is ‘gumming together long strips of words which have already been set in order by someone else’. What would Orwell think of computer-generated texts, which gum together words precisely because they’ve been so ordered by others and repeated over and again? More recently, the author Lydia Davis, discussing ‘why we hate clichés so much’, says they ‘don’t reflect your own, very individual person; they are borrowed ideas, in outworn language.’ The list of objections goes on. But then again, if the list goes on, that’s because deriding borrowed ideas is itself an outworn practice. And repetition, habit and convention – even those that are objections to repetition, habit and convention – are exactly what probabilistic approaches to language are impeccable at capturing.

Our clichéd distaste for clichés points us in a promising direction, though. To make sense of the dizzying thought of machine writing, churning out sentences purely on the basis of probabilities, we need to understand language models such as GPT-3 not only as advances in AI and computational linguistics, but from the perspective of the interwoven histories of writing, rhetoric, style and literature too. What do probabilistic language models look like against the backdrop of the history of probable language? And what might this historical perspective suggest to us about what synthetic text means for the future of imaginative writing?

In writing my book The Connected Condition (2019) – which explores how British Romantic poets responded to the beginnings of the information age, to a world newly connected by communication technologies, from standardised paper forms to telegraphy – I drew inspiration from the work of Walter J Ong (1912-2003). A literature scholar and media theorist who was also a Jesuit priest, Ong offers an indispensable framework for understanding the relationship between language and technology in the Western world, giving us a way to begin wrapping our minds around today’s computer-generated writing.

Ong reminds us that we’ve relied on probable language for much of human history. Before the emergence of writing more than 5,000 years ago, a defining feature of oral culture was thinking and speaking in terms of communal, formulaic language. ‘In an oral culture,’ Ong explains in Rhetoric, Romance and Technology (1971), ‘knowledge, once acquired, had to be constantly repeated or it would be lost: fixed, formulaic thought patterns were essential for wisdom and effective administration.’ Without the benefit of being able to jot things down, knowledge was stored and circulated through such oral formulas. Imagine if we had to navigate the world without the benefit of writing, using only informational mnemonics such as ‘Thirty days hath September,/April, June, and November…’ These formulas invariably included clichés (soldiers were ‘brave soldiers’; the oak tree was ‘the sturdy oak’) and familiar poetic epithets, such as ‘whale-road’ for the ocean. Remembering and communicating compositions depended on such stock phrases: the oral epic poet Homer ‘stitched together prefabricated parts’ and was therefore something like an ‘assembly-line worker’ rather than an auteur. Probable language was initially an oral and mnemonic technology for information storage and transmission.

Through successive revolutions in media and technology, including the spread of writing and the 15th-century invention of printing, probable phrases persisted in rhetorical teaching and practice as loci communes or ‘commonplaces’, which can refer to two things. ‘Analytic’ commonplaces were well-trodden topics or headings for discussion. If discussing an individual, analytic commonplaces would include the person’s place of origin, parents, and vices and virtues; if discussing a phenomenon, analytic commonplaces were causes, effects, and similar and opposite phenomena. The second kind of commonplace, the one more relevant here, is the ‘cumulative commonplace’: a prefabricated phrase or passage. These include frequently used phrases such as ‘tried and true’ or ‘sudden change’, and even standardised narrative openings – for example, ‘Once upon a time…’ or ‘It was a dark and stormy night…’

The emergence of writing and print loosened the grip of probable language: each, as a physical medium, allows for the saving of knowledge over time and the spread of knowledge in space, and this liberated human thought and expression from the deep grooves of conventional language. Yet, paradoxically, these frequently used ‘residues’ of oral culture were not extinguished by writing and print, but rather gathered into collections and compendiums that were used to teach students the rhetorical curriculum from antiquity through the Renaissance, and until the decline of that kind of schooling in the 19th century.

In Ong’s sweeping narrative, it was only with the emergence of an artistic movement, beginning around the late 18th century, that probable language came to be regarded less as the building blocks of composition and more as the too-familiar, the outworn, the boring – basically, closer to how we think about the word ‘commonplace’ today. That artistic movement was Romanticism. ‘Romantic subjectivism exists,’ Ong argues, ‘in contrast with adherence to formulary expression.’ If we recoil, as Orwell does, from language that has stitched together ‘long strips of words which have already been set in order by someone else’, it’s because we moderns are also in this regard Romantics – or, at least, we have inherited the Romantic prizing of artistic originality.

A revealing example of the Romantic rejection of probable language occurs in William Wordsworth’s ‘Preface’ to Lyrical Ballads (1800). The ‘Preface’ was an artistic manifesto of sorts, appended to the volume of poems he wrote in collaboration with Samuel Taylor Coleridge. In the ‘Preface’, Wordsworth quotes in full an elegiac sonnet (‘On the Death of Mr Richard West’) by the 18th-century poet Thomas Gray. In a somewhat violent editorial gesture, Wordsworth pronounces that only the five lines he has italicised are ‘of any value’:

In vain to me the smiling mornings shine,
And reddening Phoebus lifts his golden fire:
The birds in vain their amorous descant join,
Or cheerful fields resume their green attire;
These ears alas! for other notes repine;
*A different object do these eyes require;*
*My lonely anguish melts no heart but mine;*
*And in my breast the imperfect joys expire;*
Yet Morning smiles the busy race to cheer,
And new-born pleasure brings to happier men;
The fields to all their wonted tribute bear;
To warm their little loves the birds complain.
*I fruitless mourn to him that cannot hear*
*And weep the more because I weep in vain.*

As Ong points out, Wordsworth and Coleridge dismissed those parts of the sonnet that betrayed residues of cumulative commonplaces, which were still popular in 18th-century poetry. In fact, writers then believed that such artistic clichés, called ‘poetic diction’ – for example, the widespread practice of calling birds ‘feathered friends’ – were a defining stylistic feature of literary language. But for the Romantics, such phrases – and similar ones found in Gray’s sonnet, such as ‘smiling mornings’, ‘reddening Phoebus’ (another way to describe the morning) and ‘cheerful fields’ – were hand-me-downs, robotically recycled by poets merely because ‘pre-established codes of decision’ deemed them the proper language of poetry.

These formulas had taken poetry too far from what Wordsworth, in particular, believed to be the language of everyday conversation. Literature, he believed, could be expressed in commonplace language (in the sense of the ordinary spoken language and written prose), rather than through cumulative commonplaces. Wordsworth and Coleridge insisted that only the relatively plain lines expressing the poet’s grief, and whose phrases aren’t hackneyed, were the stuff of real poetry: the expression of an individual artist. In Ong’s words, the outlook of the Romantic poet was that ‘the better he or she was, the less predictable was anything and everything in the poem.’ The kind of originality we associate with imaginative writing is rooted in Romanticism, which was reacting against probable language.

Of course, the Romantics weren’t as free from convention as they declared. Like all artists, they negotiated between convention and innovation. Still, the history of probable language leaves us with much knottier questions about today’s language models, such as GPT-3, which can seem as if they have some capacity to think and to author. If we have been thinking, uttering and writing probable language for much of our history, probabilistic language models mimic earliest humanity: the way computers can now write resembles how humans first spoke.

To make matters more interesting, our writing today has followed Romanticism in shaking off the deep tradition of cumulative commonplaces. The modern, plainer style – variations of which we all write – is, in Ong’s terms, ‘the style of highly evolved and literate folk who have divested their language of some of the most typical and ornate features it had inherited from more primitive times.’ The stylistic ideals of clarity and brevity exert far greater pressure on our writing than the need to adhere to traditional commonplaces. Even so, one of the stranger effects of contemporary language models is that they reveal to us that our plain style is itself full of highly probable phrases. They are perhaps less conspicuous than phrases such as ‘feathered friends’ or ‘smiling mornings’, but they’re no less likely – for example, ‘so far, so good’ or ‘unprecedented times’. In other words, our post-commonplace writing is actually full of commonplaces. And beneath our own stylistic commonplaces are strata upon strata of earlier commonplaces by which we organised and navigated the world.

Depending on what one believes about art and politics, one can conclude, anxiously, that technologies such as machine writing threaten human originality (an idea stemming from the Romantics) or that machine writing is helping to topple that very focus on the individual, creative author. But Ong’s account actually suggests a different conclusion too, insofar as Romanticism was not antithetical to writing technologies but an outgrowth of them. These older technologies of writing – from handwriting to print – freed up the human mind from the burden of information storage so that we could be more creative. Likewise, today’s text technologies, which can generate serviceable writing, need not kill off the idea of human originality so much as reinvigorate it – a new Romanticism. One that can appropriate, manipulate, play with, make fun of, even reject whatever machine writing ends up being. And if human authors seem to have the last word, a bigger and better language model will inevitably come along and consume all that new writing. Then writers will innovate again, and on and on.

What does it mean to compose with language models? The partnership strains language itself. And yet today’s automated writing pointedly recalls a variety of modernist and later artistic experiments, which further complicate ideas about individual human authorship and art. Consider the imposition of constraints made by the French literary group Oulipo – for example, Georges Perec’s novel La Disparition (1969), which excludes the letter ‘e’, or Jacques Jouet’s ‘metro poems’, hurriedly composed between stops – or its aleatory methods, as in Raymond Queneau’s Cent mille milliards de poèmes (1961), where lines from 10 sonnets can be recombined to make 100,000 billion sonnets. The Surrealists’ collaborative compositional algorithm, known as Exquisite Corpse, also comes to mind: each player sequentially contributes a portion of an image or text without seeing what came before. The human prompt followed by GPT-3’s response is very similar.
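Queneau’s figure is simple arithmetic: a sonnet has 14 lines, and each line position can be filled from any of the 10 source sonnets, giving 10 to the power of 14 combinations. The short sketch below checks that number and picks one combination at random. The variable and function names are illustrative, and the book’s actual lines are not reproduced.

```python
import random

SONNETS = 10           # source sonnets in Queneau's book
LINES_PER_SONNET = 14  # a sonnet has 14 lines

# Each of the 14 positions can be filled from any of the 10 sonnets.
total_poems = SONNETS ** LINES_PER_SONNET
print(f"{total_poems:,}")  # 100,000,000,000,000


def pick_poem(rng):
    """Choose, for each line position, which source sonnet supplies the line."""
    return [rng.randrange(SONNETS) for _ in range(LINES_PER_SONNET)]


# One recombined sonnet, expressed as 14 indices into the 10 source sonnets.
choice = pick_poem(random.Random(0))
print(choice)
```

The result, 100,000,000,000,000, is the ‘cent mille milliards’ of the title: 100,000 billion.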

The introduction to Georges Perec’s novel *La Disparition* (1969), translated into English as *A Void* (1995) by Gilbert Adair. *Courtesy the Internet Archive*

Some commentators want to naturalise the partnership, arguing that language models are no more than utilitarian implements. After all, grammar and spelling checkers have been using machine learning since the 2000s, and computer-generated text is an extension of such affordances. Besides, polishing the rough output of synthetic text is like working with a student in need of a human editor. K Allado-McDowell’s recent work, Pharmako-AI (2021), is a philosophical dialogue between the author and GPT-3’s synthetic text, which involves Allado-McDowell ‘pruning the [model’s] output back in order to carve a path through language’ – presumably, this means a lot of editing in the form of deleting computer-generated text until a suitable or coherent sequence emerges.

Others dwell on the strangeness of this human-machine dialogue: the novelist Elvia Wilk suggested in The Atlantic recently that collaborating with a language model is like interacting with a nonhuman consciousness – like that of dolphins – with which we try, haltingly, to bridge the communicative gap. On this view, machine writing is a non-anthropocentric form of creativity (although it’s worth remembering that language models, unlike dolphins, are human-made). I suspect that the nature of the collaboration isn’t easy to pinpoint because the collaborator itself is difficult to describe. Whatever it is, it is a human-nonhuman entanglement at every level: to write with a language model entails interacting with the computational-linguistic mindset that goes into language modelling; with human-made but nonhuman bots that are harvesting internet writing; with the anonymised mass of human and nonhuman writing flooding the internet, on which models are trained; and so on.

Playing around with machine-generated writing, and thinking about its artistic implications, I’m reminded of People’s Choice, a project by the Russian-born, New York-based visual artists Vitaly Komar and Alexander Melamid. Starting in 1993, Komar and Melamid, in partnership with the Nation Institute (now Type Media Center), hired a public opinion research firm that conducted telephone surveys with 1,001 Americans about their aesthetic preferences and habits. There were nearly 100 questions. What is the interviewee’s favourite colour? How clothed are the figures in the works of art that you like (nude; partially clothed; fully clothed; depends; not sure)? Indoor or outdoor scenes? What is a good size for a painting? What is your opinion of Picasso, Monet or O’Keeffe? How many times a year do you visit art museums? After analysing these data with computers, Komar and Melamid hand-painted America’s Most Wanted Painting and America’s Most Unwanted Painting, which they revealed in 1994. In subsequent years, they conducted similar ‘surveys’ in 13 more countries, including China, Denmark, Finland, Germany, Iceland and Kenya, each time painting a pair of most wanted and most unwanted works based on the survey responses collected.

As for the most wanted works themselves, they’re bland, kitschy and – with the exception of Holland’s – strikingly similar to one another. They’re also mesmerising, a seeming distillation of what people most want from art. It quickly becomes clear that the artists take interpretive liberty with the most preferred aesthetic characteristics in order to guarantee that all their ‘most wanted’ paintings resemble one another – that they collectively reveal the ‘sameness of [the] majority’. As Melamid observed: ‘in every country the favourite colour is blue … everywhere the people want outdoor scenes, with wild animals, water, trees, and some people.’

In an interview with the US journalist JoAnn Wypijewski in Painting by Numbers (1997), Komar and Melamid reflect on their creations, derived from massive amounts of data about people’s habits and preferences. The art might be ugly – deliberately, mischievously so – but it reminds us of the grand questions at stake. What are the implications of the most frequent, familiar and popular for taste, creativity, even democracy? What is artistic freedom, and to whom or what is art answerable? People’s Choice underscores that we collectively long for the banal and that there is an elusive ratio between the familiar and the fresh. It reminds us that pursuing the utopian ideal of a true ‘collaboration with the people, a furthering of [the] concept of collective creativity’ through the most probable is also inevitably a ‘collaboration with [a] new dictator – Majority.’

In the age of machine writing, a new Romanticism might caricature that dictator – ‘Majority’ – as Komar and Melamid did, stringing together our most commonplace phrases. Or, it might – as Ong suggested of Wordsworth and Coleridge – use the commonplace as a springboard for something novel. Or maybe something else altogether will emerge. But at the heart of the matter is the question put by Komar: ‘Do you expect to see the unexpected when you look at art?’

9 September 2021
