French chefs take part in a videoconference with President Emmanuel Macron at the Élysée Palace in Paris, 24 April 2020. Photo by Ludovic Marin/Reuters


Zoom and gloom

Sitting in a videoconference is a uniformly crap experience. Instead of corroding our humanity, let’s design tools to enhance it

by Robert O’Toole

Technology exists to expand and sustain our capabilities. Therefore, doing technology well contributes to our hopes for leading an ethically good life: developing the right capabilities in the right ways – and using them for good ends. Videoconferencing could make a significant contribution to this. However, the essential capability I’m concerned about is not videoconferencing in itself, but rather the humanisation of technologies for everyone’s benefit. This is, I argue, one of the most pressing issues we have to deal with, as technology becomes ever more entangled in our lives. But to do so successfully, we need to think more deeply and creatively, using techniques from the interdisciplinary field we call design research – applying a blend of psychology, philosophy, anthropology, engineering and aesthetics.

In this essay, I will explore how the experience of videoconferencing points, in one way, towards the limits of human adaptability and, in the other, to a liberating human capability that we must collectively cultivate and sustain – as an innovative extension to the ethical framework described by the philosopher Martha Nussbaum in Creating Capabilities (2011). As the designer Jon Kolko says in Well-Designed (2014), we should adopt an ‘optimistic stance’ and ‘seek to explore the situation space, to see multiple potentials for improvement, and to always consider what might be’ through systematic empathy and ‘integrative thinking’.

How important is this? The impact of bad design is massive – multiplied across the billions of people striving to flourish online, not only when we need the technology as a stop-gap measure to alleviate disruptions to everyday life, but also when we want to use it for more socially progressive purposes. Videoconferencing, and the telepresence experience it can enable, promises to facilitate the formation of communities and partnerships that escape the boundaries of time and distance. It can even help us overcome social and economic barriers. School teachers, for example, can struggle to achieve meaningful dialogue with busy working parents. Governments are breezily assuming this is made easy with technology. Teachers are discovering the truth. But it could be better.

In many ways, the impact of the technology could be revolutionary. It could help us create and sustain capabilities of the kind identified by Nussbaum: a case can easily be made for its positive contribution to any of her list of 10. For the more seemingly foundational capabilities, including ‘life’ and ‘bodily health’, the contribution is clear: for example, in enabling remote access to medical services. But videoconferencing could also expand the development of ‘senses, imagination and thought’, ‘practical reason’ and ‘play’ through widening access to cultural and educational opportunities. In countries where the internet isn’t tightly policed by the state, it’s having a huge impact on the capability for ‘affiliation’ between people who wouldn’t be able to collaborate otherwise.

But what of the expression and development of ‘emotions’, Nussbaum’s fifth capability? In education (my field), there is a need to include an emotional dimension to what we do, to communicate to students (and academics) that emotions are essential ingredients, not to be ashamed about but to be developed. My colleagues at the Institute for Advanced Teaching and Learning at the University of Warwick have worked hard to create pedagogical practices aligned with this in traditional teaching spaces. But we find it harder to make this work online at a distance. Videoconferencing could be great. If only we could get the thing to work properly. But for now, it’s quite often just painful.

How do we improve things? How do we re-humanise videoconferencing and other essential technologies? The tech industry’s answer is to add more features. But that’s not sufficient, and most often makes things worse. To start with, we need theories about being human and, more precisely, being human with technology – theories that can guide us in designing better technologies and their use. This is nothing new. The philosopher Martin Heidegger redefined being human as ‘being there’, amid the cultural, material and technical complex through which we make ourselves and are made by the world. He turned philosophy’s attention towards mundane tools, materials and practices: hammers, nails, wood. Basic tech. This is the ‘equipmental totality’ through which our projects are materialised, and which brings meaning to our lives.

There is, however, good reason to distance ourselves from Heidegger’s instrumentality, which can slide into viewing every aspect of the world, including people and animals, as mere equipment. Heidegger is also criticised for presenting a too-perfect picture of reality, what the archaeologist Ian Hodder in Entangled (2012) calls ‘a somewhat romantic view of an integrated being-in-the-world characteristic of rural Volk prior to modernisation’. Equipment in this world fits together, gives meaning in the hands of human agents, and even breaks down in straightforward ways. What would Heidegger have made of videoconferencing? It can trigger an existential crisis in people who are used to this kind of technology. What would it do to Heidegger, I wonder?

Hodder counters Heidegger with the idea of being human as being ‘entangled’ with many diverse things, each of which has a life of its own. The anthropologist Tim Ingold has argued that we can understand humans only through the complexity of their entanglements. Being human is a mass of entanglements. Thinking, as the philosopher Andy Clark has argued in Being There (1996), doesn’t take place in a perfect computer-like black-box processor within the brain, but rather happens through objects in the world. Cognition, especially creativity, is extended across and dependent upon a web of material things. This world of things doesn’t so much fit together into a totality. It temporarily aligns on journeys through time and space, snagging and slipping here and there. It’s not a perfect vehicle, but it works for us. And we can enjoy it.

We can even get pleasure and meaning from being entangled in things that don’t quite do what we expect, but that we can make sense of. Such things surprise and delight us in unpredictable ways. But we can still get a sense of direction, achievement and progress, even when things break down. Our entanglement with things holds us within a rich and flexible fabric, rewoven as we go. Out of this embeddedness, we gain an ability to cope, to adapt, to keep things flowing and create, because it feels real and worthwhile. The feeling of being entangled in real things gives us our sense of relationality, tension and agency – key human experiences. Videoconferencing needs to enable those experiences, not block them. How can we fit technologies to this messy, complex, entangled humanity? Designing for entanglements is the answer. We need to understand entanglement and technology. Let’s explore this further through the case of videoconferencing.

I’ll begin with a pessimistic view, delving into early 21st-century despair. Apologies if this seems too depressing. We’ll get over it, and go on to explore some much more positive experiences that provide useful insights. Here’s my own personal account of this misery, and what we can learn from it.

It’s Monday afternoon, and raining a dark, grey drizzle outside. I’m sitting as comfortably as possible in my ad hoc home office and, at the same time, feeling deeply uncomfortable in a virtual ‘room’ full of academics. Or at least I assume they’re there. I might just be a brain-in-a-vat fed tiny fragments of sense data, and hallucinating the Other. Videoconferencing often feels like that.

My job here is to convince the audience that teaching online is OK. That they can do it. That students can benefit from it. And that they might even enjoy how it liberates them from the timetable and the daily commute. Online distributed education (and research) is potentially a great thing. It can radically widen and make more equal access to learning. It allows us to bring people and resources into our classrooms, physical or digital, from anywhere in the world. But, understandably, these guys are not entirely happy with it. Academics are well attuned to specific ways of being the successful people they are. On the one hand, they live in a slow world of longer-form text and publishing processes. On the other hand, they are adapted to the live-ness and flexibility they have in their natural habitat, the physical campus. Their lives are usually full of regularity, structure and, we might say, ritual, constructed to carefully manage the diversity, complexity and inevitable conflicts of life at the edge – academia, a balance of order and chaos.

The anthropologists Tony Becher and Paul Trowler describe it well in their classic study Academic Tribes and Territories (2001). As with any tribal system, order and chaos are kept in balance through familiar behaviours, some of which are ritualistic rather than functional. In Anthropology and/as Education (2018), Ingold richly describes the experience of being in academia – as researchers, teachers and students – as moving together intimately through a complex environment in which we continually draw attention to things for each other, so as to gradually enrich our shared understanding and capabilities. He calls this being in correspondence, as opposed to a linear transmission of information. Such relationships are established and smoothly maintained in the manner of people walking together, entering into spaces, marking transitions of focus, running through patterns of activity, noticing, gesturing towards and away from, in a way that gracefully manages the complexity of people and their differences. This comes so naturally to us in our everyday lives that we often don’t notice it happening, and can find it hard to describe – all of the tacit details without which academic life, and learning, would be impossible.

This virtual world is new and different, and not supportive of the kinds of correspondence that we’re used to. We can use emojis and mechanisms such as the ‘raise hand’ button, but it’s just not fluid enough to achieve meaningful entanglement across the void. I encourage my students to use visual gestures on camera. Some school students, I know, have developed a sign language so that they can indicate their emotional states without interrupting the flow of a lesson. That’s good, but takes time and will to adopt. Without that, I guess that my collaborators are probably having as bad an experience of videoconferencing as I am. We get a glimmer of light when a ripple of laughter spreads, falteringly, in response to a video feed somehow appearing upside-down. That’s the kind of thing we need to cultivate (laughter, not upside-down video). But I suspect that the participants are too nervous to let go properly. And the laughter isn’t accompanied by the visual cues that we use to feel confident about its meaning and appropriateness. I make a mental note: practise ice-breaking and ways to reduce the formality.

Perhaps we need a new way of laughing together? I see my children doing that online, when playing games with remote friends. It should be possible here. We are, however, in an environment that blocks our well-established human ways of interacting and knowing, of acting in a way that creates enough togetherness and regularity so as to make difference and newness possible. Gesturing to draw attention makes no sense when we are without fixed points of reference. I’ve noticed that children interact well online when playing 3D multiplayer games together. That injects points of reference, relational mobility and a little gestural free play into their interactions. We need that. For now, we can add a little relationality using shared digital artefacts or using videoconference views that place participants into stable locations in virtual spaces (eg, Together mode). But it’s limited, and we need to adapt to it.

Is this as alien as living on the Moon, perhaps beyond the limits of human adaptability? Compared with this, life on the lunar surface would be bliss – calm and unpressured. Teaching through the medium of a small glowing rectangle has neither of those characteristics. Life is being squeezed through the low-bandwidth channel of webcam frames, text-chat streams, emojis and scheduled meeting time-slots. The simplifying effect of that compression acts only to increase anxiety, to make us all feel under pressure. This is not calm.

Calm matters in technology. It should be one of the main design goals. In the 1980s, the researchers Mark Weiser and John Seely Brown at Xerox PARC invented the modern world or, to be precise, the technology-determined historical era in which we’re living today, the ‘era of ubiquitous computing’ (UC). Wifi, touchscreens, tablet computers, wearables, apps, communications any time and any place, the gadgets and systems that we now take for granted in our daily lives, are part of the ceaseless expansion of UC into humanity everywhere and all the time. Ubiquitous.

But the experience of videoconferencing that we have today wasn’t what Weiser and Seely Brown imagined. Along with their vision for the UC era (which they dated, optimistically, between 2005 and 2020), they proposed a qualitative measure of its success: calm technology – the extent to which a technology works so well, fits so perfectly with human needs and habits, that it disappears from view. We stop noticing it, and are able to focus on achieving our goals. In 2007, Apple released the first iPhone, designed to be both calm and ubiquitous. More than a product, right from the start it was a sophisticated platform with a language and culture of its own, carefully designed to aid an evolution in being human. Packed with features and potential, but no printed manual, it was transformative. Years earlier, Douglas Adams in The Salmon of Doubt (posthumous, 2002) identified a key design goal necessary for technology to become ubiquitous:

We are stuck with technology when what we really want is stuff that works. How do you recognise something that is still technology? A good clue is if it comes with a manual.

And yet today, well into the era of ubiquitous computing, the experience of videoconferencing can be utterly horrible, regardless of the software used. It is definitely still just ‘technology’, as defined by Adams. Personally, I would read the fucking manual (or, RTFM, as they say in IT support) but I’ve got a Mongolian professor on the line, a class of Australian students, and a team of researchers spread across England, all impatiently waiting for me to coordinate our meeting, with the right people speaking at the right time, working together on shared screens and simultaneously editing files in multiple time zones. All of that despite continual network issues and the crazy cultural complexity of it all. Entangled, but not in a way in which we can follow the threads and make progress together.

To ease the craziness, I plan sessions carefully. Provide files, agendas and instructions in advance. Schedule plenty of pauses. But that all requires cooperation to work. And that’s not easy to achieve in these conditions. It gets easier each time we return for more. The trick is to maintain regularity in the setup of the virtual environment, and eventually it will feel a bit like entering a familiar space. But we’re not there yet.

So am I stressed? Me? Discombobulation is a good word for it. My head hurts and my eyes are swirling. I think I’ll just go for a ride on my motorcycle instead. That might sound like a ‘cop out’ – reckless escapism. But there’s more to it than that. In an interview in 2014, Seely Brown described how he and Weiser were inspired to develop the idea of calm technology by reflecting on riding their motorcycles (modern BMW models with electronic rider aids). Since then, motorcycles have incorporated ever more sophisticated technologies for perfect engine management, braking, suspension, lighting, rider comfort and navigation. Almost every aspect of the experience is touched by hi-tech wizardry. But it’s not obtrusive. It can’t ever be obtrusive. Not only would obtrusive tech interrupt the experience of riding – the ‘flow state’ as the psychologist Mihaly Csikszentmihalyi calls it – but information overload and unexpected bike behaviours can kill.

For example, variable valve timing (VVT) is used to compensate for the lean-burn requirements of emissions regulations. The engine-management system should maintain a smooth torque curve. But if implemented badly, VVT can result in sudden drops or surges in power. When riding around a bend, a change in power causes the bike to change direction. That’s not good. It is a state of what cognitive scientists call ‘volatility’ (see this excellent Aeon essay by Clark et al): the unexpected event demands a response to correct its impact, but with little time and information available, the response often makes things worse (using the front brake usually causes a crash). The great thing about motorcycling is the freedom it gives to continually and effortlessly manage ‘expected uncertainties’ (the continual flow of variations in the environment that are dealt with by well-trained habit), and to safely open up the rider to exciting new ‘unexpected uncertainties’ (new challenges in the environment demanding conscious attention and innovation). Weaving through time and space.

We can apply that lesson to other technologies we use to navigate our experiences and achieve our goals. Again, I think about my children and how their game interfaces give them a broader range of immediately accessible parameters to control. More like motorcycling. Perhaps we can enrich videoconferencing using games console-style controls, rather than having to pick our way through complicated on-screen interfaces? We need to humanise the interface, make it intuitive and smoothly progressive. Build in tactile feedback. I want to be able to subtly signal an emotion and its level to other participants, and modulate my signals with precision. As of late 2020, this is starting to happen in emerging virtual reality (VR)-based conferencing tools. That might be a key paradigm shift. We’ll see.

Returning from that short road trip, back to videoconferencing, we can see just how far it is from being a calm tech. It’s just not the smooth ride that we need. If motorcycles were designed as badly as videoconferencing tools, there would be a lot of dead motorcyclists.

Let’s get back to the experience, empathising with people struggling with technology, to spark insights into better design. My attempts at humour and personal engagement, always difficult when mediated by tech, have failed miserably. Ordinarily, humour is an essential means for building empathy – the basis for understanding each other and enabling social action. But it’s not really working in this environment. There’s something just plain weird about a wall of faces all staring in the same direction out of the screen. That’s a blocker to empathy. Ideally, I want people to interact with each other. In a real physical space, they would be turning towards each other, making eye contact, interacting with varying degrees of subtlety. In videoconference tech, this just isn’t possible. They don’t even stay in the same place on the screen. It’s confusing, and does nothing to make conversation flow or to facilitate a sense of community. I don’t think they are even sympathising with my plight! They don’t even seem to be especially angry. I wouldn’t mind if they were swearing about it all. Something. Anything. This all feels just so inhuman.

But what if we don’t start with that big wall of faces, the crowd? Think about walking into a lecture theatre. Often, we bridge the transition into the space by chatting with one or a small group of people. Lecture theatres can seem inhuman, and perhaps that’s how we humanise them. I’ve noticed that people who enter the virtual room early and engage in this chat are more engaged and happier once the main session starts. They humanise it. People already use one-to-one videoconferencing well (on Skype and FaceTime). It seems to be a more natural kind of engagement, so perhaps we can piggyback on that experience. Why not get videoconference participants to do that as well? Start off participants chatting together in pairs or small groups, before moving into bigger groups. See how the bonds made in the personal exchange aid interaction in the full conference room.

Looking back on my experience of videoconferencing, I still get an odd emotional pain. The feeling is a kind of shame. Not so much for my own wooden performance and the failure of the technology. But rather a feeling that we have all lost a bit of our humanity through it. My interest in these technologies is ethically motivated. I am not at all happy with the banal dehumanisation that results from bad videoconferencing experiences. If, for example, students and teachers can’t express their humanity in education, through its technologies, then we’re just not doing it right.

However, I’d like to think that this exploration of videoconferencing in contrast with other more humane experiences has provided some hope and indications of the way to go. We’ve seen how understanding and designing for entanglement helps. We’ve compared videoconferencing with situations in which we feel comfortable and capable of managing the unexpected, and identified simple ways to improve it. And we’ve learned from the subtle techniques that are essential for success in everyday life. In some cases, we can make small changes to how we use existing tech. We’ve also identified ways in which the technology might be improved. And in other cases, we might use a different platform (for example, VR). That’s how designing works: incremental improvements based on insights drawn from experience. Let’s be optimistic, and keep designing to humanise tech, and using tech to learn about being better humans.