Microsoft’s new virtual auditoriums look silly—but fight Zoom fatigue

Based on research into how humans like to interact, Microsoft Teams’ Together mode restores some of the social cues we’ve been missing during the pandemic.


Microsoft believes it has an answer to Zoom fatigue, and it comes in the form of superimposing people onto a virtual auditorium in Microsoft Teams.


The new feature, called Together mode, acts as an alternative view for video calls, which you access through a menu button in the chat window. Once you select it, Teams will automatically assign a virtual seat to everyone in the call, so you can mimic a high five with the person next to you or point to someone a couple rows away. The feature, which is part of a broader set of additions headed to Teams, is rolling out now and should be widely available next month. (Teams also offers conventional, Zoom-esque videoconferencing.)

Although the sight of several bodies pasted onto an illustrated auditorium can seem silly at first, Microsoft is dead serious about Together mode. The company says that, compared with the conventional grid layout of most video calls, people feel more at ease and are less likely to gaze at themselves when their faces are scattered across a virtual space.

Together mode can cram dozens of colleagues into one on-screen auditorium. [Image: courtesy of Microsoft]

“As miserable as the quarantine is, and as difficult as the road forward is, it’s been at least a chance to do something that’s somewhat useful and makes us a little less miserable,” says Jaron Lanier, the Microsoft Research scientist who guided Together mode’s development.

Why Together mode works

Together mode seems simplistic and maybe even a bit crude on the surface. But Lanier, who is well known as a virtual reality pioneer and the author of several books about tech’s societal impacts, often uses the word “subtle” to describe its benefits. Behind the basic presentation are a lot of big ideas about how humans like to interact and how the current conventions of video chat fall short.

“In a sense, it’s just a simple design strategy,” Lanier says. “In another sense, it’s a design strategy that benefits from many years of studying mutual perception, particularly in virtual reality.”


A basic example: It’s hard to make eye contact on video calls because our natural tendency is to look at who’s talking instead of at the camera. Together mode isn’t exactly a solution—it’s not doing any kind of eye tracking or eye correction—but it does mask the problem. Because the shared space gives the impression that people are just looking around the room, the brain gets tricked into not caring as much about direct eye contact.

“What we observed is that when people are in this design, the brain is less sensitive to the errors of false eye contact, or missed eye contact, or gaze mismatch,” Lanier says.

Together mode also spares people from the feeling of being constantly stared at by a room full of people. Jeremy Bailenson, the director of Stanford’s Virtual Human Interaction Lab, says this is a less appreciated aspect of Zoom fatigue, but one that’s just as exhausting as the lack of direct eye contact. Grounding people in a shared virtual space helps solve the problem, because you can see everyone’s heads and eyes gravitate toward whoever’s talking.

Along with its auditorium, Together mode will eventually let you hang out with coworkers at a virtual coffee bar. [Image: courtesy of Microsoft]

“You get a little bit of a break,” says Bailenson, who informally consulted with Lanier on the Together mode feature. “If someone in the top left is talking, and all the heads are kind of pivoted over there, all of a sudden you’re not being stared at as a listener.”

Even the simple act of reducing people’s head sizes makes a big difference, Bailenson says. In a regular video chat, our heads appear a lot larger than they would in a real-world conversation. Together mode compensates for this by putting everyone in the space of a larger room.


“From an evolutionary standpoint, when somebody’s really close to you and staring at you, it’s a very intense, arousing event,” Bailenson says. “What we’re doing in an unanticipated way with the default settings on videoconferences is we’re activating this fight-or-flight reflex.”

These benefits aren’t just theoretical. Last month, Microsoft monitored users’ brain activity as they switched between Together mode and conventional grid-based views in Teams calls. While the results haven’t been peer-reviewed, Microsoft says participants exerted less mental effort while using Together mode, suggesting that it can help with meeting fatigue, especially during the pandemic.

“There are a lot of little elements that I think contribute to the psychology of ease,” Lanier says. In Together mode, “people definitely look at each other more than themselves, they definitely keep the camera on longer. They’re more in the meeting than in grid mode.”

Recreating crowds

Lanier, who also composes music and plays a wide range of musical instruments, says the idea for Together mode arose when one of his gigs got canceled during the coronavirus pandemic. He’d been planning to play in a band for a comedian with a late-night TV show—Lanier declines to say who it is—but didn’t end up doing so after the show stopped recording in front of studio audiences.

This got Lanier thinking about a separate concern the comedian had: How would the show work without responses from a live crowd? Lanier reached out to the Microsoft Teams group, which had already been playing around with ways to combine videos of different people, and asked it to throw together a virtual audience prototype.


The virtual-audience idea never went anywhere, Lanier says, partly because Microsoft wanted to keep its people-combining technology under wraps, and partly because comedians found other ways to adapt their work that didn’t involve experimental technology. Still, Lanier saw the potential to use the concept elsewhere.

“I was looking at this thing and playing with it, and I realized, ‘You know what, if we tweak this thing and work with it, we can bring up some of the advantages we’ve learned about how to improve visual communications,’” Lanier says.

Even in Together mode’s current form, Lanier and others at Microsoft stress that it’s a first step. Jeff Teper, the head of Microsoft 365 and Teams engineering, says the company would like to experiment with different kinds of room layouts beyond the virtual auditorium that it will offer at launch.

Together mode is available to Teams users in school as well as business customers. [Image: courtesy of Microsoft]

“We’ll see people make the space their own, just like they do the workplace, for the type of social dynamics that may exist,” he says. “I think where this is headed is a lot more flexibility and customization.”

Microsoft also notes that Together mode won’t make sense for every kind of meeting. Lanier says he starts seeing the benefits with at least a few people on a call, and finds them most noticeable when five or more people are participating. It’s also going to be impractical when people need to present slides or work on a whiteboard. Bailenson, the Stanford professor, says he’ll most likely use it at the start of a video call for team-building purposes, but not for hours on end.


Still, Lanier hopes Together mode will establish itself as a video chat convention, not unlike how Zoom’s Gallery View became a table-stakes feature for videoconferencing amid the pandemic.

“Personally, I think this thing’s going to hit the culture, and a lot of interesting stuff’s going to happen,” he says.