Moritz Stefaner has long been obsessed with food. As a designer, he has even used meals to visualize data about everything from ethnic diversity to scientific funding. So when he encountered an AI that generates realistic pictures from words, he went hog wild.
Stefaner typed words like Michelin star chef, deconstructed, and amuse-gueule into the generator, hoping to evoke the intricate plating of fine dining establishments like Eleven Madison Park or Alinea. And, suffice it to say, his plan worked. The images he created are completely convincing plates that you could imagine being served at any three-Michelin-starred restaurant. That is, until you look a bit closer and realize that the individual components on the dish often don’t even exist in real life.
Squinting, I see the bones of a soft-boiled egg, seaweed, microgreens, sauces, and gels. In one frame, I swear I see the same candied moss I actually ate at the Chicago fine dining establishment Elizabeth. In another, I see a chocolate dessert sitting on a mound of coffee grounds, a stone’s throw from a dish I ate at Dominique Ansel Kitchen. But for the most part, it’s superbly convincing fiction, and a demonstration of what happens when an AI’s style vastly outpaces its substance.
“The most surprising thing to me was how well the system deals with really poetic descriptions,” writes Stefaner via email. “It goes way beyond just capturing objects in certain styles, towards capturing a whole vibe.”
The AI was built by Midjourney, a self-described “research lab” focused on “expanding the imaginative powers of the human species.” Much like GLIDE, Disco Diffusion, and Dall-E, Midjourney’s AI model generates images from words. But Midjourney makes the process particularly simple. You don’t have to understand code or set up anything special to use this AI. Instead, it is hosted on a private Discord, so creating an image is literally as easy as typing a description of it.
“The input to these networks are short texts called prompts. The art of the prompt is really becoming a key skill in interacting with these models. Similar to learning how to nudge a search engine to surface the right results, the ‘prompt artist’ learns to use the right combination of words to achieve the desired effects,” Stefaner says. “For instance, one can add ‘drawn by Picasso’ or ‘in the style of Keith Haring’ to evoke style modifiers that mimic an artist’s style.”
Stefaner focused on words like fine dining, and he tagged many of his prompts with “dof,” which stands for “depth of field.” The tag evokes the partially in-focus look of images shot with a traditional camera, a hallmark of fine dining photography.
No doubt, the way these images are framed—the angles at which the camera seems to be taking them—helps them seem convincing. That’s key because most of these foods are at least a little alien; they are almost-foods, if you will. Pasta dishes look more like thinly sliced banana peels. Fish looks like salmon, if its skin were marbled into its flesh. These oddities aren’t always as gross as they might sound. One anonymous “fine dining” plate looks like a cross between prawn shells and flowers. It’s downright beautiful, and just the sort of meticulous surprise you hope to encounter when dropping hundreds of dollars on a tasting menu.
I wish I could say the same about what appear to be sea scallops. Seared on top, they melt like a Salvador Dalí painting into the plate. Are they made of ice cream? Might they be a foam? My brain tries to make sense of it all until I remember there’s no sense to be made. I’m looking at an AI hallucination.
“It’s almost like an alien life form observed us and tried to imitate and blend in the best it could, without really understanding what is going on,” Stefaner says. “This ‘strangely familiar unfamiliar’ feeling is a bit unsettling, but can also really trigger creativity. We are pattern-seeking animals, always searching for meaning, so we really try to figure out what these dishes could be, what they could taste like, even though they don’t quite make sense to us.”
These hallucinations, of course, are trained into the software, which was fed countless labeled images to understand how to draw the objects. And there is no better window into the AI’s superficial logic than Stefaner’s “fine dining high end Michelin star closeup burger.” On top, it’s a pile of rare ground beef, capped with a shiny brioche bun. But on the bottom, where the other bun should be? That’s a coral-like pile of something vaguely edible. In other words, the burger starts at Red Robin and ends with Noma. And while it’s funny, this image also demonstrates how little these AI models understand about the content they generate.
In any case, Stefaner’s images are captivating to behold. They also push us to ask, What’s next? Thus far, we’ve seen art imitate life. But next, we might see life imitate art.
“I’d love to do creative sessions with ambitious chefs to generate inspiring images, based on new prompts (or their existing menus!) and then see if we can together reverse-engineer them into successful dishes,” Stefaner says. “It’s a new type of agent you can inject in your design process to generate completely new, oblique ideas.”