It’s not just a problem in Photoshop. Even iOS’s photo editor is a convoluted mess of UI. Sliders. Icons with no apparent meaning. Just fixing brightness and contrast can be a challenge. So what happens when I want to contour my toddler’s cheekbones?
Adobe Research is demoing the early stages of a promising solution. It’s called an Interactive Agent for Photo Editing. Rather than hunting for the right button, the user can just say what they’d like to do to a photo, and theoretically, that happens.
Adobe’s demo begins with a user’s request: “I’d like to reframe this picture.” The program attempts a rectangular crop. “Make it a square,” he counters. The video doesn’t go much further than that, mostly showing off the ability to undo an action with a voice command. And in this sense, it’s a lot less impressive than Adobe’s recent tech demos like cloning actual voices. Speech recognition work by Google, Apple, and Microsoft puts this to shame.
All the same, there is something liberating about the idea. You can imagine someone desperately cursing at their tablet, “JUST MAKE IT A SQUARE,” teasing the true potential of Adobe’s creation. Natural language has two big benefits for programs like Photoshop. First, it allows users to issue all sorts of commands without knowing terminology like “Rectangular Marquee Tool,” making the system’s surface functions more intuitive than they might be in an icon-based UI.
But the second benefit is even greater: That speech could be a bridge between the user and more complex processes within the software. It would allow the amateur user to say something like “make the faces brighter” or “can you remove that weird hand ruining the silhouette” to trigger automated fixes that mimic the step-by-step process a pro Photoshop user might carry out. And if the final product is wrong, the user could simply say, “try again.” In this case, Adobe isn’t just making esoteric controls more accessible; it’s bypassing editing UI altogether. The software simply does the work for the user.
Instagram takes a version of this approach already, albeit without voice, using its preset filters. These filters aren’t just there for style; they’re a list of preset guesses that can rescue the gamut of poorly lit, oddly colored photos that come off our smartphones. But a voice layer, coupled with a bit of AI? It could make anyone a photo technician.