Being a UI or graphic designer who can also code gives you a leg up in today’s world of digital design. For everyone else, it’s a collaborative, somewhat backwards process: A UI designer mocks up how the interface will look, and a front-end developer takes that design and translates it into code. Only then can you get on to the more interesting work of actually building out the site and refining the features.
Translating design into code can be tedious and not particularly thought-provoking—which also happen to be the criteria that make a task ripe for automation. The Copenhagen-based startup UIzard Technologies is already on it: The company has trained a neural network to take a screenshot of a graphic interface and translate it into lines of code, effectively eliminating that part of the web design process for developers. Impressively, the same model works across iOS, Android, and web-based interfaces, and at this early point in the research the algorithm works with 77% accuracy.
Last week, Tony Beltramelli, the founder and CEO of UIzard Technologies, published a research paper on how the model, called Pix2Code, works. The gist is this: Like all machine-learning systems, the model had to be trained on examples of the task at hand. However, instead of generating images from images or text from text, this algorithm needed to take an image as input and generate text (in this case, code) as output. To do this, the researchers trained it in three steps: first, with computer vision to understand the scene (the screenshot) and its components (buttons, containers, etc.); second, the model had to understand computer code and be able to generate syntactically and semantically correct samples; finally, it had to connect the two previous steps, taking the inferred scene and generating a description of it in text.
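One way to picture that final step: rather than emitting raw platform code directly, a model like this can generate tokens in a simple layout language that is then compiled into HTML, iOS, or Android code. The sketch below shows only that last compilation stage, in Python; the token names (`row`, `btn-green`, `label`) and the HTML they map to are hypothetical illustrations, not Pix2Code's actual vocabulary.

```python
# Hypothetical sketch: compiling a flat list of layout-DSL tokens
# (with "{" / "}" as nesting markers) into an HTML string.
# Token names and templates are invented for illustration.

HTML_TEMPLATES = {
    "header": '<div class="header">{}</div>',
    "row": '<div class="row">{}</div>',
    "btn-green": '<button class="btn btn-success">OK</button>',
    "label": "<span>Label</span>",
}

def compile_dsl(tokens):
    """Recursively expand DSL tokens into nested HTML."""
    def parse(pos):
        parts = []
        while pos < len(tokens):
            tok = tokens[pos]
            if tok == "}":  # close the current container
                return "".join(parts), pos + 1
            if pos + 1 < len(tokens) and tokens[pos + 1] == "{":
                # container token: compile its children, then wrap them
                inner, pos = parse(pos + 2)
                parts.append(HTML_TEMPLATES[tok].format(inner))
            else:
                # leaf token: emit its template directly
                parts.append(HTML_TEMPLATES[tok])
                pos += 1
        return "".join(parts), pos

    html, _ = parse(0)
    return html

print(compile_dsl(["row", "{", "btn-green", "label", "}"]))
# → <div class="row"><button class="btn btn-success">OK</button><span>Label</span></div>
```

Because the intermediate tokens are platform-agnostic, swapping the template table would let the same token sequence target a different platform—one plausible reading of how a single model can serve iOS, Android, and the web.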
The model Beltramelli and his team created takes a screen grab of the UI design, assesses the picture—the various icons, features, and the layout—then generates lines of code based on what it sees. This video shows the technology in action:
In practice, Pix2Code would certainly save time for developers, who could input JPEGs of a designed interface and produce workable code that could then be manipulated and refined. It would also make it easier for UI or graphic designers with a very basic knowledge of code to tackle the entire website themselves.
On the other hand, it could also potentially make it much easier to copy another website's code—a problem that already bedevils some developers. While a collaborative ethos among programmers is apparent on sites like Github, some developers—particularly those building sites for clients who want original work—don't want others to be able to crib their code.
Regardless, UIzard Technologies is continuing to refine the model, training it on more data to improve its accuracy. For Beltramelli, who recently completed his graduate studies in machine learning at both the IT University of Copenhagen and ETH Zurich, there seems to be little doubt that Pix2Code will get there. “Considering a large number of websites already available online and the fact that new websites are created every day, the web could theoretically supply an unlimited amount of training data,” he writes in the research paper. “We extrapolate that deep learning used in this manner could eventually end the need for manually programmed GUIs.”
Pix2Code is the first app that UIzard has developed, and it's still in the beta stages (you can sign up to try it here). The company's hope is to help developers, designers, and startups create better apps and websites by eliminating code-writing as an early step in the development process, freeing up more time for prototyping, iterating, and eventually producing a better product. The initiative is the beginning of the company's mission to use AI to keep technology accessible even as it races ahead—allowing more people to take advantage of ever-more complex systems.