Four Google Brain research scientists (Barret Zoph, Vijay Vasudevan, Jonathon Shlens, and Quoc Le) recently released the results of a fascinating project with two objectives. First, they wanted to build an AI system that can spawn new AI systems more sophisticated than anything humans can design. Second, they wanted that system to produce a model that identifies objects in real time with remarkable accuracy. They’ve done both, and their research has vast implications for everything from surveillance to self-driving cars.
Designing machine learning models is remarkably tedious. It requires significant time and expertise, so to speed things up, Google’s researchers created AutoML, a machine learning model designed to create other machine learning models using an approach called reinforcement learning. In this setup, a controller neural network proposes a “child” network to execute a specific task; here, the task was recognizing objects in a real-time video feed, such as people, cars, traffic lights, handbags, or backpacks. The child model trains on the task, its performance is fed back to the controller, and the controller learns from that feedback and proposes a refined child. The cycle repeats thousands of times until the child model performs the task with high accuracy. The best child architecture to emerge from this process is NASNet.
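To make that loop concrete, here is a deliberately toy sketch of the idea in Python. It is not Google’s AutoML code: the two-knob search space, the fake “train and evaluate” reward, and the simple REINFORCE-style update are all stand-ins for the paper’s recurrent controller and the expensive training of real child networks.

```python
import math
import random

# Toy search space: each "architecture" is just a kernel size and a filter count.
KERNEL_SIZES = [3, 5, 7]
FILTER_COUNTS = [32, 64, 128]

# Controller: softmax preferences over each choice (a stand-in for the
# recurrent controller network in the paper).
prefs = {"kernel": [0.0] * len(KERNEL_SIZES),
         "filters": [0.0] * len(FILTER_COUNTS)}

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample(logits):
    """Draw one index from the softmax distribution; return it with the probs."""
    probs = softmax(logits)
    r, acc = random.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r <= acc:
            return i, probs
    return len(probs) - 1, probs

def train_and_evaluate(kernel, filters):
    # Stand-in for actually training the child network: a noisy "accuracy"
    # that happens to favor 5x5 kernels and 64 filters.
    score = 1.0 - 0.1 * abs(kernel - 5) - 0.001 * abs(filters - 64)
    return score + random.gauss(0, 0.02)

LR = 0.5
baseline = 0.0  # moving-average baseline to reduce gradient variance

for step in range(200):
    k_idx, k_probs = sample(prefs["kernel"])
    f_idx, f_probs = sample(prefs["filters"])
    reward = train_and_evaluate(KERNEL_SIZES[k_idx], FILTER_COUNTS[f_idx])
    baseline = 0.9 * baseline + 0.1 * reward
    advantage = reward - baseline
    # REINFORCE update: nudge the controller toward choices that beat baseline.
    for i in range(len(prefs["kernel"])):
        grad = (1.0 if i == k_idx else 0.0) - k_probs[i]
        prefs["kernel"][i] += LR * advantage * grad
    for i in range(len(prefs["filters"])):
        grad = (1.0 if i == f_idx else 0.0) - f_probs[i]
        prefs["filters"][i] += LR * advantage * grad

print("Learned kernel preference:", softmax(prefs["kernel"]))
print("Learned filter preference:", softmax(prefs["filters"]))
```

Run long enough, the controller’s preferences concentrate on the choices that earn the highest reward. The trick scales because the controller needs no human guidance; the catch in practice is that every single reward signal requires training a real network.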
Self-driving cars are one obvious use of this architecture. You can imagine the system helping Google’s autonomous vehicles identify traffic, pedestrians, and road hazards. NASNet could also be used in augmented reality, helping apps interact with the environment faster and more accurately than current computer vision solutions allow. But perhaps the most intriguing applications have yet to be identified. Google’s researchers have made NASNet publicly available, both for image classification and for object detection, so other scientists can make use of it.
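For readers who want to experiment, the image-classification version of NASNet also ships as a pretrained model in Keras. A minimal sketch, assuming TensorFlow is installed and using “example.jpg” as a placeholder path:

```python
import numpy as np
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.nasnet import (
    NASNetMobile, preprocess_input, decode_predictions)

# Load the smaller NASNet variant with pretrained ImageNet weights.
model = NASNetMobile(weights="imagenet")  # expects 224x224 RGB inputs

# "example.jpg" is a placeholder; substitute any image on disk.
img = image.load_img("example.jpg", target_size=(224, 224))
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))

# Print the top three ImageNet labels with their confidence scores.
for _, label, score in decode_predictions(model.predict(x), top=3)[0]:
    print(f"{label}: {score:.3f}")
```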
Of course, automating automation raises some alarming questions. How do you ensure you aren’t building a biased system that then passes that bias on to another system? How do you ensure the systems are used ethically? I can imagine some dystopian applications, like automated surveillance, in which computers constantly analyze images to flag objects or activities they consider suspicious. That could be a boon to public safety, or it could be the makings of a police state. I can also imagine refining the system to recognize faces on the fly and follow anyone across a city.
NASNet can address “multitudes of computer vision problems we have not yet imagined,” the researchers write. For better or for worse.