In August, Box announced that it was bringing image recognition capabilities to the platform powered by Google Cloud Vision. Yesterday it took things a step further, with the announcement of Box skills that leverage machine learning tools from IBM, Google Cloud, and Microsoft Azure.
“It’s the culmination of a lot of the work that we’ve been doing within the machine learning and AI space,” says Aaron Levie, Box’s cofounder and CEO. “We’re seeing a significantly growing scale of data moving to the cloud and people wanting to work in new ways.”
There are currently more than 130 billion files stored on Box, by not only individuals but also over 74,000 companies that range in size from small startups to Fortune 500 businesses. Levie says that his company–with an annual revenue of about $400 million and its sights set much higher–is now starting to look at ways that it can help structure all that data and allow businesses to access it in different and more valuable ways. Step one was August’s Google Vision announcement.
“It was really the start of a much broader technology platform we’ve been building out,” Levie says.
That platform is called Box Skills Kit. Levie says that it’s the framework that Google Vision was built on, but that when that feature rolled out in August, the company wasn’t quite ready to talk about it.
“The basic idea is how do we enable any third-party developer or platform to be able to plug into Box and bring their machine learning capabilities to our platform so we can better structure, organize, and make sense of the information and data in Box,” Levie says.
To start, Box is opening the platform up to Google, Microsoft, and IBM, which will be bringing some initial skills to Box around image, audio, and video intelligence. It will also be opening up the broader platform so anyone can build on top of those core skills.
One feature powered by Azure, for instance, will detect objects inside of videos and can do everything from audio transcription inside a video to detecting faces and allowing you to jump to parts of a video where specific people are talking.
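Box hasn’t published the interface behind that feature, but the “jump to where a person talks” idea reduces to a simple lookup over speaker-detection metadata. The sketch below is purely illustrative; the segment format and function name are assumptions, not Azure’s actual API.

```python
# Illustrative only: a sketch of how speaker-detection metadata could power
# "jump to the parts of a video where a specific person is talking".

def jump_points(segments, person):
    """Return start times (in seconds) of segments where `person` speaks.

    `segments` is a list of (start_seconds, speaker_name) tuples, the kind
    of metadata a video-intelligence service might attach to a file.
    """
    return [start for start, speaker in segments if speaker == person]

segments = [(0.0, "Alice"), (12.5, "Bob"), (47.0, "Alice"), (90.2, "Carol")]
print(jump_points(segments, "Alice"))  # [0.0, 47.0]
```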
With IBM Watson’s help, Box will be able to detect all the words being said in an audio file. That skill can be paired with another that’s able to do sentiment analysis, so businesses can do things like monitor call center employees and get a quick overview of what happened on a call without ever having to listen to it.
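To make the sentiment-analysis idea concrete, here is a toy lexicon-based scorer. It only illustrates the concept; IBM Watson’s actual models are far more sophisticated and are accessed through its own APIs, and the word lists here are invented for the example.

```python
# Toy lexicon-based sentiment scoring: count positive words minus negative
# words in a call transcript. A real service uses trained models, not lists.

POSITIVE = {"great", "helpful", "resolved", "thanks"}
NEGATIVE = {"angry", "broken", "refund", "unacceptable"}

def sentiment(transcript):
    """Score a transcript: >0 leans positive, <0 negative, 0 neutral."""
    words = transcript.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

print(sentiment("thanks the issue was resolved"))    # 2
print(sentiment("this is broken and unacceptable"))  # -2
```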
Image recognition within Box will be able to detect individual objects and concepts in image files, capture text through optical character recognition, and automatically add keyword labels to images.
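Image-recognition services like Google Cloud Vision typically return label/confidence pairs, so the auto-tagging step amounts to filtering and ranking those detections. The sketch below assumes that shape of data; the function name and threshold are illustrative, not part of Box’s or Google’s published API.

```python
# Hypothetical post-processing step for auto-tagging: keep only the labels
# an image-recognition service reported with high confidence, best first.

def keyword_labels(detections, threshold=0.75):
    """Return labels whose confidence meets `threshold`, highest first."""
    kept = [(label, score) for label, score in detections if score >= threshold]
    return [label for label, _ in sorted(kept, key=lambda p: -p[1])]

detections = [("invoice", 0.92), ("document", 0.88), ("handwriting", 0.41)]
print(keyword_labels(detections))  # ['invoice', 'document']
```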
“There’s just an unbelievable amount of information that companies are dealing with, and that’s only increasing at an even more rapid rate,” Levie says. “Anyone in the AI or machine learning space in the enterprise is sort of dealing with this juxtaposition of what the technology can do and what customers want and what customers need to automate.”
That means all the new features won’t be available for everyone on Box just yet. Customers can go to Box’s site and sign up to join the beta, but not everyone will get approved. Levie says the slow rollout is in part to make sure that the company understands the reason you’re trying to use the new skills, and make sure that there’s a good skill for your particular use case.
“We want to make sure that we can deliver the best possible experiences to our customers before we open this up to the entire world,” he says.
Pricing also hasn’t been determined yet, but may be on a per-skill basis–one of a number of details still to be ironed out as the company looks to expand its offerings in AI.
“We know that the only way to actually apply structure and organize and be more efficient with our information is going to be through machine learning and AI,” Levie adds. “We’re at the beginning of a very long journey in this field.”