How AI is helping Amazon become a trillion-dollar company

An exclusive look at how AI shapes every aspect of Amazon’s business, from its warehouses full of products to your Echo smart speaker.

[Photos: courtesy of Amazon; Animation: Daniel Terdiman]

Swami Sivasubramanian lives in a wooded area in the Seattle suburbs that’s a favorite with opportunistic local bears. From time to time, usually on garbage night, the animals wander into Sivasubramanian’s backyard to pillage his trash. But try as they might, he and his family had never managed to spot the intruders.


“My wife really wanted to see these bears in action,” says Sivasubramanian, Amazon’s VP of machine learning. “She will always try to stay up looking for bears to visit, and she wants me to give her company.”

Sivasubramanian cops to being kind of lazy on that front. But as a technologist, he’s much more proactive. He found his solution in DeepLens, a new video camera system from Amazon Web Services that lets anyone with programming skills employ deep learning to automate various tasks. DeepLens let him placate his wife by building “a machine learning model that actually detects a bear and sends a text to her phone, so that she can wake up, saying, ‘Hey, a bear is right there digging up the trash,’ ” he says.

DeepLens can perform plenty of other machine-vision tricks, such as figuring out if food is a hot dog or not a hot dog (yes, that’s a Silicon Valley reference). It can also transfer an artistic style from one image to an entire video sequence. It’s just one of a myriad of ways that Amazon is utilizing AI and machine learning across its many businesses, both for carrying out internal processes and for improving customers’ experience.


Since its earliest days, Amazon has used AI to come up with product recommendations based on what users already said they liked. The algorithms behind those systems have been tweaked again and again over the years. These days, thanks to machine learning, the recommendations have gotten more dynamic, says Jeff Wilke, the CEO of Amazon’s worldwide consumer division. “Say there’s a new piece of fashion that comes into the fall season,” he explains. “In the past it might take longer for the algorithms that we use to realize that people who bought these shoes also bought this top. And with some of the new techniques we can detect those things earlier, those correlations. And then surface the new top earlier in the season.”

The Echo Dot–and every Alexa-powered device–is infused with Amazon AI. [Photo: courtesy of Amazon]
Other Amazon AI and machine-learning efforts power the Alexa voice assistant, give users of Amazon Web Services access to cloud-based tools, allow shoppers to grab items and walk immediately out of Amazon Go stores, guide robots carrying shelves full of products directly to fulfillment-center workers, and much more. And while the technology is vital to Amazon across most of its businesses, the range of its applications is still stunning. It’s also a key reason why the company (briefly) hit $1 trillion in market cap, and stands every chance of getting back there for the long haul.

A company-wide mantra at Amazon is that every day is “Day One,” a humble contention that for all Jeff Bezos’s brainchild has accomplished, it’s just getting started. When it comes to AI and machine learning, Sivasubramanian doesn’t just pull out the standard “Day One” reference. He jokes that “it’s Day One, but it’s so early that we just woke up and haven’t even had a cup of coffee yet.”


Dance of the robots

Deep inside Amazon’s 855,000-square-foot fulfillment center in Kent, Washington, 18 miles south of Seattle, a bunch of orange Amazon robots are doing a dance. Balanced on top of each of the orange machines is a yellow pod with nine rows of product-packed shelves on each of four sides. Powered by AI, each of the robots automatically sprang into action when someone somewhere in the Pacific Northwest purchased something on Amazon.com, and each is now autonomously maneuvering its way around the others in a bid to get to a station at the edge of the fenced-off robotic field, where a worker will pluck the item in question and put it on a conveyor belt toward another worker who will box it up.

At the scale that Amazon processes orders, peak efficiency is essential. Magnified over millions upon millions of orders a year, even a second or two saved per order makes a huge bottom-line difference.

For some time, Amazon has used machine learning in its fulfillment centers “to improve our ability to predict what customers are ordering and place it in the right place,” says Wilke. “And also to improve the efficiency and speed with which we get things to consumers.”


It might not seem all that sexy, but a recent AI-based innovation that allows workers in those fulfillment centers to skip one manual item-scanning step per order is a big win for the company. The new technique is being applied to Amazon’s long-standing stowing process, which lets workers store items that have arrived from distributors and manufacturers anywhere on a warehouse’s shelves–so long as their location is recorded in a computer so that they can be found again on the first try. The existing method has workers grab an item out of a box, scan it with a bar-code scanner, place it on a shelf, and then scan the shelf. The dual scanning associates the item with its location.

Now, thanks to a combination of advanced computer vision and machine-learning technology, workers will be able to simply pick up an item in both hands, slide it under a scanner mounted nearby and place it in a bin. The system is smart enough to recognize where the item was placed and record it for future reference, without the worker having to scan the bin.

Brad Porter, VP of engineering at Amazon Robotics, says that freeing up the hand that would have been used to wield a bar-code scanner is a big boon to efficiency. “After about five minutes of doing it myself, I realized that I could pick up five or six small items… hold them in my left hand, grab one, scan it, put it in, grab one, scan it, put it in,” he says. “It’s super natural, super easy.”


Robots at an Amazon fulfillment center. [Photo: courtesy of Amazon]
The new system, which took about 18 months to develop, uses computer vision and machine learning algorithms to evaluate how a worker is touching items and determine when those items have been placed in a bin. Porter characterized the algorithms as among the “more sophisticated” ones Amazon is using, given the need to tell whether a worker is holding up an item alongside a bin or actually placing it inside one. The system has to be able to work in different lighting conditions, and regardless of how full the bins are–something that can vary dramatically depending on time of year.

In recent weeks, Amazon has turned the new system on at its Milwaukee fulfillment center and is getting ready to do the same in about 10 other centers. Given that any change that introduces inefficiencies in Amazon’s fulfillment centers would have a massive negative impact, Porter’s team had to be sure the innovation was ready. They asked, “Are we going to turn the [system] on for peak [holiday season] this year,” he says, “and we pretty much made the decision that we’re ready to go.”

It’s not clear when–or even if–Amazon will roll out the new system at all of its fulfillment centers. Regardless, Porter is already thinking about how to improve it. That boils down to leveraging advances in camera technology and machine-vision processing speed. He imagines upgrading the system with more cameras involved, making it possible to recognize bar codes on a package without the worker even having to orient it towards a scanner. It might only save half a second per item, but at Amazon’s scale, that makes it very sexy indeed.


Grab and go

Given that the heart of the new fulfillment center system involves using cameras and AI software to detect someone holding an item and placing it on a shelf, you might think that the same technology is in use at Amazon Go, Amazon’s automated grocery stores that allow customers to walk in, grab what they want, and simply walk out the door, with everything being charged to their account automatically.

Not so, says Porter. Although there is likely some consultation going on between AI scientists across the company, Go’s hardware, which includes color and depth cameras, as well as weight sensors and algorithms, was independently developed. It reflects five years of work developing systems capable of tracking people handling items in a wide variety of sizes, shapes, and colors in complex environments like crowded grocery stores.

As of now, there are only four Amazon Go outlets–three in Seattle and another in Chicago, with more on the way. But they are able to handle a steady flow of customers who can scan their phone upon entry, shop as much or as little as they want, and pick things off shelves and put them back, and the stores accurately track what each customer ends up leaving with, regardless of numerous potential pitfalls along the way.


Amazon Go only looks like a typical small grocery on the outside. [Photo: courtesy of Amazon]
Dilip Kumar, the vice president of Amazon Go, says that the very act of customers picking up an item presents a challenge to the system, since it blocks the cameras’ view of an item. Go’s systems must be capable of tracking what each customer in the store has picked up–possibly including multiple identical items–regardless of how crowded the store is and even if two people dressed identically are standing side by side and reaching across each other for purchases. “You could be picking up an item here, [or] I could be picking an item there. We still need to be able to associate my pick to me and your pick to you,” Kumar says. “The challenge with all of this is not just being able to build a sensor, but also dealing with varying lighting conditions. You can look at color temperature. Things vary. What’s pink is not always pink throughout the day.”

To deal with all of this, Kumar’s team designed algorithms that analyze what the cameras are seeing and look for interactions people have with products. In order to work, they have to be able to determine who took what at “the moment of truth” as an item is removed from a shelf.

Kumar won’t say how accurate Go’s systems are, but it’s clear the company wouldn’t make them available to the public if they were prone to high error rates. For over a year, the original Seattle store–which is on the ground floor of the headquarters building in which Amazon CEO Jeff Bezos works–was accessible only to employees as the company fine-tuned the system.


Next up for the Amazon Go technology, Kumar says, is to boost its algorithms so that they’re more powerful “per unit of compute” and to take advantage of cheaper sensors. Combine those two factors and Go’s systems could well be capable of more quickly identifying new items in stores without having to train the algorithms to recognize them. That’s important, he points out, when between 20% and 30% of items are new at any given time.

Asked if Amazon plans on porting the Go platform to its Whole Foods empire, Wilke says that’s not likely. Rather, he sees Go as just one of many ways–including Amazon Pantry, Amazon Fresh, Whole Foods, and others–of getting groceries and other items to customers. Ultimately, Wilke says, machine learning is an “advancing” technology, “which allows us to make some of these experiences better.” He adds that “real estate is hard,” and that Amazon doesn’t have long-standing expertise in it. But if a recent story by Bloomberg’s Spencer Soper is correct–he reported that Amazon is considering opening 3,000 Amazon Go stores over the next few years–the company isn’t fazed by the prospect of learning as it goes.

Alexa skills for all

Odds are that when most people think of Amazon and AI, they think of the company’s digital assistant, Alexa. To date, people have bought millions of Alexa-powered Echo devices, and third-party developers have built more than 45,000 skills–essentially voice-powered apps–that can do everything from help with recipes to play family games to read the news.


Along with cranking out its own Alexa gizmos at a furious rate, Amazon has been working on helping third-party hardware manufacturers integrate Alexa directly into their products. Known as Alexa Voice Service, the initiative has spawned about 100 products so far from companies like Sonos, Ecobee, Sony, Lenovo, and others. Steve Rabuchin, VP for Alexa voice services and Alexa skills kit, explains that Alexa Voice Service is essentially a set of APIs in the cloud that enable hardware makers to utilize Alexa. Amazon makes its front-end audio algorithms available to the third parties, as well as guidance for building Alexa-powered devices.

Amazon is also working with institutions to let them create customizable skills for Echo devices placed in college dorms or hotel rooms. As an example, Rabuchin recalls staying in a Marriott hotel and being able to get Alexa to turn the lights on and off, turn on the TV, change the channel, and tell him where the gym was located.

The next frontier for Alexa is letting consumers create their own custom skills. In the past, that required some basic software development knowledge. But Amazon wanted to democratize the Alexa skills creation process, so it launched what it calls Blueprints–a template-based Alexa skills creation tool that just about anyone can figure out.

Blueprints let anyone teach Alexa new tricks, no coding required.

Creating a skill with Blueprints is as easy as filling in a few fields and hitting save. And while the skills generally won’t be as sophisticated as ones built by professional developers, and can’t be made publicly available, they do allow nearly any Alexa user to leverage AI for some highly personal purposes–such as giving instructions to a housesitter or stepping through a workout regimen.

Amazon’s Echo Plus [Photo: courtesy of Amazon]

Amazon AI everywhere

One of the primary drivers of Amazon’s rise to a near-trillion-dollar company has been Amazon Web Services, its massive cloud-based storage and server business. AWS has become a cloud standard for companies and developers wanting access to the same kind of AI and machine learning technology that powers Amazon offerings such as Alexa, Amazon Go, Amazon Prime Video’s X-Ray feature, estimates for product delivery times on Amazon.com, and more. “Our mission in AWS,” says Sivasubramanian, VP of Amazon machine learning, “is to put those machine learning capabilities in the hands of every developer and data scientist.”

Sivasubramanian says that there’s excitement about machine learning’s potential in nearly every sector of the economy. But while executives at countless companies see how it can help their businesses, “it’s still in its infancy. [Those executives] look to us and say, ‘How can you actually help us take advantage of these machine learning capabilities to transform our customer experience?’ ”


To date, Sivasubramanian says, there are tens of thousands of customers using AWS-based machine learning services across sectors including retail, real estate, fashion, entertainment, health care, and others. Those customers have a variety of levels of AI competence. Some are what Sivasubramanian calls experts–people with PhDs in machine learning–while others are simply app developers. Amazon has tailored its AI and machine learning offerings to match both sorts of customers’ needs.

Some of those users have deep experience and the ability to build their own machine learning models; others just want to take advantage of models that have been created for them. That’s why Amazon built SageMaker, an end-to-end machine learning service meant to help developers build and train machine learning models and run them either in the cloud or on devices such as smartphones.

Sivasubramanian ticks off a wide variety of examples of corporate customers using AWS’s AI and machine learning services. Among them are Intuit, which is using SageMaker to build fraud-detection models; Grammarly, which predicts what a user is writing and what corrections are required; C-SPAN, which analyzes thousands of hours of video in order to recognize celebrities and specific politicians, as well as to double the number of videos it has indexed; Duolingo, which is using Amazon’s Polly text-to-speech service to generate individual language learning sessions; Liberty Mutual, which is using Amazon’s conversational API as a service, Lex, to build a chatbot that enables the insurance company to handle many users’ questions; and the NFL, which is analyzing plays in order to predict what the next one will be.


He says that usage of AWS’s machine learning tools has grown 250% over the last year, and that since last November, AWS has added more than 100 new features or services to its machine learning portfolio.

One of them is DeepLens. Designed so that developers can build and fully train a machine learning model within 10 minutes of unboxing, the camera system is already being used in many ways Amazon never imagined.

Of course, among those unorthodox applications is the project Sivasubramanian built to satisfy his wife’s request. And what he learned was that DeepLens was smarter than he even realized. “Initially, I had it notify for any animal, including my dog,” he says. “But this is the fun of machine learning: you constantly tune it to make sure you exclude things that are false positives, to make sure it gets more and more accurate. It’s an ongoing project so [my family] can have the best bear detector in the world.”


About the author

Daniel Terdiman is a San Francisco-based technology journalist with nearly 20 years of experience. A veteran of CNET and VentureBeat, Daniel has also written for Wired, The New York Times, Time, and many other publications.

