We’ve always known that video doorbells can record video of people within their viewfinder range. But it turns out that they can record audio, too.
A recent investigation by Consumer Reports found that the Amazon Ring Video Doorbell 3 and the Arlo Ultra video doorbells can record audio as well as video to a range of between 20 and 25 feet—a revelation that means audio surveillance can occur even when we can’t see the video cameras. This makes it hard for us to know when we are under surveillance, what data is being collected, and if and how it is being used.
We have become conditioned to believe that if we can see hardware and indicator lights on video cameras, then we can determine if we are being recorded. However, with audio-recording video doorbells, the indicator lights are small, if present at all, and do not clearly indicate whether video, audio, or both are being recorded. Furthermore, if we are within the audio recording range of 20 to 25 feet, we could be “out of sight” of a video doorbell and still be recorded. In effect, this means we can’t ever be sure of our own privacy within our communities, a fact that raises huge ethical questions (not to mention some legal red flags, including whether intercepting the communications of an unsuspecting passerby violates various federal and state laws against wiretapping).
Because “security cameras” initially were expensive, they were first installed and monitored by governments, corporations, and retail establishments. As technological progress lowered costs, security cameras could be found in wealthier residential neighborhoods. The advent of video doorbells made home surveillance accessible and affordable to millions, and removed the need for homeowners to purchase and install cumbersome multi-camera external units for their monitoring needs.
While this “privately run neighborhood surveillance program” is concerning in itself, what hasn’t been fully publicized is the extent to which these devices record audio, and why that matters. In the advertising and specifications for the Amazon Ring 3 and the Arlo Ultra video doorbells, the feature of “audio recording” isn’t specifically identified to consumers. The materials use words like “talk” and “hear,” but not necessarily “record.” With the Ring 3 in particular, its capability to record audio at a distance of up to 25 feet isn’t directly called out as a feature in Amazon’s advertising, but instead appears on the back end as a “Ring Skill” in the Alexa app. Once someone purchases a Ring doorbell, the Ring Skill can be configured to listen, enable chat, and take commands via the doorbell. In its advertising, Arlo lists “2 way audio” and “microphone” as features in the Ultra specifications, but it also does not specify distance or recording capabilities.
Many of us have objected to private video “dossiers” being collected on us by our neighbors, without our consent, as we pass by these homes while walking on public sidewalks. It has also been concerning to learn the extent to which the police have supported these home surveillance efforts through Amazon’s “Ring/Police partnerships,” with some officers encouraging citizens to install Ring doorbells to increase their defenses against intruders. Ring has also appeared to semi-automate (or at least expedite) the process for police to solicit recordings in the event of a neighborhood crime (or accusation of a crime). And once a citizen shares footage with the police, the police can in fact share that footage with others.
Another issue with video doorbell surveillance is that only fragments of information will be collected. Imagine people walking down a street by a row of houses, and several, but not all, of them have video doorbells that record audio. Pieces of their conversation will be recorded and collected by different houses as they pass by, and will by default be uploaded to servers unless the homeowner changes the setting preferences, a state that is also unknown to pedestrians. In some ways, this patchwork of audio could provide a bit of a privacy safety net, because hearing only pieces of a conversation can obscure meaning. However, multiple homes could be recording (and sharing) the same or different pieces of the conversation, and attempting to infer the meaning of these pieces in different ways. Also, if pedestrians were to approach the home with the video doorbell to ask about being recorded, they could be recorded while doing so, which further undermines their privacy.
With storage potentially happening on remote servers, the “privacy protection” of recorded random conversation snippets could backfire. Once these pieces are in those back-end databases, they may remain for as long as the vendor cares to keep the data, and there seems to be no accessible policy or way for those who were recorded without their permission to request that their recordings be deleted, if they are even able to determine whether they were recorded in the first place.
If someone is recognized and/or known to their neighbors, a snippet of a conversation without relevant context could generate misunderstandings, causing assumptions and rumors. For example, people walking down the street could be discussing an episode of a crime TV show that is unknown to one or more neighbors recording them. If those neighbors (or eventually back-end algorithms) misunderstood what was said, they might contact authorities, creating the potential for all sorts of unforeseen problems.
There is little we can do about changing our neighbors’ doorbells, but being informed can help us make better decisions about our private conversations in seemingly common spaces, such as sidewalks and the streets. Consumer Reports suggests homeowners who want to have video doorbells get “one where the footage is stored locally (on your own devices rather than in the cloud),” and incorporate “end-to-end encryption” so recordings can’t be easily collected or analyzed by vendors, but that doesn’t eliminate the recording within these broader unseen ranges.
It is a start, but we need to ask ourselves what shared public spaces mean to us and how much autonomy we deserve within them. Is it right that a private conversation someone is having with another person on a sidewalk, or anywhere else where reasonable privacy should be expected, becomes fair game for surveillance technology with invisible listening superpowers? Right now humans are in the loop, but it is only a matter of time before much of this, and policing, is fully automated, leaving us, and our sidewalks, quiet.
S.A. Applin, PhD, is an anthropologist whose research explores the domains of human agency, algorithms, AI, and automation in the context of social systems and sociability. You can find more at @anthropunk and PoSR.org.