If you could measure all the information you consume online, what would you learn about yourself?
That’s the question behind the new Chrome extension Data Selfie. Created by developers Hang Do Thi Duc and Regina Flores Mir, the application gives users a peek into what kind of digital footprint they might be leaving behind as they browse Facebook–and makes the hidden mechanisms of Facebook’s data collection more transparent.
How It Works
Data Selfie collects data about what you click on (through likes and links), what you type, and what you look at, and for how long. Based on this information, the app compiles a personality profile using personality insights from the supercomputer IBM Watson and the machine learning algorithm Apply Magic Sauce and presents this “data selfie” for you to peruse. In the name of transparency and privacy, all of Data Selfie’s code is on GitHub, and all of the data it tracks is stored on your personal computer.
The different sections of the data selfie are arranged in tiles, each related to specific information about you: your political orientation, your religious affiliation, and your relationship to things in the world. Open up the app, and you see your activity on the site by date and time, with small, color-coded crosses indicating that you looked, liked, clicked a link, or typed. Scroll down and it shows you your 10 top friends and 10 top pages based on time engaged with their posts, as well as your top likes. Then it shows you two lists: one of “keywords,” defined as general topics in the content you looked at, and one of “entities,” defined as people, organizations, and things in that content. Both are rated in terms of their relevance to you and your positive or negative sentiment toward them.
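The "top friends" and "top pages" ranking described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Event` record, its field names, and the sample log are hypothetical, and the extension's actual code on GitHub may work quite differently.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Event:
    source: str      # friend or page whose post was on screen (hypothetical field)
    action: str      # "looked", "liked", "clicked", or "typed"
    seconds: float   # time spent on the post; 0 for instantaneous actions


def top_sources(events, n=10):
    """Rank sources by total time engaged, longest first."""
    totals = defaultdict(float)
    for event in events:
        totals[event.source] += event.seconds
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]


# Invented sample log, for illustration only.
sample_log = [
    Event("New York Times", "looked", 42.0),
    Event("A friend", "looked", 12.5),
    Event("New York Times", "clicked", 0.0),
    Event("A friend", "liked", 0.0),
    Event("New York Times", "looked", 18.0),
]

ranking = top_sources(sample_log)
```

Summing seconds per source and sorting is the simplest reading of "time engaged." A real tracker would also have to decide what counts as looking at a post, which this sketch leaves out.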
From there, Data Selfie shows you the relevance of general “concepts,” ideas that dominated the content you looked at, and a personality prediction in the form of a chart showing your percentile of the Big 5 personality traits, which are dimensions commonly used in psychology: openness, conscientiousness, extraversion, agreeableness, and emotional range. It shows you your likely political orientation and religious affiliation based on the content you see in your feed, and makes other predictions about your general intelligence and leadership, by percentile.
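A percentile score of this kind says what share of a reference population scores lower than you do. The sketch below shows the arithmetic only. The reference scores are invented, and how IBM Watson or Apply Magic Sauce actually derive their percentiles is not described here, so treat the function as an assumption rather than their method.

```python
from bisect import bisect_left


def percentile_rank(score, reference_scores):
    """Return the percentage of reference scores strictly below `score`."""
    ordered = sorted(reference_scores)
    below = bisect_left(ordered, score)  # count of values less than score
    return 100.0 * below / len(ordered)


# Invented reference sample of raw "openness" scores, for illustration only.
reference = [0.21, 0.35, 0.40, 0.48, 0.52, 0.60, 0.66, 0.71, 0.80, 0.93]
my_percentile = percentile_rank(0.62, reference)
```

With this sample, a raw score of 0.62 sits above six of the ten reference values, so it lands at the 60th percentile.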
My Data Selfie
After I used Data Selfie for a week, it pegged me as liberal and likely not religious. It knows I trust the New York Times for my news because I’ve looked at its posts more than any other page’s. It knows that I have a generally negative perception of Trump, the United States, the Muslim ban, and Uber.
It’s a rather simplistic snapshot, but the set of conclusions has some more interesting (and slightly unsettling) results. The data indicates, for one, that my “psychological gender” is 56% male, whatever that means. In terms of my Big 5 personality traits, the data claims I’m more competitive than trusting and team-oriented, more conservative and traditional than liberal and artistic, and more laid back than emotional. It’s likely anyone who knows me would have a different account.
But the misalignment between the data-driven personality profile and how I view myself points toward a discrepancy between how we act online and how we act in real life, and it raises clear concerns about what Facebook actually knows about me. I’ve only been using Data Selfie for a week, but I’ve been using Facebook actively for the past seven years. Add to that the fact that Facebook’s data tracking is far more sophisticated than Data Selfie’s, and that the company actively purchases data it can’t get about its users from third-party data brokers, and it’s unsettling to comprehend just how much Facebook likely knows about me.
“The point of Data Selfie is not to necessarily be 100% accurate,” Flores Mir says. “Of course what we’re doing is only providing you a tiny, tiny sliver of the iceberg of data that’s out there. But we’re trying to reveal to you the process of how [data collection] works.”
So What Does Facebook Know About You?
Flores Mir says that her data selfie made her rethink a pillar of her identity: her belief in environmentalism and human-caused climate change. “In my data selfie, it says I’m not likely to be concerned about the environment,” she says. “I read that and said, that’s not even true! But it made me think about it. Is there a subtlety to the way I interact online that they can tell?”
For Do Thi Duc, the lack of control over her information that Data Selfie made clear was deeply unsettling and gave her a feeling of helplessness: even if you can see (and delete) data when you’re tracking it through the app, you can’t escape the nagging feeling that Facebook holds a large amount of data that you can’t control. After working on Data Selfie for a year, she now severely limits her time on Facebook and Google, hops from browser to browser, and relies on VPNs for more privacy. Her concerns about digital privacy are related to her background: Do Thi Duc was born in East Germany, where she says that fear over lack of control over personal information is still common. “Historically here in Germany, a lot of people are afraid of the prospect of the wrong people having your information,” she says. “There’s a sense, in Europe, that we don’t know what will happen because things have happened in the past.”
With Trump in the White House, the ethical implications of Americans’ lack of control over their data are all the more pressing. According to a Motherboard article published after the election, a big data company hyper-targeted pro-Trump ads based on personality profiles built from people’s activity on Facebook, not just demographics. When your data is used to sell you things, that’s one thing, but what about when it’s used to sell you a candidate?
“If it’s just Ugg boots and Mac cosmetics, maybe that’s okay,” Flores Mir says. “But if they’re targeting you to try and influence your vote for either Hillary Clinton or Donald Trump–maybe that’s not okay.”
While Data Selfie started as a clever way to show people the discrepancy between their digital habits and their IRL values, it raises important questions about data tracking, privacy, and control over information. And without Do Thi Duc’s cultural memory of East Germany to act as a historical alarm, Americans need tools like this to become more digitally literate before it’s too late.