How Facebook Figures Out Everyone You’ve Ever Met

Shadow contact information has been a known feature of Facebook for a few years now, but most users remain unaware of its reach and power. Because shadow-profile connections happen inside Facebook’s algorithmic black box, people can’t see how deep the data-mining of their lives truly goes, until an uncanny recommendation pops up.

.. Handing over address books is one of the first steps Facebook asks people to take when they initially sign up, so that they can “Find Friends.”
.. Having issued this warning, and having acknowledged that people in your address book may not necessarily want to be connected to you, Facebook will then do exactly what it warned you not to do.
.. Facebook doesn’t like, and doesn’t use, the term “shadow profiles.” It doesn’t like the term because it sounds like Facebook creates hidden profiles for people who haven’t joined the network, which Facebook says it doesn’t do. The existence of shadow contact information came to light in 2013 after Facebook admitted it had discovered and fixed “a bug.” The bug was that when a user downloaded their Facebook file, it included not just their friends’ visible contact information, but also their friends’ shadow contact information.

.. Facebook does what it can to underplay how much data it gathers through contacts, and how widely it casts its net.

.. Through the course of reporting this story, I discovered that many of my own friends had uploaded their contacts. While encouraging me to do the same, Facebook’s smartphone app told me that 272 of my friends had already done so. That’s a quarter of all my friends.

.. When Steinfeld wrote “a friend or someone you might know,” he meant anyone—any person who might at some point have labeled your phone number or email or address in their own contacts. A one-night stand from 2008, a person you got a couch from on Craigslist in 2010, a landlord from 2013: If they ever put you in their phone, or you put them in yours, Facebook could log the connection if either party were to upload their contacts.
.. That accumulation of contact data from hundreds of people means that Facebook probably knows every address you’ve ever lived at, every email address you’ve ever used, every landline and cell phone number you’ve ever been associated with, all of your nicknames, any social network profiles associated with you, all your former instant message accounts, and anything else someone might have added about you to their phone book.
As far as Facebook is concerned, none of that even counts as your own information. It belongs to the users who’ve uploaded it, and they’re the only ones with any control over it.

.. It’s what the sociologist danah boyd calls “networked privacy”: All the people who know you and who choose to share their contacts with Facebook are making it easier for Facebook to make connections you may not want it to make—say if you’re in a profession like law, medicine, social work, or even journalism, where you might not want to be connected to people you encounter at work, because of what it could reveal about them or you, or because you may not have had a friendly encounter with them.

.. If just one person you know has contact information for both identities and gives Facebook access to it, your worlds collide. Bruce Wayne and Clark Kent would be screwed.
.. The company’s ability to perceive the threads connecting its billion-plus users around the globe led it to announce last year that it’s not six degrees that separate one person from another—it’s just three and a half.

.. The network can do contact chaining—if two different people both have an email address or phone number for you in their contact information, that indicates that they could possibly know each other, too.
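The chaining logic described above can be illustrated as a simple graph inference over shared identifiers. This is a toy sketch under my own assumptions, not Facebook’s actual algorithm or data model: if two uploaders’ address books both contain the same phone number or email, the system can flag them as a candidate pair for a feature like People You May Know.

```python
from collections import defaultdict

def infer_links(uploads):
    """uploads: dict mapping uploader -> set of contact identifiers
    (phone numbers, email addresses). Returns pairs of uploaders who
    share at least one identifier -- candidate connections.
    A toy illustration, not Facebook's real implementation."""
    # Invert the data: which uploaders hold each identifier?
    holders = defaultdict(set)
    for uploader, contacts in uploads.items():
        for ident in contacts:
            holders[ident].add(uploader)
    # Any two people holding the same identifier become a candidate pair.
    links = set()
    for people in holders.values():
        for a in people:
            for b in people:
                if a < b:
                    links.add((a, b))
    return links

uploads = {
    "alice": {"+1-555-0100", "pat@example.com"},
    "bob":   {"+1-555-0100"},   # alice and bob both saved the same number
    "carol": {"dana@example.com"},
}
print(infer_links(uploads))  # {('alice', 'bob')}
```

Note that neither “alice” nor “bob” had to share anything about themselves: the link is inferred entirely from a third party’s identifier that both happened to store.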

.. This is how a psychiatrist’s patients were recommended to one another and may be why a man had his secret biological daughter recommended to him. (He and she would have her parents’ contact information in common.)
.. And it may explain why a non-Facebook user had his ex-wife recommended to his girlfriend. Facebook doesn’t keep profiles for non-users, but it does use their contact information to connect people.

.. “Mobile phone numbers are even better than social security numbers for identifying people,” said security technologist Bruce Schneier by email. “People give them out all the time, and they’re strongly linked to identity.”

.. the social network is getting our not-for-sharing numbers and email addresses anyway by stealing them (albeit through ‘legitimate’ means) from our friends.”

What if you don’t like Facebook having this data about you? All you need to do is find every person who’s ever gotten your contact information and uploaded it to Facebook, and then ask them one by one to go to Facebook’s contact management page and delete it.

.. Facebook functions as a reverse phone-number look-up service; under the default settings, anyone can put your phone number into the search bar and pull up your account.
.. “You can limit who can look you up on Facebook by that phone number [or email address] to ‘friends.’ This is also a signal that People You May Know uses.
.. So if a stranger uploads his address book including that phone number [or email address, it] won’t be used to suggest you to that stranger in People You May Know.”
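The setting quoted above amounts to a visibility check at lookup time. Here is a minimal sketch of that check; the field names and structure are illustrative assumptions, not Facebook’s actual schema or API:

```python
def can_look_up(searcher, target, identifier):
    """Return True if `searcher` may find `target` by a phone number
    or email address. Toy model of the 'Who can look you up' setting;
    field names are hypothetical, not Facebook's real schema."""
    # The identifier must actually belong to the target account.
    if identifier not in target["identifiers"]:
        return False
    setting = target.get("lookup_setting", "everyone")  # default is open
    if setting == "everyone":
        return True
    if setting == "friends":
        return searcher in target["friends"]
    return False

user = {
    "identifiers": {"+1-555-0123"},
    "lookup_setting": "friends",   # restricted from the default
    "friends": {"alice"},
}
print(can_look_up("stranger", user, "+1-555-0123"))  # False
print(can_look_up("alice", user, "+1-555-0123"))     # True
```

Under the default “everyone” setting the first call would return True, which is the reverse-lookup behavior the article describes.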

Zuckerberg’s Preposterous Defense of Facebook

Are you bothered by fake news, systematic misinformation campaigns and Facebook “dark posts” — micro-targeted ads not visible to the public — aimed at African-Americans to discourage them from voting? You must be one of those people “upset about ideas” you disagree with.

Are you troubled when agents of a foreign power pose online as American Muslims and post incendiary content that right-wing commentators can cite as evidence that all American Muslims are sympathizers of terrorist groups like the Islamic State? Sounds like you can’t handle a healthy debate.

Does it bother you that Russian actors bought advertisements aimed at swing states to sow political discord during the 2016 presidential campaign, and that it took eight months after the election to uncover any of this? Well, the marketplace of ideas isn’t for everyone.

.. bias in the digital sphere is structurally different from that in mass media, and a lot more complicated than what programmers believe.

.. what matters most is not the political beliefs of the employees but the structures, algorithms and incentives they set up, as well as what oversight, if any, they employ to guard against deception, misinformation and illegitimate meddling.

.. by design, business model and algorithm, Facebook has made itself easy to weaponize for spreading misinformation and fraudulent content.

.. this business model is also lucrative, especially during elections. Sheryl Sandberg, Facebook’s chief operating officer, called the 2016 election “a big deal in terms of ad spend” for the company

.. Facebook responds to such pressure as much of the traditional media do: by caving and hiding behind flimsy “there are two sides to everything” arguments.

.. Even the conservative pundit and wild-eyed conspiracy theorist Glenn Beck, of all people, has expressed befuddlement at the charge that Facebook censored conservative content.

.. He has correctly pointed out that Facebook has been a boon for right-wing groups, especially of the alt-right and Breitbart variety.

Anatomy of a Moral Panic

On September 18, Britain’s Channel 4 ran a news segment with the headline “Potentially deadly bomb ingredients are ‘frequently bought together’ on Amazon.”

.. The real story in this mess is not the threat that algorithms pose to Amazon shoppers, but the threat that algorithms pose to journalism. By forcing reporters to optimize every story for clicks, not giving them time to check or contextualize their reporting, and requiring them to race to publish follow-on articles on every topic, the clickbait economics of online media encourage carelessness and drama. This is particularly true for technical topics outside the reporter’s area of expertise.

And reporters have no choice but to chase clicks. Because Google and Facebook have a duopoly on online advertising, the only measure of success in publishing is whether a story goes viral on social media. Authors are evaluated by how individual stories perform online, and face constant pressure to make them more arresting. Highly technical pieces are farmed out to junior freelancers working under strict time limits. Corrections, if they happen at all, are inserted quietly through ‘ninja edits’ after the fact.

There is no real penalty for making mistakes, but there is enormous pressure to frame stories in whatever way maximizes page views. Once those stories get picked up by rival news outlets, they become ineradicable. The sheer weight of copycat coverage creates the impression of legitimacy. As the old adage has it, a lie can get halfway around the world while the truth is pulling its boots on.

Earlier this year, when the Guardian published an equally ignorant (and far more harmful) scare piece about a popular secure messenger app, it took a group of security experts six months of cajoling and pressure to shame the site into amending its coverage. And the Guardian is a prestige publication, with an independent public editor. Not every story can get such editorial scrutiny on appeal, or attract the sympathetic attention of Teen Vogue.

The very machine learning systems that Channel 4’s article purports to expose are eroding online journalism’s ability to do its job.

Moral panics like this one are not just harmful to musket owners and model rocket builders. They distract and discredit journalists, making it harder for the press to perform its essential function of serving as a check on the powerful.

The real story of machine learning is not how it promotes home bomb-making, but that it’s being deployed at scale with minimal ethical oversight, in the service of a business model that relies entirely on psychological manipulation and mass surveillance. The capacity to manipulate people at scale is being sold to the highest bidder, and has infected every aspect of civic life, including democratic elections and journalism.

Together with climate change, this algorithmic takeover of the public sphere is the biggest news story of the early 21st century.