On September 18, the British Channel 4 ran a news segment with the headline, ‘Potentially deadly bomb ingredients are “frequently bought together” on Amazon.’
.. The real story in this mess is not the threat that algorithms pose to Amazon shoppers, but the threat that algorithms pose to journalism. By forcing reporters to optimize every story for clicks, not giving them time to check or contextualize their reporting, and requiring them to race to publish follow-on articles on every topic, the clickbait economics of online media encourage carelessness and drama. This is particularly true for technical topics outside the reporter’s area of expertise.
And reporters have no choice but to chase clicks. Because Google and Facebook have a duopoly on online advertising, the only measure of success in publishing is whether a story goes viral on social media. Authors are evaluated by how individual stories perform online, and face constant pressure to make them more arresting. Highly technical pieces are farmed out to junior freelancers working under strict time limits. Corrections, if they happen at all, are inserted quietly through ‘ninja edits’ after the fact.
There is no real penalty for making mistakes, but there is enormous pressure to frame stories in whatever way maximizes page views. Once those stories get picked up by rival news outlets, they become ineradicable. The sheer weight of copycat coverage creates the impression of legitimacy. As the old adage has it, a lie can get halfway around the world while the truth is pulling its boots on.
Earlier this year, when the Guardian published an equally ignorant (and far more harmful) scare piece about a popular secure messenger app, it took a group of security experts six months of cajoling and pressure to shame the site into amending its coverage. And the Guardian is a prestige publication, with an independent public editor. Not every story can get such editorial scrutiny on appeal, or attract the sympathetic attention of Teen Vogue.
The very machine learning systems that Channel 4’s article purports to expose are eroding online journalism’s ability to do its job.
Moral panics like this one are not just harmful to musket owners and model rocket builders. They distract and discredit journalists, making it harder to perform the essential function of serving as a check on the powerful.
The real story of machine learning is not how it promotes home bomb-making, but that it’s being deployed at scale with minimal ethical oversight, in the service of a business model that relies entirely on psychological manipulation and mass surveillance. The capacity to manipulate people at scale is being sold to the highest bidder, and has infected every aspect of civic life, including democratic elections and journalism.
Together with climate change, this algorithmic takeover of the public sphere is the biggest news story of the early 21st century.
We’re all trying to understand why people can’t just get along. The emerging consensus in Silicon Valley is that polarization is a baffling phenomenon, but we can fight it with better fact-checking, with more empathy, and (at least in Facebook’s case) with advanced algorithms to try to guide conversations between opposing camps in a more productive direction.
.. A question few are asking is whether the tools of mass surveillance and social control we spent the last decade building could have had anything to do with the debacle of the 2016 election, or whether destroying local journalism and making national journalism so dependent on our platforms was, in retrospect, a good idea.
We built the commercial internet by mastering techniques of persuasion and surveillance that we’ve extended to billions of people, including essentially the entire population of the Western democracies. But admitting that this tool of social control might be conducive to authoritarianism is not something we’re ready to face. After all, we’re good people. We like freedom. How could we have built tools that subvert it?
.. The economic basis of the Internet is surveillance. Every interaction with a computing device leaves a data trail, and whole industries exist to consume this data.
.. Facebook is the primary source of news for a sizable fraction of Americans, and through its feed algorithm (which determines who sees what) has an unparalleled degree of editorial control over what that news looks like.
.. Together, Facebook and Google control some 65% of the online ad market, which in 2015 was estimated at $60B. Of that, half went to Google and $8B to Facebook.
.. These companies exemplify the centralized, feudal Internet of 2017. While the protocols that comprise the Internet remain open and free, in practice a few large American companies dominate every aspect of online life. Google controls search and email, AWS controls cloud hosting, Apple and Google have a duopoly in mobile phone operating systems. Facebook is the one social network.
.. There are two interlocking motives for this data hunger: to target online advertising, and to train machine learning algorithms.
.. A considerable fraction (only Google and Facebook have the numbers) of the money sloshing around goes to scammers.
.. The more poorly current ads perform, the more room there is to tell convincing stories about future advertising technology, which of course will require new forms of surveillance.
.. The real profits from online advertising go to the companies running the casino—Facebook and Google.
.. we assumed that when machines reached near-human performance in tasks like image recognition, it would be thanks to fundamental breakthroughs into the nature of cognition. We would be able to lift the lid on the human mind and see all the little gears turning.
What’s happened instead is odd. We found a way to get terrific results by combining fairly simple math with enormous data sets. But this discovery did not advance our understanding. The mathematical techniques used in machine learning don’t have a complex, intelligible internal structure we can reason about. Like our brains, they are a wild, interconnected tangle.
.. The algorithms learn to show people the things they are most likely to ‘engage’ with—click, share, view, and react to. We make them very good at provoking these reactions from people.
.. If you concede that they work just as well for politics as for commerce, you’re inviting government oversight. If you claim they don’t work well at all, you’re telling advertisers they’re wasting their money.
Facebook and Google have tied themselves into pretzels over this. The idea that these mechanisms of persuasion could be politically useful, and especially that they might be more useful to one side than the other, violates cherished beliefs about the “apolitical” tech industry.
.. All the algorithms know is what they measure, which is the same for advertising as it is in politics: engagement, time on site, who shared what, who clicked what, and who is likely to come back for more.
The persuasion works, and it works the same way in politics as it does in commerce—by getting a rise out of people.
But political sales techniques that maximize “engagement” have troubling implications in a democracy.
.. One problem is that any system trying to maximize engagement will try to push users towards the fringes. You can prove this to yourself by opening YouTube in an incognito browser (so that you start with a blank slate), and clicking recommended links on any video with political content. When I tried this experiment last night, within five clicks I went from a news item about demonstrators clashing in Berkeley to a conspiracy site claiming Trump was planning WWIII with North Korea, and another exposing FEMA’s plans for genocide.
This pull to the fringes doesn’t happen if you click on a cute animal story. In that case, you just get more cute animals (an experiment I also recommend trying). But the algorithms have learned that users interested in politics respond more if they’re provoked more, so they provoke. Nobody programmed the behavior into the algorithm; it made a correct observation about human nature and acted on it.
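The dynamic described above can be made concrete with a toy model. The sketch below is not YouTube's actual system; it is a minimal epsilon-greedy recommender over ten hypothetical items, with an assumed click-through rate that rises with how provocative an item is. Nothing in the code says "prefer extremes" — the drift toward the fringe emerges purely from maximizing observed clicks:

```python
import random

def simulate_feed(steps=5000, epsilon=0.1, seed=1):
    """Toy epsilon-greedy recommender over items ranked 0 (mild) to
    9 (extreme). Returns how many times each item was shown."""
    rng = random.Random(seed)
    items = list(range(10))
    # Assumed click-through rates rising with provocation level.
    # These numbers are made up for illustration, not real data.
    ctr = {i: 0.05 + 0.02 * i for i in items}
    shows = {i: 0 for i in items}
    clicks = {i: 0 for i in items}

    def observed_rate(i):
        # Optimistic estimate for never-shown items, so each gets tried once.
        return clicks[i] / shows[i] if shows[i] else 1.0

    for _ in range(steps):
        if rng.random() < epsilon:
            item = rng.choice(items)              # explore a random item
        else:
            item = max(items, key=observed_rate)  # exploit the best so far
        shows[item] += 1
        if rng.random() < ctr[item]:
            clicks[item] += 1
    return shows
```

Run it and the show counts pile up on the most provocative items, even though the algorithm was only ever told to maximize clicks. That is the whole argument in miniature: the objective is neutral, the outcome is not.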
Social dynamics on sites where people share links can compound this radicalizing force. The way to maximize engagement on Twitter, for example, is to say provocative things, or hoist an opponent’s tweets out of context in order to use them as a rhetorical bludgeon. Twitter rewards the captious.
.. So without explicitly coding for this behavior, we already have a dynamic where people are pulled to the extremes. Things get worse when third parties are allowed to use these algorithms to target a specific audience.
.. Political speech that tries to fly below the radar has always existed, but in the past it was possible to catch it and call it out. When no two people see the same thing, it becomes difficult to trace orchestrated attempts to target people in political campaigns. These techniques of micro-targeted political advertising were used to great effect in both the Brexit vote and the US election.
.. This is an inversion in political life that we haven’t seen before. Conversations between people that used to be private, or semi-private, now take place on public forums where they are archived forever. Meanwhile, the kind of political messaging that used to take place in public view is now visible only to an audience of one.
.. Politically engaged people spend more time online and click more ads. Alarmist and conspiracy-minded consumers also make good targets for certain kinds of advertising. Listen to talk radio or go to prepper websites and you will find pure hucksterism—supplements, gold coins, mutual funds—being pitched by the same people who deliver the apocalyptic theories.
Many of the sites peddling fake news during the election operated solely for profit, and field-tested articles on both sides of the political spectrum. This time around, they found the right to be more lucrative, so we got fake news targeted at Trump voters.
.. Apart from the obvious chilling effect on political expression when everything you say is permanently recorded, there is the chilling effect of your own peer group, and the lingering doubt that anything you say privately can ever truly stay private.
.. Orwell imagined a world in which the state could shamelessly rewrite the past. The Internet has taught us that people are happy to do this work themselves, provided they have their peer group with them, and a common enemy to unite against. They will happily construct alternative realities for themselves, and adjust them as necessary to fit the changing facts.
Finally, surveillance capitalism makes it harder to organize effective long-term dissent. In a setting where attention is convertible into money, social media will always reward drama, dissent, conflict, iconoclasm and strife. There will be no comparable rewards for cooperation, de-escalation, consensus-building, or compromise, qualities that are essential for the slow work of building a movement. People who should be looking past their differences will instead spend their time on purity tests and trying to outflank one another in a race to the fringes.
.. Moreover, powerful people have noted and benefited from the special power of social media in the political arena. They will not sit by and let programmers dismantle useful tools for influence and social control. It doesn’t matter that the tech industry considers itself apolitical and rationalist. Powerful people did not get to be that way by voluntarily ceding power.
.. Consider the example of the Women’s March. The March was organized on Facebook, and 3-4 million people attended. The list of those who RSVP’d is now stored on Facebook servers and will be until the end of time, or until Facebook goes bankrupt, gets hacked, gets bought by a hedge fund, or some rogue sysadmin decides that list needs to be made public.
.. We need the parts of these sites that are used heavily for organizing, like Google Groups or Facebook event pages, to become more ephemeral.
.. These features are sometimes called ‘disappearing’, but there is nothing furtive about it. Rather, this is just getting our software to more faithfully reflect human life.
.. You don’t carry all your valuables and private documents when you travel. Similarly, social sites should offer a trip mode where the view of your account is limited to recent contacts and messages.
.. I’ve pushed for “Six Fixes” to the Internet. I’ll push for them again!
- The right to examine, download, and delete any data stored about you.
- A time horizon (weeks, not years) for how long companies are allowed to retain behavioral data (any data about yourself you didn’t explicitly provide).
- A prohibition on selling or transferring collections of behavioral data, whether outright, in an acquisition, or in bankruptcy.
- A ban on third-party advertising. Ad networks can still exist, but they can only serve ads targeted against page content, and they cannot retain information between ad requests.
- An off switch on Internet-connected devices that physically cuts their access to the network. This switch should not prevent the device from functioning offline. You should be able to stop the malware on your refrigerator from posting racist rants on Twitter while still keeping your beer cold.
- A legal framework for offering certain privacy guarantees, with enforceable consequences. Think of this as a Creative Commons for privacy. If they can be sure data won’t be retained, users will be willing to experiment with many technologies that would pose too big a privacy risk in the current reality.
.. At a minimum, we need to break up Facebook so that its social features are divorced from the news feed.
.. But it cannot simultaneously be the platform for political organizing, political campaigns, and news delivery.
.. Shareholder pressure doesn’t work, because the large tech companies are structured to give founders absolute control no matter how many shares they own.
.. The one effective lever we have against tech companies is employee pressure. Software engineers are difficult to hire, expensive to train, and take a long time to replace. Small teams in critical roles (like operations or security) have the power to shut down a tech company if they act in concert.
.. Unfortunately, the enemy is complacency. Tech workers trust their founders, find labor organizing distasteful, and are happy to leave larger ethical questions to management. A workplace free of ‘politics’ is just one of the many perks the tech industry offers its pampered employees. So our one chance to enact meaningful change is slipping away.
Last week Forbes went so far as to call the social graph an exploitable resource comparable to crude oil, with riches awaiting whoever figures out how to mine it and refine it.
.. In order to model something as a graph, you have to have a clear definition of what its nodes and edges represent. In most social sites, this does not pose a problem. The nodes are users, while edges mean something like ‘accepted a connection request from’, or ‘followed’, or ‘exchanged email with’, depending on where you are.
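The point that an edge is only meaningful once you say what relationship it records can be sketched in a few lines. The class and user names below are made up for illustration; this is just a directed graph whose edges carry labels:

```python
from collections import defaultdict

class SocialGraph:
    """Minimal directed graph with labeled edges. The label is the
    whole ballgame: the same pair of users can be connected by a
    'followed' edge, an 'accepted_request' edge, or both."""
    def __init__(self):
        # Maps (user, relationship label) -> set of connected users.
        self.edges = defaultdict(set)

    def add(self, a, label, b):
        self.edges[(a, label)].add(b)

    def neighbors(self, a, label):
        return self.edges[(a, label)]

g = SocialGraph()
g.add("alice", "followed", "bob")          # one-way, Twitter-style
g.add("carol", "accepted_request", "dan")  # mutual, Facebook-style
g.add("dan", "accepted_request", "carol")
```

Note that the one-way edge does the site's work for it: alice following bob implies nothing about bob, which is exactly the asymmetry a follower model encodes.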
.. In the old country, for example, we have two kinds of ‘friendship’ (distinguished by whether you address one another with the informal pronoun) and going from one status to the other is a pretty big deal; you have to drink a toast with your arms all in a pretzel and it’s considered a huge faux pas to suggest it before both people feel ready. But at least it’s not ambiguous!
.. There’s also the matter of things that XFN doesn’t allow you to describe. There’s no nemesis or rival, since the standards writers wanted to exclude negativity. The gender-dependent second e on fiancé(e) panicked the spec writers, so they left that relationship out. Neither will they allow you to declare an ex-spouse or an ex-colleague.
.. You can call this nitpicking, but this stuff matters! This is supposed to be a canonical representation of human relationships. But it only takes five minutes of reading the existing standards to see that they’re completely inadequate.
.. This obsession with modeling has led us into a social version of the Uncanny Valley, that weird phenomenon from computer graphics where the more faithfully you try to represent something human, the creepier it becomes. As the model becomes more expressive, we really start to notice the places where it fails.
.. The problem FOAF ran headlong into was that declaring relationships explicitly is a social act. Documenting my huge crush on Matt in an XML snippet might faithfully reflect the state of the world, but it also broadcasts a strong signal about me to others, and above all to Matt. The essence of a crush is that it’s furtive, so by declaring it in this open (but weirdly passive) way I’ve turned it into something different.
.. Declaring connections is about as much fun as trying to whittle down a guest list, with the added stress that social networking is too new for us to have shared social conventions around it.
.. The social graph wants to turn us back into third graders, laboriously spelling out just who is our fifth-best-friend. But there’s a reason we stopped doing that kind of thing in third grade!
.. Asking computer nerds to design social software is like hiring a Mormon bartender.
.. friendship is not transitive. There’s just no way to tell if you’ll get along with someone in my social circle, no matter how many friends we have in common.
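Non-transitivity is easy to state formally: knowing ana~ben and ben~cat tells you nothing about ana~cat. A two-line counterexample (names invented for illustration):

```python
# Friendship as a symmetric relation stored as unordered pairs.
friends = {frozenset(p) for p in [("ana", "ben"), ("ben", "cat")]}

def are_friends(a, b):
    return frozenset((a, b)) in friends

# Transitivity would require: ana~ben and ben~cat implies ana~cat.
transitive = (not (are_friends("ana", "ben") and are_friends("ben", "cat"))
              or are_friends("ana", "cat"))
```

Here `transitive` comes out false, which is the whole point: any system that recommends friends-of-friends is papering over a relation that simply doesn't compose.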
.. Imagine the U.S. Census as conducted by direct marketers – that’s the social graph.
Social networks exist to sell you crap. The icky feeling you get when your friend starts to talk to you about Amway, or when you spot someone passing out business cards at a birthday party, is the entire driving force behind a site like Facebook.
.. We have a name for the kind of person who collects a detailed, permanent dossier on everyone they interact with, with the intent of using it to manipulate others for personal advantage – we call that person a sociopath.
.. Open data advocates tell us the answer is to reclaim this obsessive dossier for ourselves, so we can decide where to store it. But this misses the point of how stifling it is to have such a permanent record in the first place
.. Give people something cool to do and a way to talk to each other, moderate a little bit, and your job is done. Games like Eve Online or WoW have developed entire economies on top of what’s basically a message board. MetaFilter, Reddit, LiveJournal and SA all started with a couple of buttons and a text field and have produced some fascinating subcultures. And maybe the purest (!) example is 4chan, a Lord of the Flies community that invents all the stuff you end up sharing elsewhere: image macros, copypasta, rage comics, the lolrus. The data model for 4chan is three fields long – image, timestamp, text.
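That three-field claim is almost literally implementable. A sketch in Python (the `Post` name and sample thread are assumptions for illustration, not 4chan's actual code):

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

@dataclass
class Post:
    """The entire data model, as described: three fields."""
    image: Optional[bytes]   # optional attached image, or None
    timestamp: datetime
    text: str

# A "thread" is then nothing more than a list of posts.
thread = [
    Post(None, datetime.now(timezone.utc), "first"),
    Post(b"\x89PNG...", datetime.now(timezone.utc), "image macro goes here"),
]
```

Everything else — threads, boards, the entire culture — is behavior layered on top by the users, not structure imposed by the schema.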
Now tell me one bit of original culture that’s ever come out of Facebook.