It should be possible to automatically identify dubious news sources — but we’ll need a lot more data
You’ve probably heard of machine learning and artificial intelligence, but are you sure you know what they…
I think one could propose a whole list of unhelpful ways of talking about current developments in machine learning. For example:
- Data is the new oil
- Google and China (or Facebook, or Amazon, or BAT) have all the data
- AI will take all the jobs
- And, of course, saying AI itself.
More useful things to talk about, perhaps, might be:
- Enabling technology layers
- Relational databases.
.. Before relational databases appeared in the late 1970s, if you wanted your database to show you, say, ‘all customers who bought this product and live in this city’, that would generally need a custom engineering project. Databases were not built with structure such that any arbitrary cross-referenced query was an easy, routine thing to do. If you wanted to ask a question, someone would have to build it. Databases were record-keeping systems; relational databases turned them into business intelligence systems.
This changed what databases could be used for in important ways, and so created new use cases and new billion dollar companies. Relational databases gave us Oracle, but they also gave us SAP, and SAP and its peers gave us global just-in-time supply chains – they gave us Apple and Starbucks.
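To make the 'all customers who bought this product and live in this city' example concrete, here is a minimal sketch of that query as a routine join, using Python's built-in `sqlite3` module. The table names, column names, sample rows and the helper names `build_demo_db` and `customers_who_bought_in_city` are all hypothetical, invented purely for illustration; they come from no real system.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with made-up customers and orders."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            city TEXT NOT NULL
        );
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            product TEXT NOT NULL
        );
        INSERT INTO customers VALUES
            (1, 'Ada', 'London'),
            (2, 'Grace', 'New York'),
            (3, 'Linus', 'London');
        INSERT INTO orders VALUES
            (1, 1, 'widget'),
            (2, 2, 'widget'),
            (3, 3, 'gadget'),
            (4, 1, 'widget');
        """
    )
    return conn


def customers_who_bought_in_city(conn, product, city):
    """Names of customers living in `city` who bought `product`."""
    rows = conn.execute(
        """
        SELECT DISTINCT c.name
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        WHERE o.product = ? AND c.city = ?
        ORDER BY c.name
        """,
        (product, city),
    ).fetchall()
    return [name for (name,) in rows]


conn = build_demo_db()
print(customers_who_bought_in_city(conn, "widget", "London"))
```

Asking a different question (another product, another city, or a different pair of tables) means changing the query text rather than commissioning an engineering project, which is the shift the passage describes.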
.. with each wave of automation, we imagine we’re creating something anthropomorphic or something with general intelligence. In the 1920s and 30s we imagined steel men walking around factories holding hammers, and in the 1950s we imagined humanoid robots walking around the kitchen doing the housework. We didn’t get robot servants – we got washing machines... machine learning lets us solve classes of problem that computers could not usefully address before, but each of those problems will require a different implementation, and different data, a different route to market, and often a different company... Machine learning is not going to create HAL 9000 (at least, very few people in the field think that it will do so any time soon), but it’s also not useful to call it ‘just statistics’... this might be rather like talking about SQL in 1980 – how do you get from explaining table joins to thinking about Salesforce.com? It’s all very well to say ‘this lets you ask these new kinds of questions’, but it isn’t always very obvious what questions.
- .. Machine learning may well deliver better results for questions you’re already asking about data you already have.
- .. Machine learning lets you ask new questions of the data you already have. For example, a lawyer doing discovery might search for ‘angry’ emails, or ‘anxious’ or anomalous threads or clusters of documents, as well as doing keyword searches.
- .. machine learning opens up new data types to analysis – computers could not really read audio, images or video before and now, increasingly, that will be possible.
.. Within this, I find imaging much the most exciting. Computers have been able to process text and numbers for as long as we’ve had computers, but images (and video) have been mostly opaque.
.. Now they’ll be able to ‘see’ in the same sense as they can ‘read’. This means that image sensors (and microphones) become a whole new input mechanism – less a ‘camera’ than a new, powerful and flexible sensor that generates a stream of (potentially) machine-readable data. All sorts of things will turn out to be computer vision problems that don’t look like computer vision problems today.
.. I met a company recently that supplies seats to the car industry, which has put a neural network on a cheap DSP chip with a cheap smartphone image sensor, to detect whether there’s a wrinkle in the fabric (we should expect all sorts of similar uses for machine learning in very small, cheap widgets, doing just one thing, as described here). It’s not useful to describe this as ‘artificial intelligence’: it’s automation of a task that could not previously be automated. A person had to look.
.. one of my colleagues suggested that machine learning will be able to do anything you could train a dog to do
.. Andrew Ng has suggested that ML will be able to do anything you could do in less than one second.
.. I prefer the metaphor that this gives you infinite interns, or, perhaps, infinite ten year olds.
.. Five years ago, if you gave a computer a pile of photos, it couldn’t do much more than sort them by size. A ten year old could sort them into men and women, a fifteen year old into cool and uncool and an intern could say ‘this one’s really interesting’. Today, with ML, the computer will match the ten year old and perhaps the fifteen year old. It might never get to the intern. But what would you do if you had a million fifteen year olds to look at your data? What calls would you listen to, what images would you look at, and what file transfers or credit card payments would you inspect?
.. machine learning doesn’t have to match experts or decades of experience or judgement. We’re not automating experts. Rather, we’re asking ‘listen to all the phone calls and find the angry ones’. ‘Read all the emails and find the anxious ones’. ‘Look at a hundred thousand photos and find the cool (or at least weird) people’.
.. this is what automation always does;
- Excel didn’t give us artificial accountants,
- Photoshop and InDesign didn’t give us artificial graphic designers and indeed
- steam engines didn’t give us artificial horses. ..
Rather, we automated one discrete task, at massive scale.
.. Where this metaphor breaks down (as all metaphors do) is in the sense that in some fields, machine learning can not just find things we can already recognize, but find things that humans can’t recognize, or find levels of pattern, inference or implication that no ten year old (or 50 year old) would recognize.
.. This is best seen in DeepMind’s AlphaGo. AlphaGo doesn’t play Go the way the chess computers played chess – by analysing every possible tree of moves in sequence. Rather, it was given the rules and a board and left to try to work out strategies by itself, playing more games against itself than a human could do in many lifetimes. That is, this is not so much a thousand interns as one intern that’s very, very fast, and you give your intern 10 million images and they come back and say ‘it’s a funny thing, but when I looked at the third million images, this pattern really started coming out’.
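For contrast, here is a minimal sketch of the brute-force style the passage attributes to chess computers, applied to a toy take-away game rather than chess or Go. The game, its rule (take 1 to 3 stones, whoever takes the last stone wins) and the names `can_win` and `MAX_TAKE` are assumptions made up for illustration. It only works because the game is tiny.

```python
MAX_TAKE = 3  # assumed rule: a player may take 1 to 3 stones per turn


def can_win(stones: int) -> bool:
    """True if the player to move can force a win with `stones` left.

    The player who takes the last stone wins. The search walks the game
    tree move by move; nothing is learned and no heuristics are used.
    """
    for take in range(1, min(MAX_TAKE, stones) + 1):
        # If some move leaves the opponent in a losing position, we win.
        if not can_win(stones - take):
            return True
    # Either there are no moves (stones == 0) or every move lets the
    # opponent force a win.
    return False


losing_positions = [n for n in range(1, 13) if not can_win(n)]
print(losing_positions)  # prints [4, 8, 12]
```

The number of positions visited grows exponentially with the size of the pile. That is tolerable for a toy, but not for a game with as many possible positions as Go.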
.. what fields are narrow enough that we can tell an ML system the rules (or give it a score), but deep enough that looking at all of the data, as no human could ever do, might bring out new results?
It’s ironic on so many levels. The first level is, of course, the irony of refusing to do military work at a company that only exists because of defense contractors working on a military project (ARPANET).
More generally, the irony is that the U.S. military has made a far greater positive contribution to the world than Google. Under the Pax Americana, we have seen the greatest number of people rise out of abject poverty in human history. The stable, liberal world order that has been beneficial to so many people has been bankrolled by the U.S. and backed by the U.S. military.
This world where people in India and Pakistan are using Gchat and Facebook to talk to each other instead of waging nuclear war against each other is not the result of Google or Facebook. It’s not the result of humans evolving beyond their tendencies towards warfare. It’s because the U.S. military has made entire classes of armed conflicts untenable... This country only exists because of Puritan persecution in England, but you don’t see me thanking the Anglican church for America. Bad means result in good ends all the time; that doesn’t mean we should celebrate bad means. I agree that the US military has made positive contributions to the world, but I don’t think it’s the main source of the Pax Americana: strong international bodies (NATO, UN, &c), the tendency for democratic nations (the dominant sort in the 20th century) to avoid wars with each other, and advancements in crop science are all individually more responsible for the relative global stability of the last 30 years.
I don’t deny that the military had a role (usually financial) in any or all of the above, but I wouldn’t call it a causal role: virtually all academic research funding hits the defense world eventually (“food security”, “ecological security”, &c), especially during the Cold War. That’s the result of political contrivances, not any sort of deep connection between the U.S. military and scientific progress.
Finally, I wonder about drawing comparisons between the past U.S. military and current ventures. The Google engineers in question probably wouldn’t be designing waterproof radios for fastboats; they’d be training models that “recognize” “terrorists” from afar and systems that pass that information to drones for remote killing. Put another way: the shift away from conventional warfare changes the moral dimensions of working for the military.
.. It’s not that NATO and the UN are powerful or effective as bodies independent of the US, it’s that their greatest achievements have occurred without direct US military intervention. NATO and the UN both benefit (and suffer) from the power and presence of the US military, but their proudest moments (the German economic miracle, smallpox eradication, historic decreases in child mortality and malnutrition) all stem from smart policy and liberal principles, not from the looming threat of American tanks... The fact that people are spurred into action by violence doesn’t mean that we ought to be violent, or that violence is even the most effective way to get people to act in the way you want.
.. At the risk of sort of invoking Godwin, I’m curious if you’d apply the same logic to Stalin and the fall of Nazi Germany. What does the victory in the Battle of Berlin say about whether Stalin was good or bad?.. an academic institution is a public institution focused on advancing science and teaching it to the public. A corporation or contractor is in it for the profit. Notably, the academics involved were able to publish their work as the TCP/IP standard (and others), and anyone was able to use it. If it had actually been military contractors we would not have the internet as it is today... Cerf (Stanford) and Kahn (DARPA) designed TCP. But BBN (now a subsidiary of Raytheon) built ARPANET... The best the military does is keep stability. Google changed everything. One is static; the other, change. I don’t see them as comparable at all. Anyway, I’ve heard it said that the container ship has done more to lift the world standard up than all the political action of the last 1000 years, by distributing goods to all the corners of the world at a cheap price.
.. > Under the Pax Americana, we have seen the greatest number of people rise out of abject poverty in human history. The stable, liberal world order that has been beneficial to so many people has been bankrolled by the U.S. and backed by the U.S. military. It’s perfectly coherent to support the overall ends (relative world peace) and oppose the means (extrajudicial drone strikes, invasion of Iraq, etc.)