Ways to think about machine learning

 

I think one could propose a whole list of unhelpful ways of talking about current developments in machine learning. For example:

  • Data is the new oil
  • Google and China (or Facebook, or Amazon, or BAT) have all the data
  • AI will take all the jobs
  • And, of course, saying AI itself.

More useful things to talk about, perhaps, might be:

  • Automation
  • Enabling technology layers
  • Relational databases.

.. Before relational databases appeared in the late 1970s, if you wanted your database to show you, say, ‘all customers who bought this product and live in this city’, that would generally need a custom engineering project. Databases were not built with structure such that any arbitrary cross-referenced query was an easy, routine thing to do. If you wanted to ask a question, someone would have to build it. Databases were record-keeping systems; relational databases turned them into business intelligence systems.
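As a rough illustration of why that kind of question became routine, here is a minimal sketch using Python's built-in `sqlite3` module. The table names, column names and sample rows are all invented for this example; the point is only that the cross-referenced query is a single join rather than an engineering project.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with made-up customers and orders."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
        CREATE TABLE orders (customer_id INTEGER REFERENCES customers(id), product TEXT);
        INSERT INTO customers VALUES
            (1, 'Ada', 'London'), (2, 'Grace', 'New York'), (3, 'Alan', 'London');
        INSERT INTO orders VALUES
            (1, 'kettle'), (2, 'kettle'), (3, 'toaster'), (1, 'toaster');
        """
    )
    return conn


def customers_who_bought(conn, product, city):
    """Return sorted names of customers in `city` who bought `product`."""
    rows = conn.execute(
        """
        SELECT DISTINCT c.name
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        WHERE o.product = ? AND c.city = ?
        ORDER BY c.name
        """,
        (product, city),
    ).fetchall()
    return [name for (name,) in rows]


if __name__ == "__main__":
    conn = build_demo_db()
    print(customers_who_bought(conn, "kettle", "London"))
```

Changing the question (a different product, a different city, or a different pair of columns altogether) means changing the query text, not rebuilding the system.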

This changed what databases could be used for in important ways, and so created new use cases and new billion dollar companies. Relational databases gave us Oracle, but they also gave us SAP, and SAP and its peers gave us global just-in-time supply chains – they gave us Apple and Starbucks.

.. with each wave of automation, we imagine we’re creating something anthropomorphic or something with general intelligence. In the 1920s and 30s we imagined steel men walking around factories holding hammers, and in the 1950s we imagined humanoid robots walking around the kitchen doing the housework. We didn’t get robot servants – we got washing machines.

.. Washing machines are robots, but they’re not ‘intelligent’. They don’t know what water or clothes are. Moreover, they’re not general purpose even in the narrow domain of washing – you can’t put dishes in a washing machine, nor clothes in a dishwasher.
.. machine learning lets us solve classes of problem that computers could not usefully address before, but each of those problems will require a different implementation, and different data, a different route to market, and often a different company.
.. Machine learning is not going to create HAL 9000 (at least, very few people in the field think that it will do so any time soon), but it’s also not useful to call it ‘just statistics’.
.. this might be rather like talking about SQL in 1980 – how do you get from explaining table joins to thinking about Salesforce.com? It’s all very well to say ‘this lets you ask these new kinds of questions’, but it isn’t always very obvious what questions.
  1. .. Machine learning may well deliver better results for questions you’re already asking about data you already have.
  2. .. Machine learning lets you ask new questions of the data you already have. For example, a lawyer doing discovery might search for ‘angry’ emails, or ‘anxious’ or anomalous threads or clusters of documents, as well as doing keyword searches.
  3. .. machine learning opens up new data types to analysis – computers could not really read audio, images or video before and now, increasingly, that will be possible.

.. Within this, I find imaging much the most exciting. Computers have been able to process text and numbers for as long as we’ve had computers, but images (and video) have been mostly opaque.

.. Now they’ll be able to ‘see’ in the same sense as they can ‘read’. This means that image sensors (and microphones) become a whole new input mechanism – less a ‘camera’ than a new, powerful and flexible sensor that generates a stream of (potentially) machine-readable data. All sorts of things will turn out to be computer vision problems that don’t look like computer vision problems today.

.. I met a company recently that supplies seats to the car industry, which has put a neural network on a cheap DSP chip with a cheap smartphone image sensor, to detect whether there’s a wrinkle in the fabric (we should expect all sorts of similar uses for machine learning in very small, cheap widgets, doing just one thing, as described here). It’s not useful to describe this as ‘artificial intelligence’: it’s automation of a task that could not previously be automated. A person had to look.

.. one of my colleagues suggested that machine learning will be able to do anything you could train a dog to do

.. Andrew Ng has suggested that ML will be able to do anything you could do in less than one second.

.. I prefer the metaphor that this gives you infinite interns, or, perhaps, infinite ten year olds.

.. Five years ago, if you gave a computer a pile of photos, it couldn’t do much more than sort them by size. A ten year old could sort them into men and women, a fifteen year old into cool and uncool and an intern could say ‘this one’s really interesting’. Today, with ML, the computer will match the ten year old and perhaps the fifteen year old. It might never get to the intern. But what would you do if you had a million fifteen year olds to look at your data? What calls would you listen to, what images would you look at, and what file transfers or credit card payments would you inspect?

.. machine learning doesn’t have to match experts or decades of experience or judgement. We’re not automating experts. Rather, we’re asking ‘listen to all the phone calls and find the angry ones’. ‘Read all the emails and find the anxious ones’. ‘Look at a hundred thousand photos and find the cool (or at least weird) people’.

.. this is what automation always does:

  • Excel didn’t give us artificial accountants,
  • Photoshop and InDesign didn’t give us artificial graphic designers, and indeed
  • steam engines didn’t give us artificial horses. ..

Rather, we automated one discrete task, at massive scale.

.. Where this metaphor breaks down (as all metaphors do) is that in some fields, machine learning can find not just things we can already recognize, but things that humans can’t recognize, or levels of pattern, inference or implication that no ten year old (or 50 year old) would recognize.

.. This is best seen in DeepMind’s AlphaGo. AlphaGo doesn’t play Go the way the chess computers played chess – by analysing every possible tree of moves in sequence. Rather, it was given the rules and a board and left to try to work out strategies by itself, playing more games against itself than a human could do in many lifetimes. That is, this is not so much a thousand interns as one intern that’s very very fast, and you give your intern 10 million images and they come back and say ‘it’s a funny thing, but when I looked at the third million images, this pattern really started coming out’.

.. what fields are narrow enough that we can tell an ML system the rules (or give it a score), but deep enough that looking at all of the data, as no human could ever do, might bring out new results?

Ask HN: Which industries will be transformed by ML in 10 years?

The bad industries will be transformed before the good ones. What I mean by that is that computer vision applied to medical imaging would be huge. But the detection/classification isn’t accurate enough for that field, just yet. Yes, results are amazing on standard datasets such as ImageNet, but they fail to become equally good when there is orders of magnitude less data. And in the field, accuracy is very important: a net classifying cancer correctly 90% of the time is likely useless.

One exception is automated language translation, which is getting very good. I’m noticing that some of the articles I’m reading are machine translated. They appear to apply machine translation to English articles and then have some editor do manual touch-ups, which is seldom enough.

The “bad” industries such as spam and SEO can definitely benefit from ML as it exists today. There are ML algorithms (LSTM) that can generate faked web sites with images that, from Googlebot’s point of view, are completely indistinguishable from real sites. Another use would be to generate realistic-looking accounts on social media to steer the conversation, perhaps for political purposes. Porn, obviously, could also use ML due to the huge amount of data (the porn itself and user interactions) available.

 

.. I think it’s pretty safe to say finance will be a big one. Finance has a large number of individuals and firms researching the applications of ML methodologies to financial indicators. With the semi-recent rise of quant firms, I think this research is only going to get more aggressive, and HFT will become more lucrative and more automated as long as regulation does not get in the way.

 

.. HFT – yes. But longer-term investment (i.e. Buffett – or even with a horizon of a couple of years) is unlikely to be transformed soon – ML needs vast historical data, which is very slow to generate. Waiting 10 years only gives you 10 years of history, which is 5 non-overlapping 2-year forward returns, and maybe 1 or 2 economic/financial regimes.
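The sample-size arithmetic here is easy to check. A minimal sketch (the function name is made up for illustration) that counts how many independent forward-return windows fit into a given history:

```python
def non_overlapping_samples(history_years, horizon_years):
    """Count independent forward-return windows that fit into the history.

    Overlapping windows share most of their data, so only windows that do
    not overlap count as separate observations.
    """
    if horizon_years <= 0:
        raise ValueError("horizon_years must be positive")
    return history_years // horizon_years


if __name__ == "__main__":
    # 10 years of history with a 2-year investment horizon.
    print(non_overlapping_samples(10, 2))
```

With ten years of history and a two-year horizon this gives five data points, which is far too few to train or validate most ML models.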

This is also a problem with new datasets being generated – there is not nearly enough history available to test them or feed them to a ML system.

Furthermore, arguably, longer-term investment requires forward-looking modelling of scenarios, based on the kinds of inputs that were not seen in history. ML is not very applicable when you get big covariate shifts.

So I would say human financial analysts are not going anywhere, and any improvements would be relatively small and incremental.

.. HFT is not profitable. It’s completely commoditized.

China Could Sell Trump the Brooklyn Bridge

Xi has been brilliant at playing Trump, plying him with flattery and short-term trade concessions and deflecting him from the real structural trade imbalances with China. All along, Xi keeps his eye on the long-term prize of making China great again. Trump, meanwhile, touts every minor victory as historic and proceeds down any road that will give him a quick sugar high.

What world are we in? One in which we’re going through three “climate changes” at once.

  1. We’re going through a change in the actual climate: Destructive weather events and the degradation of ecosystems are steadily accelerating.
  2. We’re going through a change in the “climate” of globalization: from an interconnected world to an interdependent one; from a world of walls, where you build your wealth by hoarding resources, to a world of webs, where you thrive by connecting your citizens to the most flows of ideas, trade, innovation and education.
  3. And, finally, we’re going through a change in the “climate” of technology and work: Machines are acquiring all five senses, and with big data and artificial intelligence, every company can now analyze, optimize, prophesize, customize, digitize and automatize more and more jobs, products and services.

.. while China hails globalization, it imposes a 25 percent tariff on imported cars (while America imposes only 2.5 percent) and 50-50 joint ventures and technology transfers for big companies that want to gain access to China’s giant market. But China gets away with it.

.. plowing government funds and research into commercializing 10 strategic industries while creating regulations and swiping intellectual property from abroad to make them all grow faster. These industries include

  1. electric vehicles,
  2. new materials,
  3. artificial intelligence,
  4. integrated circuits,
  5. biopharmacy,
  6. quantum computing,
  7. 5G mobile communications, and
  8. robotics.

.. And Trump? On the change in the climate, he’s promoting coal over clean energy, like wind and solar, and has appointed climate-change deniers to all of his key environmental posts. While China is run by engineers, Trump doesn’t even have a science adviser.

.. “This will be wounding to one of America’s gems,” its institutions of higher education, Drew Faust, the president of Harvard, said to me. And it’s basically being done to cut taxes for the wealthy.

.. the Chinese are focused on the giant winds of change, and Trump is betting on his gut and a grab bag of tax cuts based on no take on the world, other than dubious trickle-down economics.

.. When you don’t know where you’re going, any tax cut will get you there, any replacement for Obamacare will get you there, any wall will get you there, any trade concession will get you there.

.. I’m certain our economic system is better than theirs — in theory.

But China, with its ability to focus, is getting 90 percent out of its inferior system, and it has brought China a long way fast. And we, with too little focus, are getting 50 percent out of our superior system.