Have Fun with Machine Learning: A Guide for Beginners

This is a hands-on guide to machine learning for programmers with no background in AI. Using a neural network doesn’t require a PhD, and you don’t need to be the person who makes the next breakthrough in AI in order to use what exists today. What we have now is already breathtaking, and highly usable. I believe that more of us need to play with this stuff like we would any other open source technology, instead of treating it like a research topic.

In this guide our goal will be to write a program that uses machine learning to predict, with a high degree of certainty, whether the images in data/untrained-samples are of dolphins or seahorses using only the images themselves, and without having seen them before. Here are two example images we’ll use:

Top-down learning path: Machine Learning for Software Engineers

This is my multi-month study plan for going from mobile developer (self-taught, no CS degree) to machine learning engineer.

My main goal was to find an approach to studying machine learning that is mainly hands-on and abstracts away most of the math for the beginner. This approach is unconventional because it is a top-down, results-first approach designed for software engineers.

Challenges and Opportunities Confront the Data-Driven Business

Most companies capture a small fraction of their data’s value

It’s often been said that truly transformative innovations are overhyped in the short term but underhyped in the long term. Think of electricity and automobiles, more recently the internet, and now big data.

When first developed in the late 19th century, electricity was mostly used to replace kerosene lamps and candles with light bulbs. It took several decades for electric appliances, the assembly line and mass production to emerge and help create whole new industries. Similarly, the full impact of automobiles was not felt until the mid-20th century with the rise of suburbs, the Interstate Highway system, and the motels, restaurants and gas stations that sprang up all around them.

 .. In 2000, only one-quarter of the world’s stored information was digital and thus subject to search and analysis. Since then, the amount of digital data has been doubling roughly every three years. By now only a small fraction of all stored information, around 1% or so, is not digital. This could not possibly have happened without the digital revolution
..

  • Micro-segmenting a population based on individuals’ characteristics as revealed by data and analytics;

Introduction to K-means Clustering

This introduction to the K-means clustering algorithm covers:

  • Common business cases where K-means is used
  • The steps involved in running the algorithm
  • A Python example using delivery fleet data
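The steps themselves are not spelled out in this excerpt. As a rough sketch, the standard K-means procedure (Lloyd's algorithm) alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its assigned points, until the centroids stop changing. The function name `kmeans`, the random initialisation scheme, and the synthetic `X_demo` data below are illustrative assumptions, not taken from the original article.

```python
import numpy as np


def kmeans(X, k, n_iters=100, seed=0):
    """Minimal Lloyd's algorithm sketch; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids by picking k distinct data points at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: each point goes to its nearest centroid.
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Update step: move each centroid to the mean of its points
        # (an empty cluster keeps its previous centroid).
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels


# Two well-separated synthetic blobs (made-up data for illustration only).
rng = np.random.default_rng(42)
X_demo = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2)),
    rng.normal(loc=[10.0, 10.0], scale=0.5, size=(50, 2)),
])
centroids, labels = kmeans(X_demo, k=2)
print(centroids)
```

In practice a library implementation such as scikit-learn's `KMeans` is usually preferable, since it handles initialisation and repeated restarts more carefully than this sketch does.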

 

import numpy as np
from sklearn.cluster import KMeans

### For the purposes of this example, we store feature data from our
### dataframe `df`, in the `f1` and `f2` arrays. We combine this into
### a feature matrix `X` before entering it into the algorithm.
### (`df` is assumed to be an existing pandas DataFrame.)
f1 = df['Speeding Feature'].values  # `.values` is an attribute, not a method
f2 = df['Distance Feature'].values

### Stack the two feature arrays as columns: one row per driver.
X = np.column_stack((f1, f2))
kmeans = KMeans(n_clusters=2).fit(X)
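The snippet above assumes a DataFrame `df` already exists. As a self-contained sketch of the same idea, the example below builds a small synthetic stand-in for the delivery fleet data (the numbers are made up and are not the article's dataset), fits `KMeans`, and then inspects the fitted model through its `cluster_centers_` and `labels_` attributes and its `predict` method. The `cluster` column name and the `new_driver` row are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Synthetic stand-in for the fleet data: two loosely defined driver groups.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Distance Feature": np.concatenate([rng.normal(50, 10, 100),
                                        rng.normal(180, 15, 100)]),
    "Speeding Feature": np.concatenate([rng.normal(5, 2, 100),
                                        rng.normal(20, 5, 100)]),
})

# Feature matrix: one row per driver, one column per feature.
X = df[["Speeding Feature", "Distance Feature"]].to_numpy()

# n_init and random_state are set explicitly so the run is reproducible.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Attach each driver's cluster label and look at the result.
df["cluster"] = kmeans.labels_
print(kmeans.cluster_centers_)        # one (speeding, distance) centre per cluster
print(df.groupby("cluster").size())   # number of drivers in each cluster

# Assign a previously unseen (hypothetical) driver to the nearest cluster.
new_driver = np.array([[8.0, 60.0]])
predicted = kmeans.predict(new_driver)
print(predicted)
```

Cluster numbering is arbitrary, so the label values themselves carry no meaning; only the grouping does.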