Why Kubernetes is winning the container war

The technology was so good that Urs Hölzle, then Google’s head of technical infrastructure, reacted with disbelief when a few Google engineers suggested building a version of Borg and open sourcing it:

So let me get this straight. You want to build an external version of the Borg task scheduler. One of our most important competitive advantages. The one we don’t even talk about externally. And, on top of that, you want to open source it?

Those engineers used Borg as a cluster management tool that powered the infrastructure behind Gmail, YouTube, Google Search, and other popular Google services.

.. Later it was built into the Google Compute Engine, but the engineers noticed that customers were spinning up CPUs with terrible utilization rates. A container management system was needed.

.. “We think over time that our deep, comprehensive support for containers on Google Cloud Platform will create a gravity well in the market for container based apps and that a significant percentage of them will end up with us.”

Google Wants Kubernetes To Rule The World

“What we are seeing is that for new applications, Google developers are looking at Borg or Kubernetes, and many of them are choosing Kubernetes,” says Sinha. “But I don’t think that it is practical to think that Gmail or search can move to Kubernetes.”

.. “We have many Borg clusters up and running, we have policies inside of Google that are all or nothing, so we can’t just upgrade one cluster to Kubernetes. Kubernetes is also missing hundreds and hundreds of features that Borg has – and whether they are good features or not is a good question, but these are things that Borg has and that people use.

We don’t want to adopt all of those features in Kubernetes. So to bring Kubernetes in instead of Borg is an incredible challenge. That may never happen, or it may be on a five to ten year track, or I can imagine a certain end game where internally Borg has a dozen big customers and everyone else uses Kubernetes on our cloud.”

.. “You can run Kubernetes on virtual machines, on bare metal, on any cloud, and that is the beauty of it. It gives you that choice. You don’t just have a choice of clouds. You have a choice of storage, networks, and schedulers and you can plug those in as well, and this is what makes Kubernetes more applicable to the enterprise because they can tailor it to their environment.”

.. “We are definitely shooting for dozens if not low hundreds of clusters in a federation, and each cluster could have from 2,000 to 5,000 nodes and up to 60,000 pods,” says Hockin. “If you take a dozen clusters in a dozen cloud regions times 5,000 nodes each, you have got quite a heap of machines.” (That’s 720,000 nodes if you want to be precise, and that is a lot of iron, even if a node is just a VM. At current densities of maybe 40 VMs per two-socket server, that is still 18,000 physical servers.)
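
The arithmetic in that parenthetical can be checked with a short sketch. It assumes the reading the article appears to use: a dozen clusters in each of a dozen regions, every cluster at the 5,000-node upper bound, and 40 VMs per physical server. The variable names are only illustrative.

```python
# Back-of-the-envelope check of the federation sizing quoted above.
# Assumption: "a dozen clusters in a dozen cloud regions" means
# 12 clusters in each of 12 regions, i.e. 144 clusters in total.
clusters_per_region = 12
cloud_regions = 12
nodes_per_cluster = 5_000   # upper end of the 2,000 to 5,000 range quoted
vms_per_server = 40         # density quoted for a two-socket server

total_clusters = clusters_per_region * cloud_regions
total_nodes = total_clusters * nodes_per_cluster
physical_servers = total_nodes // vms_per_server

print(f"clusters:         {total_clusters:,}")    # 144
print(f"nodes (VMs):      {total_nodes:,}")       # 720,000
print(f"physical servers: {physical_servers:,}")  # 18,000
```

At the lower bound of 2,000 nodes per cluster the same calculation gives 288,000 nodes, so the quoted figures are a ceiling rather than a typical size.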

How Google is Challenging AWS

If Amazon wanted to stimulate creativity among its developers, it shouldn’t try to guess what kind of services they might want; such guesses would be based on patterns of the past. Instead, it should be creating primitives — the building blocks of computing — and then getting out of the way.


Google, meanwhile, has never really been a platform company; in fact, while Google is often cast as Apple’s opposite — the latter is called a product company, and the former a services one — that only makes sense if you presume that only hardware can be a product. A more expansive definition of “product” — a fully realized solution presented to end users — would show the two companies are in fact quite similar.

.. this is the exact opposite of the model employed by not just Amazon but also Microsoft, the pre-eminent platform company of the IT era: instead of integrating pieces to deliver a product, AWS went in the opposite direction, breaking down all of the pieces that go into building back-end services into fully modular parts; Microsoft did the same with its Win32 API. Yes, this meant that Windows was by design a worse platform in terms of the end user experience than, say, Mac OS, but it was far more powerful and extensible, an approach that paid off with millions of line-of-business apps that even today keep Windows at the center of business. AWS has done the exact same thing for back-end services, and the flexibility and modularity of AWS is the chief reason why it crushed Google’s initial cloud offering, Google App Engine, which launched back in 2008. Using App Engine entailed accepting a lot of decisions that Google made on your behalf; AWS let you build exactly what you needed.

.. Where Kubernetes differs from Borg is that it is fully portable: it runs on AWS, it runs on Azure, it runs on the Google Cloud Platform, it runs on on-premise infrastructure, you can even run it in your house.

.. the potential impact of Kubernetes specifically and container-based development broadly is to make irrelevant which infrastructure provider you use. No wonder it is one of the fastest growing open-source projects of all time: there is no lock-in.

.. its reliance on links instead of simply page content — meant that as the web got bigger Google, unlike its competitors, got better.

.. when you can access any service, whether that be news or car-sharing or hotels or video or search etc., the one that is the best will not only win initially but will see its advantages compound.

.. Kubernetes was Google’s attempt to effectively build a browser on top of cloud infrastructure and thus decrease switching costs.

.. superior machine learning offerings can not only be a differentiator but a sustainable one: being better will attract more customers and thus more data, and data is the fuel by which machine learning improvement comes about. And it is because of data that Google is AWS’ biggest threat in the cloud.

.. in “TensorFlow and Monetizing Intellectual Property”, Google’s willingness to share its approach was an implicit admission that its superior data and processing infrastructure was a sustainable advantage.

.. the creation of the Google Cloud Machine Learning group .. they are tasked with productizing Google’s machine learning capabilities.

.. it’s often easier to change the rules of competition than to change your fundamental nature as a company.

.. a new business model — sales versus ads — and build up the sort of organization that is necessary for not just sales but also enterprise support.

.. Microsoft is likely to prove particularly formidable in this regard: not only has the company engaged in years of research, but the company also has experience productizing technology for business specifically; Google’s longstanding consumer focus may at times be a handicap. And as popular as Kubernetes may be broadly, it’s concerning that Google is not yet eating its own dog food.