Related Pins at Pinterest: The Evolution of a Real-World Recommender System

Related Pins is the Web-scale recommender system that powers over 40% of user engagement on Pinterest. This paper is a longitudinal study of three years of its development, exploring the evolution of the system and its components from prototypes to present state. Each component was originally built with many constraints on engineering effort and computational resources, so we prioritized the simplest and highest-leverage solutions. We show how organic growth led to a complex system and how we managed this complexity. Many challenges arose while building this system, such as avoiding feedback loops, evaluating performance, activating content, and eliminating legacy heuristics. Finally, we offer suggestions for tackling these challenges when engineering Web-scale recommender systems.

A new approach to an old-school API

EDI was developed in the 60s, originally for mainframe-to-mainframe communication. Wal-Mart picked up the format in the 80s and became one of the leaders in popularizing it, automating transactions with its suppliers and giving those suppliers data feeds that showed sales velocity so that they could do upstream planning.

.. Since data was extremely expensive back then (Wal-Mart was one of the first companies to have its own satellite system for sharing this sort of data, which was expensive to do), EDI is optimized for what’s known in the EDI industry as kilocharacters (a kilocharacter is 1000 characters, which is roughly a kilobyte).

.. They tried to keep the kilocharacter count as low as possible. The result is a complex, hierarchical flat file. It’s very odd, but it’s all designed this way in order to minimize data usage.
Then, when Amazon came about in the 90s and early 2000s, they wanted to do business with all the companies that were supplying Wal-Mart. The fastest way to onboard those suppliers was to adopt the same standard that Wal-Mart was using. Fast forward to today, everybody uses EDI.
.. EDI is a permissionless framework. As long as people can do the EDI handshake and create valid response documents in the EDI format, they generally don’t need to get permission from retailers. If someone wants to integrate via EDI with Amazon, for example, Amazon doesn’t have to approve the company that is handling the EDI integration.
.. 9 out of 10 of our beta signups are current customers of our competitors.
.. For us, using something like AWS Lambda, where we get a million requests for 20 cents, gives us a few advantages over our competitors. You can imagine what the weeks leading up to Black Friday and Cyber Monday are like for a legacy EDI company. Serverless gives us the ability to scale massively for peak demand without paying for any unused capacity.
.. The nice thing about using AWS Lambda is that if we wanted to, we could have one function in Python, one function in Java, and one function in something else. It’s easy to break everything apart.
.. There’s a famous saying, “No restaurant ever went out of business for being too small.”
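
To make the pay-per-request point from the AWS Lambda excerpt above concrete, here is a rough back-of-the-envelope sketch. It uses only the figure quoted in the excerpt (a million requests for 20 cents). The monthly request volumes are hypothetical, and the sketch assumes request charges only, ignoring any compute-duration or data-transfer charges.

```python
# Figure quoted in the excerpt: one million requests cost 20 cents.
PRICE_PER_MILLION_REQUESTS_USD = 0.20


def request_cost_usd(requests: int) -> float:
    """Request charges only; duration and transfer charges are ignored (assumption)."""
    return requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS_USD


# Hypothetical volumes, invented for illustration: a quiet month and a peak month.
QUIET_MONTH_REQUESTS = 2_000_000
PEAK_MONTH_REQUESTS = 50_000_000

quiet_cost = request_cost_usd(QUIET_MONTH_REQUESTS)
peak_cost = request_cost_usd(PEAK_MONTH_REQUESTS)

print(f"quiet month: ${quiet_cost:.2f}")
print(f"peak month:  ${peak_cost:.2f}")
```

Under these assumptions the bill tracks traffic in both directions: the peak month costs more, and the quiet month does not carry the cost of capacity provisioned for the peak.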

How to set up world-class continuous deployment using free hosted tools

I’m going to describe a way to put together a world-class continuous deployment infrastructure for your side-project without spending any money.

With continuous deployment, every code commit is tested against an automated test suite. If the tests pass, it gets deployed directly to the production environment! How’s that for an incentive to write comprehensive tests?
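
The gate described above (deploy only when the tests pass) can be sketched in a few lines. This is a minimal illustration, not the setup the article goes on to describe: the function name `deploy_if_tests_pass` and both commands are hypothetical placeholders, and a hosted CI service would normally provide this logic for you.

```python
import subprocess
from typing import Sequence


def deploy_if_tests_pass(test_cmd: Sequence[str], deploy_cmd: Sequence[str]) -> bool:
    """Run the test command; run the deploy command only if the tests exit with 0."""
    tests = subprocess.run(test_cmd)
    if tests.returncode != 0:
        print("Tests failed; skipping deploy.")
        return False
    # check=True raises CalledProcessError if the deploy command itself fails.
    subprocess.run(deploy_cmd, check=True)
    print("Tests passed; deploy command finished.")
    return True


# Example call with placeholder commands (both are assumptions, adjust to your project):
# deploy_if_tests_pass(["python", "-m", "pytest"], ["./deploy.sh"])
```

The important design choice is that a non-zero exit code from the test suite stops the pipeline, so the quality of the tests directly determines what reaches production.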

Each of the tools I’m using offers a free tier which is easily enough to handle most side-projects. And once you outgrow those free plans, you can solve those limitations in exchange for money!

You fired your top talent. I hope you’re happy

The factory I used to work at routinely fires competent, productive engineers because they need scapegoats for management failures. The bigger the failure, the bigger the scapegoat, so sometimes it goes all the way up to director level, but that’s only once every few years. Realizing that all it would take to lose my livelihood would be for me to be in the wrong place at the wrong time was one of the big reasons I left.