Beating Uber with a PostgreSQL prototype

The other day I got a link to an interesting post published by Uber, which has caught our attention here at Cybertec: https://eng.uber.com/go-geofence
The idea behind geo-fencing is to provide information about an area to users. Somebody might want to find a taxi near a certain location or somebody might simply want to order a Pizza from a nearby restaurant.

According to the blog post Uber has solved this problem using some hand-made GO code. Uber’s implementation: 95% <5ms.
Another Blog also had an eye on it: https://medium.com/@buckhx/unwinding-uber-s-most-efficient-service-406413c5871d#.vdsg0fhoi

Of course, to a team of PostgreSQL professionals, 5 ms is quite a lot so we tried to do better than that.

PostgreSQL Scalability: Towards Millions TPS

PostgreSQL scalability on multicore and multisocket machines became a subject of optimization long time ago once such machines became widely used. This blog post shows brief history of vertical scalability improvements between versions 8.0 and 8.4. PostgreSQL 9.2 had very noticeable scalability improvement. Thanks to fast path locking and other optimizations it becomes possible to achieve more than 350 000 TPS in select-only pgbench test. The latest stable release PostgreSQL 9.5 also contain significant scalability advancements including LWLock improvement which allows achieving about 400 000 TPS in select-only pgbench test.

 

.. It’s HP ProLiant DL580 Gen9 with 3 TB of RAM.

.. It was 100% CPU on each core. Waiting on bus is still reported as CPU busy. Thus, we can’t say it’s not memory bandwidth limited.
I expect slightly higher numbers with clients on network, because pgbench consumes some resources itself. But those numbers could be achieved with higher number of clients due to network latency.

Proprietary Postgres Forks: Keeping Code Forks Current is Hard

If we look at the dates listed in the Wiki, Greenplum is listed as starting in 2005. They’re not kidding, and unfortunately it seems Pivotal executed a “fork it and forget it” maneuver. The documentation admits Greenplum is based on Postgres 8.2, with elements of functionality from 8.3.

Like Amazon’s Redshift, this immediately disqualifies Greenplum from consideration for anyone using a newer version. Our own databases are on 9.4 pending an upgrade plan; there’s no way we could justify such a massive downgrade, even for horizontal scaling improvements. EnterpriseDB had a similar problem when they started selling their version of 8.3; they were far behind for years before they managed to reduce their version lag by only a few months. Greenplum never even bothered.

This may be an amazing product, but we can’t use it to replace existing Postgres 9.4 databases that need scaling. Will Greenplum catch up now that it’s been open-sourced? I can’t say. It would definitely be cool, but I’m not holding my breath. Incidentally, this is one of the reasons all of those projects on the Wiki have definitive end dates. Keeping up with Postgres after forking is extremely difficult if you don’t merge your enhancements back into core. It’s all too easy to fall hopelessly behind and become nothing but an academic concern.