PostgreSQL, Aggregates and Histograms

Let’s have a look at our dataset from the NBA games and statistics, and get back to counting rebounds in the drb field. A preliminary query informs us that we have stats ranging from 10 to 54 rebounds per team in a single game, a good information we can use in the following query:

  select width_bucket(drb, 10, 54, 9) as buckets,
         count(*)
    from team_stats
group by buckets
order by buckets;

We asked for 9 separations so we have 10 groups as a result:

 width_bucket | count 
--------------+-------
            1 |    52
            2 |  1363
            3 |  8832
            4 | 20917
            5 | 20681
            6 |  9166
            7 |  2093
            8 |   247
            9 |    20
           10 |     1
(10 rows)

Cool Stuff in PostgreSQL 10: Auto-logging

We started off by creating a logging infrastructure, then arranging for a single table to use it.

Rather than repeat that work for each table, let’s use a relatively although not completely new feature: EVENT TRIGGER. The idea here is that we fire a trigger on CREATE TABLE and see to it that the table is logged. We’ll write the trigger first, even though in reality, we’d need to load the function it calls first.

  1. CREATE EVENT TRIGGER add_logger
  2. ON ddl_command_end
  3. WHEN tag IN ('create table')
  4. EXECUTE PROCEDURE add_logger();
  5. COMMENT ON EVENT TRIGGER add_logger IS 'Ensure that each table which is not a log gets logged';

The magic happens inside add_logger(), but it’s magic we’ve already seen. First, we’ll get the table’s name and schema using pg_event_trigger_ddl_commands(), filtering out tables which are already log tables. The test here is crude and string-based, but we could easily go to schema-based ones.


						

PostgreSQL’s LATERAL JOIN

SELECT user_id, first_order_time, next_order_time, id FROM
(SELECT user_id, min(created_at) AS first_order_time FROM orders GROUP BY user_id) o1
LEFT JOIN LATERAL
(SELECT id, created_at AS next_order_time
FROM orders
WHERE user_id = o1.user_id AND created_at > o1.first_order_time
ORDER BY created_at ASC LIMIT 1)
o2 ON true;