Cut Out the Middle Tier: Generating JSON Directly from Postgres

Easy JSON using row_to_json

The simplest JSON generator is row_to_json() which takes in a tuple value and returns the equivalent JSON dictionary.

SELECT row_to_json(employees)
FROM employees
WHERE employee_id = 1;

The resulting JSON uses the column names for keys, so you get a neat dictionary.

{
  "employee_id": 1,
  "department_id": 1,
  "name": "Paul",
  "start_date": "2018-09-02",
  "fingers": 10,
  "geom": {
    "type": "Point",
    "coordinates": [
      -123.329773,
      48.407326
    ]
  }
}

And look what happens to the geometry column! Because PostGIS includes a cast from geometry to JSON, the geometry column is automatically mapped into GeoJSON in the conversion. This is a useful trick with any custom type: define a cast to JSON and you automatically integrate with the native PostgreSQL JSON generators.

Full result sets using json_agg

Turning a single row into a dictionary is fine for basic record access, but queries frequently require multiple rows to be converted.

Fortunately, there’s an aggregate function for that, json_agg, which carries out the JSON conversion and converts the multiple results into a JSON list.

SELECT json_agg(e) 
FROM (
    SELECT employee_id, name 
    FROM employees
    WHERE department_id = 1
    ) e;

Note that in order to strip down the data in the record, we use a subquery to make a narrower input to json_agg.

[
  {
    "employee_id": 1,
    "name": "Paul"
  },
  {
    "employee_id": 2,
    "name": "Martin"
  }
]

The Guardians Switched from Mongo to Postgres

In April the Guardian switched off the Mongo DB cluster used to store our content after completing a migration to PostgreSQL on Amazon RDS. This post covers why and how

At the Guardian, the majority of content – including articles, live blogs, galleries and video content – is produced in our in-house CMS tool, Composer. This, until recently, was backed by a Mongo DB database running on AWS. This database is essentially the “source of truth” for all Guardian content that has been published online – approximately 2.3m content items. We’ve just completed our migration away from Mongo to Postgres SQL.

Add “Faux” Predicates: Postgres Optimization

Made by Developers and Non-Developers

The query fetches sales that were modified before 2019. There is no index on this field, so the optimizer generates an execution plan to scan the entire table.

Let’s say you have another field in this table with the time the sale was created. Since it’s not possible for a sale to be modified before it was created, adding a similar condition on the created field won’t change the result of the query. However, the optimizer might use this information to generate a better execution plan:

db=# (
  SELECT *
  FROM sale
  WHERE modified < '2019-01-01 asia/tel_aviv'
  AND   created < '2019-01-01 asia/tel_aviv';
);
                                           QUERY PLAN
--------------------------------------------------------------------------------------------------
Index Scan using sale_created_ix on sale (cost=0.44..4.52 rows=1 width=276)
  Index Cond: (created < '2019-01-01 00:00:00+02'::timestamp with time zone)
  Filter: (modified < '2019-01-01 00:00:00+02'::timestamp with time zone)

After we added the “Faux Predicate” the optimizer decided to use the index on the created field, and the query got much faster! Note that the previous predicate on the modified field is still being evaluated, but it’s now being applied on much fewer rows.

A “Faux Predicate” should not change the result of the query. It should only be used to provide more information to the optimizer that can improve the query performance. Keep in mind that the database has to evaluate all the predicates, so adding too many might make a query slower.

pgsql-http (Github)

HTTP client for PostgreSQL, retrieve a web page from inside the database.

SELECT content FROM http_get(‘http://httpbin.org/ip’);
content
—————————–
{“origin”:”24.69.186.43″} +
(1 row)

Functions
http_header(field VARCHAR, value VARCHAR) returns http_header
http(request http_request) returns http_response
http_get(uri VARCHAR) returns http_response
http_post(uri VARCHAR, content VARCHAR, content_type VARCHAR) returns http_response
http_put(uri VARCHAR, content VARCHAR, content_type VARCHAR) returns http_response
http_patch(uri VARCHAR, content VARCHAR, content_type VARCHAR) returns http_response
http_delete(uri VARCHAR) returns http_response
http_head(uri VARCHAR) returns http_response
http_set_curlopt(curlopt VARCHAR, value varchar) returns boolean
http_reset_curlopt() returns boolean
urlencode(string VARCHAR) returns text