Cut Out the Middle Tier: Generating JSON Directly from Postgres

Easy JSON using row_to_json

The simplest JSON generator is row_to_json(), which takes in a tuple value and returns the equivalent JSON dictionary.

SELECT row_to_json(employees)
FROM employees
WHERE employee_id = 1;

The resulting JSON uses the column names for keys, so you get a neat dictionary.

{
  "employee_id": 1,
  "department_id": 1,
  "name": "Paul",
  "start_date": "2018-09-02",
  "fingers": 10,
  "geom": {
    "type": "Point",
    "coordinates": [
      -123.329773,
      48.407326
    ]
  }
}

And look what happens to the geometry column! Because PostGIS includes a cast from geometry to JSON, the geometry column is automatically mapped into GeoJSON in the conversion. This is a useful trick with any custom type: define a cast to JSON and you automatically integrate with the native PostgreSQL JSON generators.
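As a sketch of that trick, registering a cast to json for a hypothetical composite type might look like this (the type, function, and field names here are illustrative, not from the original post):

```sql
-- Hypothetical composite type
CREATE TYPE complex AS (r double precision, i double precision);

-- Conversion function: decide what JSON shape the type should produce
CREATE FUNCTION complex_to_json(z complex) RETURNS json AS $$
  SELECT json_build_object('real', z.r, 'imaginary', z.i);
$$ LANGUAGE sql IMMUTABLE;

-- Register the cast; row_to_json() will now use it for complex columns
CREATE CAST (complex AS json) WITH FUNCTION complex_to_json(complex);
```

With the cast in place, any row containing a complex column serializes to the custom JSON shape automatically, just as PostGIS geometries serialize to GeoJSON.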

Full result sets using json_agg

Turning a single row into a dictionary is fine for basic record access, but queries frequently require multiple rows to be converted.

Fortunately, there’s an aggregate function for that: json_agg, which converts each row to JSON and gathers the results into a JSON list.

SELECT json_agg(e) 
FROM (
    SELECT employee_id, name 
    FROM employees
    WHERE department_id = 1
    ) e;

Note that in order to strip down the data in the record, we use a subquery to make a narrower input to json_agg.

[
  {
    "employee_id": 1,
    "name": "Paul"
  },
  {
    "employee_id": 2,
    "name": "Martin"
  }
]
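The generator functions also compose, so nested documents can be built in a single query. A sketch, assuming a hypothetical departments table joined to the employees table above:

```sql
-- One JSON document per department, with its employees nested as a list
SELECT json_build_object(
         'department', d.name,
         'employees',  json_agg(json_build_object(
                         'employee_id', e.employee_id,
                         'name',        e.name))
       )
FROM departments d
JOIN employees e USING (department_id)
GROUP BY d.department_id, d.name;
```

json_build_object pairs up keys and values, and json_agg folds the per-employee objects into one array per group.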

The Guardian Switched from Mongo to Postgres

In April the Guardian switched off the MongoDB cluster used to store our content, after completing a migration to PostgreSQL on Amazon RDS. This post covers why and how.

At the Guardian, the majority of content – including articles, live blogs, galleries and video content – is produced in our in-house CMS tool, Composer. This, until recently, was backed by a MongoDB database running on AWS. This database is essentially the “source of truth” for all Guardian content that has been published online – approximately 2.3m content items. We’ve just completed our migration away from MongoDB to PostgreSQL.

Merging many JSON files into one

jq solution:

jq -s '{ attributes: map(.attributes[0]) }' file*.json
  • -s (--slurp) – instead of running the filter for each JSON object in the input, read the entire input stream into a large array and run the filter just once.
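For context, input files with the structure this command assumes (the contents here are hypothetical) might look like:

file1.json:

```json
{ "attributes": [ { "name": "Node", "value": "test" } ] }
```

file2.json:

```json
{ "attributes": [ { "name": "version", "value": "11.1" } ] }
```

With --slurp, the filter sees a single array of these documents, and map(.attributes[0]) pulls the first attribute out of each.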

Sample output:

{
  "attributes": [
    {
      "name": "Node",
      "value": "test"
    },
    {
      "name": "version",
      "value": "11.1"
    }
  ]
}

How can I import a JSON file into PostgreSQL?

You can feed the JSON into a SQL statement that extracts the information and inserts it into the table. If the JSON attributes have exactly the same names as the table columns, you can do something like this:

with customer_json (doc) as (
   values 
    ('[
      {
        "id": 23635,
        "name": "Jerry Green",
        "comment": "Imported from facebook."
      },
      {
        "id": 23636,
        "name": "John Wayne",
        "comment": "Imported from facebook."
      }
    ]'::json)
)
insert into customer (id, name, comment)
select p.*
from customer_json l
  cross join lateral json_populate_recordset(null::customer, doc) as p
on conflict (id) do update 
  set name = excluded.name, 
      comment = excluded.comment;

New customers will be inserted, existing ones will be updated. The “magic” part is json_populate_recordset(null::customer, doc), which generates a relational representation of the JSON objects.
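To run the same statement against an actual file rather than an inline literal, one option (a sketch; the file name customers.json is hypothetical) is to read the file into a psql variable first:

```sql
-- In psql: read the file via a shell command into a variable,
-- then pass it to the query as a quoted literal
\set content `cat customers.json`
insert into customer (id, name, comment)
select p.*
from json_populate_recordset(null::customer, :'content'::json) as p
on conflict (id) do update
  set name = excluded.name,
      comment = excluded.comment;
```

This keeps the upsert logic identical; only the source of the JSON document changes.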