Static and dynamic pivots

How to do a dynamic pivot

So the difficulty of a dynamic pivot is: in an SQL query, the output columns must be determined before execution. But to know which columns are formed from transposing rows, we’d to need to execute the query first. To solve this chicken and egg problem, we need to loosen the constraints a bit.

Result in a nested structure inside a column

Firstly, a SQL query can return the pivoted part encapsulated inside a single column with a composite or array type, rather than as multiple columns. This type could be array[text], JSON, XML… This solution is used by Oracle with its PIVOT XML clause. That’s a one-step method, which is good, but the result has a non-tabular structure that does necessarily match the users expectations.

Here is an example using modern PostgreSQL with JSON:

<span style="color: #111111;"><span class="k">SELECT</span> <span class="n">city</span><span class="p">,</span></span>
<span style="color: #111111;">       <span class="n">json_object_agg</span><span class="p">(</span><span class="k">year</span><span class="p">,</span><span class="n">total</span> <span class="k">ORDER</span> <span class="k">BY</span> <span class="k">year</span><span class="p">)</span></span>
<span style="color: #111111;">   <span class="k">FROM</span> <span class="p">(</span></span>
<span style="color: #111111;">     <span class="k">SELECT</span> <span class="n">city</span><span class="p">,</span> <span class="k">year</span><span class="p">,</span> <span class="k">SUM</span><span class="p">(</span><span class="n">raindays</span><span class="p">)</span> <span class="k">AS</span> <span class="n">total</span></span>
<span style="color: #111111;">        <span class="k">FROM</span> <span class="n">rainfall</span></span>
<span style="color: #111111;">        <span class="k">GROUP</span> <span class="k">BY</span> <span class="n">city</span><span class="p">,</span><span class="k">year</span></span>
<span style="color: #111111;">   <span class="p">)</span> <span class="n">s</span></span>
<span style="color: #111111;">  <span class="k">GROUP</span> <span class="k">BY</span> <span class="n">city</span></span>
<span style="color: #111111;">  <span class="k">ORDER</span> <span class="k">BY</span> <span class="n">city</span><span class="p">;</span></span>

Without having to enumerate the years in the query, we get the same data as above, but as 2 columns, the first for the “vertical axis”, and the next one for all the rest in JSON format:

<span style="color: #111111;">   city    |                                    json_object_agg                                     </span>
<span style="color: #111111;">-----------+----------------------------------------------------------------------------------------</span>
<span style="color: #111111;"> Ajaccio   | { "2012" : 69, "2013" : 91, "2014" : 78, "2015" : 48, "2016" : 81, "2017" : 51 }</span>
<span style="color: #111111;"> Bordeaux  | { "2012" : 116, "2013" : 138, "2014" : 137, "2015" : 101, "2016" : 117, "2017" : 110 }</span>
<span style="color: #111111;"> Brest     | { "2012" : 178, "2013" : 161, "2014" : 180, "2015" : 160, "2016" : 165, "2017" : 144 }</span>
<span style="color: #111111;"> Dijon     | { "2012" : 114, "2013" : 124, "2014" : 116, "2015" : 93, "2016" : 116, "2017" : 103 }</span>
<span style="color: #111111;"> Lille     | { "2012" : 153, "2013" : 120, "2014" : 136, "2015" : 128, "2016" : 138, "2017" : 113 }</span>
<span style="color: #111111;"> Lyon      | { "2012" : 112, "2013" : 116, "2014" : 111, "2015" : 80, "2016" : 110, "2017" : 102 }</span>
<span style="color: #111111;"> Marseille | { "2012" : 47, "2013" : 63, "2014" : 68, "2015" : 53, "2016" : 54, "2017" : 43 }</span>
<span style="color: #111111;"> Metz      | { "2012" : 98, "2013" : 120, "2014" : 110, "2015" : 93, "2016" : 122, "2017" : 115 }</span>
<span style="color: #111111;"> Nantes    | { "2012" : 124, "2013" : 132, "2014" : 142, "2015" : 111, "2016" : 106, "2017" : 110 }</span>
<span style="color: #111111;"> Nice      | { "2012" : 53, "2013" : 77, "2014" : 78, "2015" : 50, "2016" : 52, "2017" : 43 }</span>
<span style="color: #111111;"> Paris     | { "2012" : 114, "2013" : 111, "2014" : 113, "2015" : 85, "2016" : 120, "2017" : 110 }</span>
<span style="color: #111111;"> Perpignan | { "2012" : 48, "2013" : 56, "2014" : 54, "2015" : 48, "2016" : 69, "2017" : 48 }</span>
<span style="color: #111111;"> Toulouse  | { "2012" : 86, "2013" : 116, "2014" : 111, "2015" : 83, "2016" : 102, "2017" : 89 }</span>
<span style="color: #111111;">(13 rows)
</span>

That somehow does the job, but visually the alignment is not very good, and if we want to copy-paste this result into a spreadsheet or import it as tabular data, it’s clear that it’s not going to work.

Getting a tabular result in two steps

The other solutions based on a SQL query are based on the idea of a two-step process:

  1. a first query build the result with all its columns, and returns an indirect reference to this result.
  2. a second query returns the result set, now that its structure is known by the SQL engine thanks to the previous step.