#302 — APRIL 24, 2019 READ ON THE WEB
This issue’s Tip of the Week looks at SQL’s
<span style="color: #5a5a5a;">FILTER</span>clause. Scroll to the end of this issue to check it out..
.. snip ..
Tip of the Week
<span style="color: #5a5a5a;">FILTER</span>to turn entity/value tables into rows of entities
An entire article could be written on this topic, but I wanted to show off the most basic use of SQL:2003’s
<span style="color: #5a5a5a;">FILTER</span>clause that was added to Postgres 9.4.
Let’s say you have a table called
<span style="color: #5a5a5a;">props</span>that represents entities, attributes, and values with an integer
<span style="color: #5a5a5a;">id</span>column, textual attribute and value columns, and the following contents:
FILTERclause essentially adds an extra
<span style="color: #5a5a5a;">WHERE</span>clause to aggregate functions (such as
<span style="color: #5a5a5a;">MIN</span>,
<span style="color: #5a5a5a;">MAX</span>and
<span style="color: #5a5a5a;">SUM</span>) allowing you to scope them.
This comes in very handy for pulling out values from our
<span style="color: #5a5a5a;">val</span>column based upon the value of the attr column, therefore allowing us to turn a table of entities, attributes and values into a more classical set of columns.
MAX(val) FILTER(WHERE attr=’name’) AS name,
MAX(val) FILTER(WHERE attr=’age’) AS age,
MAX(val) FILTER(where attr=’city’) AS city
FROM props GROUP BY id;
You might need to reproduce this table and play with the query to get the feel of what’s going on, but essentially we are selecting each row grouped by the ID (i.e. the ID of the underlying entity) and then picking the value associated with each ID that matches certain attribute names, allowing us, in this case, to extract the
<span style="color: #5a5a5a;">name</span>,
<span style="color: #5a5a5a;">age</span>, and
<span style="color: #5a5a5a;">city</span>values.
<span style="color: #5a5a5a;">MAX</span>works fine as an aggregate function here as there is only one attr/val pair per entity.
<span style="color: #5a5a5a;">FILTER</span>clause has a lot more uses than this, but I felt this was both a pretty neat and perhaps unexpected example of its use.
You can learn more in this article and in the official Postgres documentation.
This week’s tip is sponsored by Hasura, creators of the high-performance GraphQL engine on new and existing Postgres databases. Check them out on GitHub.
Be careful with CTE in PostgreSQL
>A lesser known fact about CTE in PostgreSQL is that the database will evaluate the query inside the CTE and store the results.
PostgreSQL materialized the CTE, meaning, it created a temporary structure with the results of the query defined in the CTE, and only then applied the filter to it. Because the predicate was not applied on the table (but the CTE) PostgreSQL was unable to utilize the index on the ID column.
Unlike PostgreSQL, Oracle is not materializing CTEs by default and the two queries generate the same execution plan.
PostgreSQL’s Powerful New Join Type: LATERAL
PostgreSQL 9.3 has a new join type! Lateral joins arrived without a lot of fanfare, but they enable some powerful new queries that were previously only tractable with procedural code. In this post, I’ll walk through a conversion funnel analysis that wouldn’t be possible in PostgreSQL 9.2.
What is a LATERAL join?
The best description in the documentation comes at the bottom of the list of FROM clause options:
LATERALkey word can precede a sub-
SELECT FROMitem. This allows the sub-
SELECTto refer to columns of
FROMitems that appear before it in the
LATERAL, each sub-
SELECTis evaluated independently and so cannot cross-reference any other
LATERALcross-references, evaluation proceeds as follows: for each row of the
FROMitem providing the cross-referenced column(s), or set of rows of multiple
FROMitems providing the columns, the
LATERALitem is evaluated using that row or row set’s values of the columns. The resulting row(s) are joined as usual with the rows they were computed from. This is repeated for each row or set of rows from the column source table(s).
What are the options for storing hierarchical data in a relational database?
Generally speaking, you’re making a decision between fast read times (for example, nested set) or fast write times (adjacency list). Usually, you end up with a combination of the options below that best fit your needs. The following provides some in-depth reading:
- One more Nested Intervals vs. Adjacency List comparison: the best comparison of Adjacency List, Materialized Path, Nested Set and Nested Interval I’ve found.
- Models for hierarchical data: slides with good explanations of tradeoffs and example usage
- Representing hierarchies in MySQL: very good overview of Nested Set in particular
- Hierarchical data in RDBMSs: most comprehensive and well-organized set of links I’ve seen, but not much in the way of explanation