SQL: Using Filter to turn EAV (entity/attribute/value) rows into rows of entities

#302 — APRIL 24, 2019 READ ON THE WEB
👋 This issue’s Tip of the Week looks at SQL’s <span style="color: #5a5a5a;">FILTER</span> clause. Scroll to the end of this issue to check it out.. 👀
Postgres Weekly
.. snip ..
supported by Hasura

💡 Tip of the Week

Using <span style="color: #5a5a5a;">FILTER</span> to turn entity/value tables into rows of entities

An entire article could be written on this topic, but I wanted to show off the most basic use of SQL:2003’s <span style="color: #5a5a5a;">FILTER</span> clause that was added to Postgres 9.4.

Let’s say you have a table called <span style="color: #5a5a5a;">props</span> that represents entities, attributes, and values with an integer <span style="color: #5a5a5a;">id</span> column, textual attribute and value columns, and the following contents:

The FILTER clause essentially adds an extra <span style="color: #5a5a5a;">WHERE</span> clause to aggregate functions (such as <span style="color: #5a5a5a;">MIN</span><span style="color: #5a5a5a;">MAX</span> and <span style="color: #5a5a5a;">SUM</span>) allowing you to scope them.

This comes in very handy for pulling out values from our <span style="color: #5a5a5a;">val</span> column based upon the value of the attr column, therefore allowing us to turn a table of entities, attributes and values into a more classical set of columns.

MAX(val) FILTER(WHERE attr=’name’) AS name,
MAX(val) FILTER(WHERE attr=’age’) AS age,
MAX(val) FILTER(where attr=’city’) AS city
FROM props GROUP BY id;
You might need to reproduce this table and play with the query to get the feel of what’s going on, but essentially we are selecting each row grouped by the ID (i.e. the ID of the underlying entity) and then picking the value associated with each ID that matches certain attribute names, allowing us, in this case, to extract the <span style="color: #5a5a5a;">name</span><span style="color: #5a5a5a;">age</span>, and <span style="color: #5a5a5a;">city</span> values. <span style="color: #5a5a5a;">MAX</span> works fine as an aggregate function here as there is only one attr/val pair per entity.

The <span style="color: #5a5a5a;">FILTER</span> clause has a lot more uses than this, but I felt this was both a pretty neat and perhaps unexpected example of its use.

You can learn more in this article and in the official Postgres documentation.

This week’s tip is sponsored by Hasura, creators of the high-performance GraphQL engine on new and existing Postgres databases. Check them out on GitHub.

Be careful with CTE in PostgreSQL

>A lesser known fact about CTE in PostgreSQL is that the database will evaluate the query inside the CTE and store the results.

PostgreSQL materialized the CTE, meaning, it created a temporary structure with the results of the query defined in the CTE, and only then applied the filter to it. Because the predicate was not applied on the table (but the CTE) PostgreSQL was unable to utilize the index on the ID column.

Unlike PostgreSQL, Oracle is not materializing CTEs by default and the two queries generate the same execution plan.

PostgreSQL’s Powerful New Join Type: LATERAL

PostgreSQL 9.3 has a new join type! Lateral joins arrived without a lot of fanfare, but they enable some powerful new queries that were previously only tractable with procedural code. In this post, I’ll walk through a conversion funnel analysis that wouldn’t be possible in PostgreSQL 9.2.

What is a LATERAL join?

The best description in the documentation comes at the bottom of the list of FROM clause options:

The LATERAL key word can precede a sub-SELECT FROM item. This allows the sub-SELECT to refer to columns of FROM items that appear before it in the FROM list. (Without LATERAL, each sub-SELECT is evaluated independently and so cannot cross-reference any other FROMitem.)

When a FROM item contains LATERAL cross-references, evaluation proceeds as follows: for each row of the FROM item providing the cross-referenced column(s), or set of rows of multiple FROM items providing the columns, the LATERAL item is evaluated using that row or row set’s values of the columns. The resulting row(s) are joined as usual with the rows they were computed from. This is repeated for each row or set of rows from the column source table(s).


What are the options for storing hierarchical data in a relational database?

Generally speaking, you’re making a decision between fast read times (for example, nested set) or fast write times (adjacency list). Usually, you end up with a combination of the options below that best fit your needs. The following provides some in-depth reading: