SQL: Using Filter to turn EAV (entity/attribute/value) rows into rows of entities

#302 — APRIL 24, 2019 READ ON THE WEB
👋 This issue’s Tip of the Week looks at SQL’s <span style="color: #5a5a5a;">FILTER</span> clause. Scroll to the end of this issue to check it out.. 👀
Postgres Weekly
.. snip ..
supported by Hasura

💡 Tip of the Week

Using <span style="color: #5a5a5a;">FILTER</span> to turn entity/value tables into rows of entities

An entire article could be written on this topic, but I wanted to show off the most basic use of SQL:2003’s <span style="color: #5a5a5a;">FILTER</span> clause that was added to Postgres 9.4.

Let’s say you have a table called <span style="color: #5a5a5a;">props</span> that represents entities, attributes, and values with an integer <span style="color: #5a5a5a;">id</span> column, textual attribute and value columns, and the following contents:

The FILTER clause essentially adds an extra <span style="color: #5a5a5a;">WHERE</span> clause to aggregate functions (such as <span style="color: #5a5a5a;">MIN</span><span style="color: #5a5a5a;">MAX</span> and <span style="color: #5a5a5a;">SUM</span>) allowing you to scope them.

This comes in very handy for pulling out values from our <span style="color: #5a5a5a;">val</span> column based upon the value of the attr column, therefore allowing us to turn a table of entities, attributes and values into a more classical set of columns.

MAX(val) FILTER(WHERE attr=’name’) AS name,
MAX(val) FILTER(WHERE attr=’age’) AS age,
MAX(val) FILTER(where attr=’city’) AS city
FROM props GROUP BY id;
You might need to reproduce this table and play with the query to get the feel of what’s going on, but essentially we are selecting each row grouped by the ID (i.e. the ID of the underlying entity) and then picking the value associated with each ID that matches certain attribute names, allowing us, in this case, to extract the <span style="color: #5a5a5a;">name</span><span style="color: #5a5a5a;">age</span>, and <span style="color: #5a5a5a;">city</span> values. <span style="color: #5a5a5a;">MAX</span> works fine as an aggregate function here as there is only one attr/val pair per entity.

The <span style="color: #5a5a5a;">FILTER</span> clause has a lot more uses than this, but I felt this was both a pretty neat and perhaps unexpected example of its use.

You can learn more in this article and in the official Postgres documentation.

This week’s tip is sponsored by Hasura, creators of the high-performance GraphQL engine on new and existing Postgres databases. Check them out on GitHub.

Database Modelization Anti-Patterns

The entity attribue values or EAV is a design that tries to accommodate with a lack of specifications. In our application, we have to deal with parameters and new parameters may be added at each release. It’s not clear which parameters we need, we just want a place to manage them easily, and we are already using a database server after all. So there we go:

.. The model makes it very easy to add things to it, and very difficult to make sense of the accumulated data, or to use them effectively in SQL, making it an anti-pattern. 

Multiple Values per Column

create table tweet
id bigint primary key,
date timestamptz,
message text,
tags text
Data would then be added with a semicolon separator, for instance, or maybe a pipe | char, or in some cases with a fancy Unicode separator char such as §, ¶ or ¦. Here we find a classic semicolon:
id │ date │ message │ tags
720553530088669185 │ … │ … │ #NY17
720553531665682434 │ … │ … │ #Endomondo;#endorphins
(2 rows)

Several things are very hard to do when you have several tags hidden in a text column using a separator:
Tag Search
To implement searching for a list of messages containing a single given tag, this model forces a substring search which is much less efficient than direct search.