#302 — APRIL 24, 2019 READ ON THE WEB
This issue’s Tip of the Week looks at SQL’s
FILTERclause. Scroll to the end of this issue to check it out..
.. snip ..
Tip of the Week
FILTERto turn entity/value tables into rows of entities
An entire article could be written on this topic, but I wanted to show off the most basic use of SQL:2003’s
FILTERclause that was added to Postgres 9.4.
Let’s say you have a table called
propsthat represents entities, attributes, and values with an integer
idcolumn, textual attribute and value columns, and the following contents:
FILTERclause essentially adds an extra
WHEREclause to aggregate functions (such as
SUM) allowing you to scope them.
This comes in very handy for pulling out values from our
valcolumn based upon the value of the attr column, therefore allowing us to turn a table of entities, attributes and values into a more classical set of columns.
MAX(val) FILTER(WHERE attr=’name’) AS name,
MAX(val) FILTER(WHERE attr=’age’) AS age,
MAX(val) FILTER(where attr=’city’) AS city
FROM props GROUP BY id;
You might need to reproduce this table and play with the query to get the feel of what’s going on, but essentially we are selecting each row grouped by the ID (i.e. the ID of the underlying entity) and then picking the value associated with each ID that matches certain attribute names, allowing us, in this case, to extract the
MAXworks fine as an aggregate function here as there is only one attr/val pair per entity.
FILTERclause has a lot more uses than this, but I felt this was both a pretty neat and perhaps unexpected example of its use.
The entity attribue values or EAV is a design that tries to accommodate with a lack of specifications. In our application, we have to deal with parameters and new parameters may be added at each release. It’s not clear which parameters we need, we just want a place to manage them easily, and we are already using a database server after all. So there we go:
.. The model makes it very easy to add things to it, and very difficult to make sense of the accumulated data, or to use them effectively in SQL, making it an anti-pattern.
Multiple Values per Column
create table tweet
id bigint primary key,
Data would then be added with a semicolon separator, for instance, or maybe a pipe | char, or in some cases with a fancy Unicode separator char such as §, ¶ or ¦. Here we find a classic semicolon:
id │ date │ message │ tags
720553530088669185 │ … │ … │ #NY17
720553531665682434 │ … │ … │ #Endomondo;#endorphins
Several things are very hard to do when you have several tags hidden in a text column using a separator:
To implement searching for a list of messages containing a single given tag, this model forces a substring search which is much less efficient than direct search.