3 pleasantly surprising PostgreSQL Indexing tricks

Partial Indexes

On one of the projects I worked on had the following code:

Message.where(sent: false).find_each(&:process)

It runs every hour or so to send enqueued messages. The <span style="color: #444444;">Message#process</span> method set <span style="color: #444444;">sent</span> to <span style="color: #444444;">true</span> after sending the message. Only new messages had <span style="color: #444444;">sent</span> set to <span style="color: #444444;">false</span>. The code above doesn’t care about sent messages at all.

The problem is that there was an index on <span style="color: #444444;">messages.sent</span>. It encompassed all rows. 99% of messages were marked as sent so 99% of entries in the index were unused. The app cared only about those 1% with <span style="color: #444444;">sent == false</span>.

PostgreSQL to the rescue! We can create an index on a subset of rows that satisfy a <span style="color: #444444;">WHERE</span> condition. In this case:

CREATE INDEX messages_sent_true ON messages(sent) WHERE NOT sent;

This command will index only those rows that aren’t marked as sent.

You can create partial indexes in vanilla Rails. Just run:

add_index :messages, :sent, where: "NOT sent"

When used right, partial indexes are smaller and faster than full-table indexes.