3 pleasantly surprising PostgreSQL Indexing tricks
Partial Indexes
On one of the projects I worked on had the following code:
Message.where(sent: false).find_each(&:process)It runs every hour or so to send enqueued messages. The
<span style="color: #444444;">Message#process</span>
method set<span style="color: #444444;">sent</span>
to<span style="color: #444444;">true</span>
after sending the message. Only new messages had<span style="color: #444444;">sent</span>
set to<span style="color: #444444;">false</span>
. The code above doesn’t care about sent messages at all.The problem is that there was an index on
<span style="color: #444444;">messages.sent</span>
. It encompassed all rows. 99% of messages were marked as sent so 99% of entries in the index were unused. The app cared only about those 1% with<span style="color: #444444;">sent == false</span>
.PostgreSQL to the rescue! We can create an index on a subset of rows that satisfy a
<span style="color: #444444;">WHERE</span>
condition. In this case:CREATE INDEX messages_sent_true ON messages(sent) WHERE NOT sent;This command will index only those rows that aren’t marked as sent.
You can create partial indexes in vanilla Rails. Just run:
add_index :messages, :sent, where: "NOT sent"When used right, partial indexes are smaller and faster than full-table indexes.