You can see how close you are to autovacuum freeze anywhere with this query:
select max(age(datfrozenxid)) from pg_database;
However, you don’t generally really care about that; what you care about is “how close am I to forced autovacuum freeze on a large table?” Here’s that:
SELECT relname, age(relfrozenxid) as xid_age,
pg_size_pretty(pg_table_size(oid)) as table_size
FROM pg_class
WHERE relkind = ‘r’ and pg_table_size(oid) > 1073741824
ORDER BY age(relfrozenxid) DESC LIMIT 20;More specifically, that says “give me the top 20 tables over 1GB, sorted by the age of their oldest XID”. Results look something like this:
relname | xid_age | table_size
————————+———–+————
postgres_log | 199785216 | 12 GB
statements | 4551790 | 1271 MB
normal_statement_times | 31 | 12 GB
Postgres: Horizontalal Sharding
How should I shard my databases?
This is entirely dependent on the access patterns of your application. A good rule, though, is to look at your indexes. If every query goes through an index on
:user_id
, then chances are that you should shard on:user_id
. If half of your queries go through:user_id
and the other half go through:job_id
, then you may need to create two sets of shards, each with its own model, and have your application write to both.
Tuning Filesystem for Postgres (page 63/101)
Turns out default ZFS block size is 128 Kb, and Postgres page size is 8K
Every small write that touched a page, had to write 128 Kb of a ZFS block to the disk
This may be good for huge sequential writes, but not for random access, lots of tiny writes
Scaling PostgreSQL Performance Using Table Partitioning
Partition Function
A partition functionis a Stored Procedure that determines which child table should accept a new record. The master table has a trigger which calls a partition function. There are two typical methodologies for routing records to child tables: