Tuning PostgreSQL Database Parameters to Optimize Performance

The recommended value is 25% of your total machine RAM. You should try some lower and higher values because in some cases we achieve good performance with a setting over 25%. The configuration really depends on your machine and the working data set. If your working set of data can easily fit into your RAM, then you might want to increase the shared_buffer value to contain your entire database, so that the whole working set of data can reside in cache. That said, you obviously do not want to reserve all RAM for PostgreSQL.

Replication from Percona Server for MySQL to PostgreSQL using pg_chameleon

Replication is one of the well-known features that allows us to build an identical copy of a database. It is supported in almost every RDBMS. The advantages of replication may be huge, especially HA (High Availability) and load balancing. But what if we need to build replication between 2 heterogeneous databases like MySQL and PostgreSQL? Can we continuously replicate changes from a MySQL database to a PostgreSQL database? The answer to this question is pg_chameleon.

For replicating continuous changes, pg_chameleon uses the mysql-replication library to pull the row images from MySQL, which are transformed into a jsonb object. A pl/pgsql function in postgres decodes the jsonb and replays the changes into the postgres database. In order to setup this type of replication, your mysql binlog_format must be “ROW”.

Fast Full-Text Search in PostgreSQL

Progress isn’t made by early risers. It’s made by lazy men trying to find easier ways to do something.

– Robert A. Heinlein

This is especially true when discussing databases. Often when discussing text search, the first thing that comes to mind is ElasticSearch – indeed it’s a great product, works well, but can often be a pain to setup and maintain. PostgreSQL in contrast dead simple to set up, runs anywhere, is easy to maintain and probably is “good enough”.

Why Full-Text Search in PostgreSQL

It’s easy to setup, maintain, and there’s already an effective deployment pattern in companies.

I run a company called MetaCortex, where all of our products are focused on understanding how people think. Almost exclusively, our processed data[1] is stored in PostgreSQL databases. Our website ProjectPiglet.com, for instance, uses it exclusively – even though daily we process tens of thousands of comments, with millions of database inserts & reads.

It’s often said, that there are better options for full-text search and technically, that’s true! However, pragmatism is often an engineers best friend and PostgreSQL is easy for us – as the option is almost always available.

SQL WITH Statement: INSERTT with RETURNING

with new_order as(
  insert into orders(email, total) 
  values ('rob@bigmachine.io',100.00) 
  returning *
), new_items as (
  insert into order_items(order_id, sku, price)
  select new_order.id, 'imposter-single',30.00
  from new_order
  returning *
), new_downloads as (
  insert into downloads(order_id, order_item_id)
  select new_order.id, new_items.id 
  from new_order, new_items
  returning *
)

select * from new_downloads;

I tack on a returning * from my insert statement for order_items and then I can use that to generate the downloads in a third query, this time using a select for the insert.