POSTGRESQL TOOLS FOR THE VISUALLY INCLINED

In the case of PostgreSQL, I’ve quickly come to the conclusion that bad tooling is one of the main reasons uptake is so much more difficult and convoluted for people coming from the SQL Server community. Even the devs I’m currently working with who have no specific affinity for databases at all recognize that PostgreSQL just feels like more of a black box than what they saw in their limited previous experience with SQL Server.

Let me get right to it: jumping from SQL Server to PostgreSQL is much more than changing a tool. PostgreSQL was built on Unix, with Unix in mind as the platform of choice, and it typically runs best when it’s sitting on some type of Unix box. The Unix world has a pretty specific idiom for how to go about things, and it certainly isn’t visual!

As someone who learned to code visually, I had to learn what each icon meant and the visual cues for what happens where. I came to understand property panes, the underlined letters in a button’s text that signaled keyboard shortcuts, and the idiomatic layout of each form. Executing a command meant pressing a button.

In the Unix world you write out that command. The checkboxes and dialogs are replaced by option flags and arguments. You install the tools you need and then look for the binaries that help you do a thing, then you interrogate them for help, typically using a --help flag (or just plain help).
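
For instance, the client tools that ship with PostgreSQL will all describe themselves if you ask:

# ask the PostgreSQL client and dump tools for their usage and options
psql --help
pg_dump --help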

The same is true for PostgreSQL. This is the thing that I think was stumping Ryan. He’s searching for visual tooling in a world that embraces a completely different idiom. It’s like going to Paris and disliking it (and France) because the barbecue is horrible.

I’m using the Mac Terminal app, but you can use any shell you like, including PowerShell and the Windows command line. I would strongly urge you, however, to crack open a Linux VM or Docker to get the “flavor” of working with PostgreSQL. You can, indeed, find barbecue in Paris, but it might help to explore the local cuisine.
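
If you take the Docker route, a minimal sketch using the official postgres image looks like this (the container name and password are just placeholders):

# run a throwaway PostgreSQL container
docker run --name pg-sandbox -e POSTGRES_PASSWORD=secret -d postgres
# open a psql session inside it
docker exec -it pg-sandbox psql -U postgres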

How to set up Windows Active Directory with PostgreSQL GSSAPI Kerberos Authentication

The first step is to create a regular user account in Windows Active Directory. The password can be anything, but it shouldn’t expire, and the account needs to be unique in the environment. In this instance, we’ll use pg1postgres.
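
One way to do that from PowerShell, assuming the Active Directory module is available (the password shown is a placeholder):

# create the service account with a non-expiring password (password is a placeholder)
New-ADUser -Name pg1postgres -AccountPassword (ConvertTo-SecureString "Secret123!" -AsPlainText -Force) -PasswordNeverExpires $true -Enabled $true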

Once the user account exists, we have to create a mapping between that user account and the service principal and create a keytab file. These steps can be combined using the Windows ktpass command, like so:
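
The exact values depend on your environment; in this sketch the hostname, realm, and password are placeholders:

ktpass /out postgres.keytab /princ postgres/pg.example.com@EXAMPLE.COM /mapuser pg1postgres /pass Secret123! /crypto AES256-SHA1 /ptype KRB5_NT_PRINCIPAL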

Citus: Scale-Out Clustering and Sharding for PostgreSQL

Citus runs on standard, unpatched PostgreSQL servers. The only modification is installing the extensions into the server. This is a unique and extremely important advantage: most clustered databases that are derived from another database inevitably lag behind and get stuck on an old version of the original database, unable to keep up with the grueling workload of constantly refactoring to build on new releases. Not so for Citus, which doesn’t fork Postgres—it extends it with Postgres’s own extension mechanisms. This means that Citus is positioned to continue innovating on its own software, while continuing to benefit from the strong progress that the PostgreSQL community is delivering on a regular cadence too.
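
In practice that installation is the standard Postgres extension workflow. A minimal sketch, assuming the Citus packages are already installed (the config path and database name are placeholders):

# Citus must be preloaded at server start, then enabled per database
echo "shared_preload_libraries = 'citus'" >> /etc/postgresql/16/main/postgresql.conf
# restart PostgreSQL, then create the extension in the target database
psql -d mydb -c "CREATE EXTENSION citus;"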

Loading Terabytes of Data From Postgres Into BigQuery

To load data into BigQuery we’re going to use the BigQuery CLI, which is a very versatile tool. You can install it using these instructions. As we’re on Linux, we’ll use a bash script to perform all the work. I’ll assume the BigQuery CLI is installed and authorized.
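
If you want to sanity-check that setup first, a quick sketch (assuming the Google Cloud SDK, which bundles bq, is installed):

# authorize the SDK and confirm bq can list datasets in your default project
gcloud auth login
bq ls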

Let’s create bigquery-upload.sh and add the following function in order to upload a single day from a specific table:

#!/bin/bash

# NOTE: PROJECT_ID and DATASET are placeholders; set them to your own
# BigQuery project and dataset before running.
PROJECT_ID="your-project-id"
DATASET="your_dataset"

# Export one day of one table from Postgres and load it into BigQuery.
# Arguments: table name, columns to select, day (YYYY-MM-DD).
function upload_day {
  table=$1
  sel=$2
  day=$3
  next_day=$(date -d "$day+1 days" +%Y-%m-%d)
  bq_suffix=$(date -d "$day" +%Y%m%d)
  echo "Uploading $table: $day..."
  # \copy streams the day's rows out of Postgres into a CSV file
  psql -c "\\copy (select $sel from $table where created_at >= '$day' and created_at < '$next_day') TO '$table-$day.csv' WITH CSV HEADER"
  gzip "$table-$day.csv"
  # load the compressed CSV into a per-day table, e.g. transactions20180301
  bq --project_id="$PROJECT_ID" load --allow_quoted_newlines --replace --source_format=CSV --autodetect --max_bad_records 100 "$DATASET.$table$bq_suffix" "$table-$day.csv.gz"
  rm "$table-$day.csv.gz"
}

This function takes three arguments: the table, the columns to select, and the date to upload. As you can see, it uses the \copy operation to download a CSV from Postgres and then compresses it. The BigQuery docs say that loading a compressed CSV is slower than loading an uncompressed one, but uploading the much larger uncompressed file almost always seems slower overall.

You can call this function by simply adding a line at the end of the script:

upload_day 'transactions' '*' '2018-03-01'
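
To backfill a longer range, you can loop the same function over consecutive days; a sketch (the dates are just an example):

# upload every day in March 2018 for the transactions table
day='2018-03-01'
while [ "$day" != '2018-04-01' ]; do
  upload_day 'transactions' '*' "$day"
  day=$(date -d "$day+1 days" +%Y-%m-%d)
done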