Blog

My thoughts and experiments.

© 2023. Dmitry Dolgov All rights reserved.

Query optimizer and chess?

Do not be afraid, this short blog post is about databases and does not contain any unreasonable metaphysical references. In fact, it’s a result of a journey through couple of whitepapers and books with an unexpected intersection of two rather different fields. I will try to describe everything step by step, so that we can see if chess has anything in common with database query optimizer.

PostgreSQL at low level: stay curious!

0. How to read me?

Yes, I know, it’s a long text, and it was my conscious decision to write it in this way. But fear not! Imagine that you read a book, take a look at the introduction and first few interesting sections, think about it and then find time to read further. I hope I’ve left enough references, so if you don’t get some ideas you’ll be able to read more information about interesting parts. Or you can even skip some sections, since they are relatively independent. This table of contents will guide you:

1. Introduction

In mathematics it is too easy to concentrate very hard on one specific problem. If your methods are not good enough to handle the problem, you will eventually grind to a halt, baffled and defeated. Often the key to further progress is to stand back, forget about the special problem, and see if you can spot any general features of the surrounding area which might be of use.

Ian Stewart, Concepts of Modern Mathematics.

It’s not a secret that databases are damn complicated systems. And they tend to run on top of even more complicated stacks of software. Nowadays you will not surprise anyone (or at least not that much) by running your database on a Kubernetes cluster or inside a virtual machine. It’s probably still questionable whether it’s good and appropriate, but this approach is something we have to face — sometimes it’s at least convenient, sometimes it allows to be more resource efficient and sometimes it’s the only available infrastructure in a company.

Jsonb: few more stories about the performance

As such, there’s really no “standard” benchmark that will inform you about the best technology to use for your application. Only your requirements, your data, and your infrastructure can tell you what you need to know.

For already some time I can’t stop doing interesting/useful/weird (one at the time) benchmarks to reveal some details on how to apply document-oriented approach in the world of relational databases. Finally, I decided that I have a critical mass of those details to share in the form of blog post. So welcome to The Benchmark Club, where we’re going to discuss what it takes to create a fair performance comparison of different databases. As you may guess, the first rule of The Benchmark Club is to never share a reproducible benchmarks. But we identify ourselves as a badass engineers, so we’re going to break this rule today.

How to convert your data to jsonb?

“How to start” is always a difficult question, and jsonb isn’t an exception. Here are few notes about converting different types of data into jsonb, that someone can find useful.

Basically there are three possible cases of data conversion:

  • Convert data from inside PostgreSQL
  • Convert data from other database
  • Convert plain data outside database