A Tale on Concatenated Indexes: Master Roshi and Goku’s fireside chat

Indexes make our queries run as fast as a cheetah!

Right Practice. Right Results

This post is part of newsletter that I run “Scamming the Coding Interview”, which is geared towards continuous practice on concepts to ace the engineering interviews.


Once upon a time there was a master named Roshi whose relational database used to work like a blowing wind.

Reads used to be so fast that it reminded people of thunder strike. One day his disciple Goku got hold of him and asked, my database is smaller than yours, then why my queries are slower. What’s it that you know and I don’t?

To which Master Roshi replied:

Databases are really good to put data, not so much when it comes to reading it back. As database sizes start to grow, and queries start to take what seems like ages, you need to harness the power of “Concatenated Indexes”.

When queries are slow and you see a "SEQSCAN" in the explain analyze, you should know what needs to be done?

Use the index Goku!

Concatenated indexes gives you the power to create indexes according to the access patterns of your application. Thus it’s really important that we understand the data access patterns that our application will employ to create effective concatenated indexes.

Goku was amused! He knew he needed to understand all about concatenated indexes to leverage them to its fullest.
Goku asked the master, how does concatenated indexes work and how can I leverage them too?

What are concatenated indexes?

Master Roshi started sharing his wisdom acquired by maintaining large scale production databases:

The indexes that the database automatically creates consist of single key-value pair. This is not sufficient if we need to query multiple-columns. The most common type of multi-column index is called a “Concatenated Index”, which combines several keys into one by simply appending columns to one another.

This also means that the order of columns in a multi-column index is really important to ensure that they are used to their fullest.

A concatenated index is just a B-Tree index which keeps the data in a sorted list. The database considers each column according to its position in the index definition to sort the index entries. The first column is the primary sort criteria, and only if the first column as the same value the second column is used to sort the entries and so on.

Seeing that Goku was listening closely, Master Roshi gave an easy example:

It’s same as we search for an entry in a phone book by last name and then first name.
It should be noted that first name alone won’t be useful in searching in the index, but last name alone can still give results using the index.

Manuscripts of Master Roshi: Names arranged using last name and first name index.

Goku knew the master was on to something, so he followed up…

How do concatenated indexes work?

Master Roshi continued…

Concatenated indexes are generally implemented using a B-Tree, where key is made up of columns in order as define in the index.

CREATE INDEX idx_people_names ON people (last_name, first_name);

Here’s how a node in the B-Tree index looks like the postgres.

How a B-Tree index node looks like.

IndexTupleData: Contains pointer to either the next Node or to the data in the disk.
Bitmap: Stores if any of the index attributes are Null to save sapce.
Value: Contains the actual value of the column

Here’s how a B-Tree index will look like for index made on last name and first name.

Master Roshi’s manuscript showing the B-Tree for a last name and first name index.

Seeing the most fruitful tree ever, Goku asked the master to help him with an example search query.

How search queries are served using concatenated indexes:

Master Roshi continued:
Suppose we want to search for the name John Wick in the concatenated B-Tree index, here’s how the searching will proceed.

  1. We use the last name to get to the nodes that contain that last name.
  2. After that first name is used to furthur find the target node.

It seems clear now, why the order of columns in the index definition is important.

How to harness the power of concatenated indexes?

Master Roshi finally wanted to leave Goku with some more lessons to effectively employ concatenated indexes in his DB.

  1. Check which queries are frequent and take time in the database.
  2. Use explain analyze to study the query plan.
  3. This will help you understand the DB access patterns in your application.
  4. Pay special attention to the column order when creating concatenated indexes so that they can be reused.
    In a concatenated index, the first column can be searched for using the same index, the first two columns can be searched for using the same index etc

Conclusion

Goku was enlightened with the new knowledge that was indexed on him. Master Roshi was happy seeing that Goku applied the principles he taught him to make his read queries blazing fast and IOPS in control.
Goku knew that now he’ll be able to handle large databases without pre-mature partitioning.


Right Practice. Right Results

This post is part of newsletter that I run “Scamming the Coding Interview”, which is geared towards continuous practice on concepts to ace the engineering interviews.

Advertisements

One thought on “A Tale on Concatenated Indexes: Master Roshi and Goku’s fireside chat

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.