29
.
04
.
2024
25
.
03
.
2024
Backend
Postgresql

Indexing partitioned table - Postgres Stories

Jarosław Kowalewski
Ruby Developer

Today, I want to cover short topic - indexing partitioned tables. Indexes are inseparable part of every RDBMS and allow us to get our data quicker, faster and more efficiently.

Just like with typical PostgreSQL tables, indexes can be applied to partitioned tables. Moreover, as with regular tables, this can be done non-concurrently (usually for new tables) or concurrently (for existing ones). Let’s take a look at both methods.

New table - new indexes

For a newly created partitioned table it is a piece of cake - everything looks exactly as for classical table. Let’s assume there is a partitioned table users with couple created partitions. Partitioning type is not important here. I want to add B-tree index on email column on users table, what actually leads to apply index on every partition of this table.

CREATE INDEX idx_users_email ON users (email);

Its really quick! The index, once applied to the parent table, automatically propagates over the child tables. Additionally, any partitions added in the future will automatically receive all necessary indexes.

However, there is one downside to this convenient approach. It cannot be done concurrently, meaning for existing tables, especially those with large data sets, it's not usable. If tried, the table will be locked while the index is being applied and it may take… long.

Existing tables like concurrency

Luckily, for existing partitioned tables it is possible to apply concurrent solution which doesn’t lock table forever. There's a simple workaround to enable concurrency in this case.

One way

CREATE INDEX idx_users_email ON ONLY users (email);

CREATE INDEX CONCURRENTLY idx_users_email_1
    ON users_1 (email);
ALTER INDEX idx_users_email
    ATTACH PARTITION idx_users_email_1;

CREATE INDEX CONCURRENTLY idx_users_email_2
    ON users_2 (email);
ALTER INDEX idx_users_email
    ATTACH PARTITION idx_users_email_2;

// repeat for all partitions

SELECT * FROM pg_index WHERE pg_index.indisvalid = false;

In the first step, we create an index on the parent table using the ONLY option, which serves as a marker for indexing this table. However, this index is initially invalid and cannot be used effectively for querying the table.

To rectify this, we need to create indexes separately for each partition. This operation can be performed concurrently. Each index then needs to be attached to the index in the parent table.

If we are not 100% how many partitions belongs to the table, we can list them via psql:

\d+ users;

As a result psql client should print similar info to this:

Partition key: LIST (column)
Partitions: users_1 FOR VALUES IN ('value1'),
                        users_2 FOR VALUES IN ('value2'),
                        users_3 FOR VALUES IN ('value3')
                        ...

Once all indexes from partitions are correctly assigned, the index for the partitioned table should be marked as valid. You can confirm this by running a final SELECT query on the pg_index table. This will return any invalid indexes present in the database.

…Or another

It is also possible to try a slightly different approach - firstly create indexes on partitions and then create main index on parent table:

CREATE INDEX CONCURRENTLY idx_users_email_1
    ON users_1 (email);
CREATE INDEX CONCURRENTLY idx_users_email_2
    ON users_2 (email);

// repeat for all partitions

CREATE INDEX idx_users_email ON users (email);

You can choose to skip the ONLY option, as indexes on corresponding fields in partitions should automatically be assigned to the main index. However, the first solution is safer (for instance, if you missed an index in any partition) and is recommended by the PostgreSQL documentation.

And that’s it! As you can see, indexing partitioned tables is not very complicated - especially, comparing to everything else, when it comes to partitioning. Whether it's a smooth process for new tables or a bit tricky for the old ones, the end goal is a well-optimized, finely-tuned table that can handle your queries efficiently. It's a simple but crucial step in PostgreSQL's performance playbook.

Articles in this series

Jarosław Kowalewski
Ruby Developer

Check my Twitter

Check my Linkedin

Did you like it? 

Sign up To VIsuality newsletter

READ ALSO

Ula Sołogub - SQL Injection in Ruby on Rails

The Deadly Sins in RoR security - SQL Injection

14
.
11
.
2023
Urszula Sołogub
Backend
Ruby on Rails
Software
Michal - Highlights from Ruby Unconf 2024

Highlights from Ruby Unconf 2024

14
.
11
.
2023
Michał Łęcicki
Conferences
Visuality
Cezary Kłos - Optimizing Cloud Infrastructure by $40 000 Annually

Optimizing Cloud Infrastructure by $40 000 Annually

14
.
11
.
2023
Cezary Kłos
Backend
Ruby on Rails

Smooth Concurrent Updates with Hotwire Stimulus

14
.
11
.
2023
Michał Łęcicki
Hotwire
Ruby on Rails
Software
Tutorial

Freelancers vs Software house

14
.
11
.
2023
Michał Krochecki
Visuality
Business

Table partitioning in Rails, part 2 - Postgres Stories

14
.
11
.
2023
Jarosław Kowalewski
Backend
Postgresql
Ruby on Rails

N+1 in Ruby on Rails

14
.
11
.
2023
Katarzyna Melon-Markowska
Ruby on Rails
Ruby
Backend

Turbo Streams and current user

29
.
11
.
2023
Mateusz Bilski
Hotwire
Ruby on Rails
Backend
Frontend

Showing progress of background jobs with Turbo

14
.
11
.
2023
Michał Łęcicki
Ruby on Rails
Ruby
Hotwire
Frontend
Backend

Table partitioning in Rails, part 1 - Postgres Stories

14
.
11
.
2023
Jarosław Kowalewski
Postgresql
Backend
Ruby on Rails

Table partitioning types - Postgres Stories

14
.
11
.
2023
Jarosław Kowalewski
Postgresql
Backend

Indexing partitioned table - Postgres Stories

14
.
11
.
2023
Jarosław Kowalewski
Backend
Postgresql
SQL Views in Ruby on Rails

SQL views in Ruby on Rails

14
.
11
.
2023
Jan Grela
Backend
Ruby
Ruby on Rails
Postgresql
Design your bathroom in React

Design your bathroom in React

14
.
11
.
2023
Bartosz Bazański
Frontend
React
Lazy Attributes in Ruby - Krzysztof Wawer

Lazy attributes in Ruby

14
.
11
.
2023
Krzysztof Wawer
Ruby
Software

Exporting CSV files using COPY - Postgres Stories

14
.
11
.
2023
Jarosław Kowalewski
Postgresql
Ruby
Ruby on Rails
Michał Łęcicki - From Celluloid to Concurrent Ruby

From Celluloid to Concurrent Ruby: Practical Examples Of Multithreading Calls

14
.
11
.
2023
Michał Łęcicki
Backend
Ruby
Ruby on Rails
Software

Super Slide Me - Game Written in React

14
.
11
.
2023
Antoni Smoliński
Frontend
React
Jarek Kowalewski - ILIKE vs LIKE/LOWER - Postgres Stories

ILIKE vs LIKE/LOWER - Postgres Stories

14
.
11
.
2023
Jarosław Kowalewski
Ruby
Ruby on Rails
Postgresql

A look back at Friendly.rb 2023

14
.
11
.
2023
Cezary Kłos
Conferences
Ruby

Debugging Rails - Ruby Junior Chronicles

14
.
11
.
2023
Piotr Witek
Ruby on Rails
Backend
Tutorial

GraphQL in Ruby on Rails: How to Extend Connections

14
.
11
.
2023
Cezary Kłos
Ruby on Rails
GraphQL
Backend
Tutorial

Tetris on Rails

17
.
03
.
2024
Paweł Strzałkowski
Ruby on Rails
Backend
Frontend
Hotwire

EURUKO 2023 - here's what you've missed

14
.
11
.
2023
Michał Łęcicki
Ruby
Conferences