There is a lot to say about parallelism in PostgreSQL. We have come a long way since I wrote my first post on this topic (Parallel Sequential Scans). Each of the past three releases (including PG-11, which is in beta) has had parallel query as a major feature, which in itself shows how useful the feature is and how much work is going into it. You can read more about parallel query in the PostgreSQL docs or in a blog post on this topic by my colleague Robert Haas. The intent of this blog post is to talk about parallel index scans on btree indexes, a feature released in PostgreSQL 10.
To demonstrate how the feature works, here is an example of TPC-H Q6 at scale factor 20 (which means an approximately 20GB database). Q6 is the Forecasting Revenue Change query. It quantifies the amount of revenue increase that would have resulted from eliminating certain company-wide discounts in a given percentage range in a given year. This type of "what if" query can be used to look for ways to increase revenues.
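To make the description of Q6 concrete, here is a minimal Python sketch of what the query computes: the sum of extended price times discount over line items shipped in a given year, with a discount in a narrow range and a quantity below a threshold. The rows are made up for illustration, and the default parameters (1994, 0.06, 24) are assumed example values, not results from the 20GB run.

```python
from datetime import date
from decimal import Decimal

# Hypothetical lineitem rows:
# (l_shipdate, l_discount, l_quantity, l_extendedprice)
ROWS = [
    (date(1994, 3, 15), Decimal("0.06"), 10, Decimal("1000.00")),  # qualifies
    (date(1994, 7, 1), Decimal("0.05"), 23, Decimal("200.00")),    # qualifies
    (date(1995, 1, 1), Decimal("0.06"), 10, Decimal("500.00")),    # shipped outside the year
    (date(1994, 5, 5), Decimal("0.10"), 10, Decimal("500.00")),    # discount out of range
    (date(1994, 5, 5), Decimal("0.06"), 30, Decimal("500.00")),    # quantity too large
]


def q6_revenue(rows, year=1994, discount=Decimal("0.06"), quantity=24):
    """Sum l_extendedprice * l_discount over the rows that pass Q6's filters."""
    lo = discount - Decimal("0.01")
    hi = discount + Decimal("0.01")
    start, end = date(year, 1, 1), date(year + 1, 1, 1)
    return sum(
        (
            price * disc
            for ship, disc, qty, price in rows
            if start <= ship < end      # shipped within the given year
            and lo <= disc <= hi        # discount within +/- 0.01 (inclusive)
            and qty < quantity          # quantity below the threshold
        ),
        Decimal("0"),
    )


if __name__ == "__main__":
    print(q6_revenue(ROWS))
```

Only the first two rows pass all three filters, contributing 60 and 10 respectively. The range condition on the ship date is what makes an index scan on that column attractive for this query.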
The basic idea is quite similar to parallel heap scans, where each worker (including the leader whenever possible) scans a block (all the tuples in a block) and then gets the next block that needs to be scanned. The parallelism is implemented at the leaf level of the btree. The first worker to start a btree scan descends the tree until it reaches a leaf, and the others wait until it has done so. Once the first worker has read the leaf block, it sets the next block to be read, wakes one of the workers waiting to scan blocks, and then proceeds to scan the tuples from the block it has read. From then on, each worker, after reading a block, sets the next block to be read and wakes up the next waiting worker. This continues until no more pages are left to scan, at which point we end the parallel scan and notify all the workers.
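The hand-off described above can be sketched as a small simulation with Python threads. This is only an illustration of the coordination pattern, not PostgreSQL's implementation: the `LEAVES` table, the `ParallelScanState` class and its method names are all hypothetical, and the descent from the root is replaced by a hard-coded first leaf.

```python
import threading

# Hypothetical leaf level: page id -> (index tuples, right sibling or None).
LEAVES = {
    0: ([1, 2, 3], 1),
    1: ([4, 5, 6], 2),
    2: ([7, 8, 9], 3),
    3: ([10, 11], None),
}
FIRST_LEAF = 0  # stands in for descending the tree from the root


class ParallelScanState:
    """Shared scan state (an assumption standing in for shared memory)."""

    def __init__(self):
        self.cond = threading.Condition()
        self.status = "not_started"  # not_started | advancing | idle | done
        self.next_page = None

    def seize(self):
        """Wait for a turn. Returns (go, page); page is None for the
        first worker, which must find the first leaf itself."""
        with self.cond:
            while True:
                if self.status == "done":
                    return False, None
                if self.status == "not_started":
                    self.status = "advancing"
                    return True, None
                if self.status == "idle":
                    self.status = "advancing"
                    return True, self.next_page
                self.cond.wait()  # another worker is reading a page

    def release(self, next_page):
        """Publish the next page to read and wake one waiting worker."""
        with self.cond:
            self.next_page = next_page
            self.status = "idle"
            self.cond.notify()

    def done(self):
        """No pages left: end the scan and notify all workers."""
        with self.cond:
            self.status = "done"
            self.cond.notify_all()


def worker(state, out):
    while True:
        go, page = state.seize()
        if not go:
            return
        if page is None:
            page = FIRST_LEAF
        tuples, right = LEAVES[page]  # "read" the leaf page
        if right is None:
            state.done()
        else:
            state.release(right)
        out.extend(tuples)  # process tuples after handing off the scan


def parallel_index_scan(n_workers=3):
    state = ParallelScanState()
    results = [[] for _ in range(n_workers)]
    threads = [threading.Thread(target=worker, args=(state, results[i]))
               for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(parallel_index_scan())
```

The point of the design is that only the step of reading a page and finding its successor is serialized. Processing the tuples happens after the hand-off, so several workers can be working on different leaf pages at the same time. Which worker ends up with which page varies from run to run, but every tuple is returned exactly once.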