Cursor based Pagination with multi-key cursor
Let's say for sake of discussion that we want to paginate a set of transactions to be presented in a mobile app with infinite scroll implemented. We will also assume that for sake of discussion our service is backed by PostgreSQL as our database.
We obviously want to see these transactions in relation to time. Probably the most recently created transactions first and then the older transactions as we scroll through the infinite scroll. So we will likely want to order these transactions by their timestamp in descending order.
Cursor based Pagination
Because of our use case we can use a strategy called Cursor Based pagination. In this strategy we pass a "cursor" identifying the transaction of the last item in the previous pages results and a limit to get back the next page.
Use transaction_datetime
as cursor
You might be thinking, well we have the transaction_datetime
, can't we just
use that as the "cursor". The answer is of course you could. However, if you
have timestamps that aren't super granular. This happens when dealing with
banks & transactions as you will get back a full timestamp object, but it will
only have information for the day, month, and year. This means in your dataset
you may have a decent number of transactions that are all effectively at the
same timestamp.
This is problematic because you have to either do a where condition in your
query of <
or <=
to the cursor. If we do <
we will miss out on
transactions that should be visible. If we do <=
we will end up with
duplicate transactions. Neither of these are acceptable from a user experience
standpoint.
What if we combine transaction_datetime
with another cursor
Well if we for example had a UUID for the id
column of a transaction we could
use that, as ordering of UUIDs would be consistent. This would allow us to have
a query like the following.
SELECT * FROM transactions ORDER BY transaction_datetime DESC, id DESC;
In the above we use a primary order of transaction_datetime
with a secondary
order of id
. This makes it so that if there are multiple transactions
associated with the same transaction_datetime
that they will all be
consistently ordered by the id
.
Getting the Next Page
Somehow we have to use the cursor now composed of two values
transaction_datetime
and id
to figure out where the dataset should start
for the next page of values.
If we tried to do something like the following
SELECT * FROM transactions WHERE transaction_datetime < '2021-02-23 04:23:23' AND id < 'A234U-23432aue-2343aue-aauo234'
We would have a problem because we would be filtering out transactions that are less than the datetime but greater than the UUID. In turn making the resulting set not what we want.
So instead we have to use a feature of PostgreSQL called Composite Values in the where query. I think of this as allowing us to create a tuple of values that it then knows how to compare in combination rather than individually.
SELECT * FROM transactions WHERE (transaction_datetime, id) < ('2021-02-23 04:23:23', 'A234U-23432aue-2343aue-aauo234')
With this we actually get back the correct set as we are sorting on the combined value not the individual values.