I noticed that many articles, tutorials, and courses do not implement pagination correctly, leading to issues such as data inconsistency and degraded performance. So, in this article, I show how to implement pagination correctly in MongoDB using the Aggregation Framework, and how to avoid common mistakes.
I've put a lot of effort into creating it, and I would be thrilled if you could take a look and share any feedback you have. Your input would be greatly appreciated, as I am just starting my blog.
Thanks for sharing that article. I just checked it.
I personally don't like two things about that approach:
You can only fetch the next batch based on the current batch, which means you have to start from the first page in order to get the second, and so on. If you want to jump directly to, say, page 5, you cannot do it.
The frontend app will not get the total number of items, which is really important for UX in many applications.
I would say that article covers only a specific use case, not pagination in general. But it's definitely a nice solution for the use case it covers.
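For context, the "next batch" approach can be sketched roughly like this (a hypothetical Python/PyMongo-style helper; the `_id`-based cursor and `last_id` parameter are my assumptions about how such an article typically implements it, not the article's actual code):

```python
def next_batch_pipeline(last_id, page_size):
    """Build an aggregation pipeline for 'fetch the batch after last_id'.

    Because each request only knows the previous batch's last _id,
    you must walk the pages in order -- there is no way to jump
    straight to page 5, and no stage here produces a total item count.
    """
    match_stage = {} if last_id is None else {"_id": {"$gt": last_id}}
    return [
        {"$match": match_stage},
        {"$sort": {"_id": 1}},
        {"$limit": page_size},
    ]

# First page: no cursor yet; next page: pass the last _id you received.
first_page = next_batch_pipeline(None, 10)
next_page = next_batch_pipeline(42, 10)
```

The upside of this shape is that `$match` on an indexed `_id` stays fast at any depth; the downside is exactly the two points above.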
@NeNaD, when we do skip, MongoDB actually scans all the documents until it starts returning results, so it is bad to perform such operations on large collections.
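To illustrate that cost with a toy in-memory simulation (this is not MongoDB itself, just a model of the scan behavior): like the server-side skip, the scan still has to walk past every skipped document before it returns anything.

```python
def paginate_with_skip(docs, page, page_size):
    """Return (page_of_docs, docs_examined) for a skip/limit-style scan.

    Mirrors the skip behavior described above: skip does not 'seek',
    it discards documents one by one, so deep pages examine many
    documents just to return a few.
    """
    skip = page_size * (page - 1)
    docs_examined = min(len(docs), skip + page_size)  # everything walked so far
    return docs[skip:skip + page_size], docs_examined

docs = list(range(100_000))
page_docs, examined = paginate_with_skip(docs, 500, 10)
# Page 500 returns 10 docs but had to walk past 4,990 skipped ones.
```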
Hi Nenad! I have a participateTransaction collection that stores campaign participation transactions with their voucher codes. There are 22 million users on the platform, and the total number of records in the collection grew to 67 million in one year. The admin panel has a page that provides filter and search functionality to admin users. I am using Java, Spring Boot, and MongoRepository for DB operations. I already use indexing and pagination, but the query takes about 4-6 minutes. I want to get this down to at most 30 seconds. I applied an archiving solution for records older than one year, but since participants can use their voucher codes for up to a year after participating, I have to keep at least one year of data in the participateTransaction collection.
Is there a vertical solution that you think could bring the query time down to 30 seconds?
Hello,
Thanks for sharing such an idea!
I have a few questions/concerns, however:
According to the MongoDB documentation, the $facet stage doesn't use indexes if it is the first stage in the pipeline:
If the $facet stage is the first stage in a pipeline, the stage will perform a COLLSCAN. The $facet stage does not make use of indexes if it is the first stage in the pipeline.
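For concreteness, here is the shape of a pipeline that opens with $facet, written as Python/PyMongo-style dicts (the article's actual pipeline may differ; the page values are illustrative). No stage runs before $facet, so nothing can use an index and the quoted COLLSCAN warning applies:

```python
page, page_size = 2, 10  # illustrative values

# $facet as the very first stage: per the quoted docs, this performs
# a COLLSCAN, since no index-eligible stage runs before it.
facet_first_pipeline = [
    {"$facet": {
        "totalCount": [{"$count": "count"}],
        "data": [
            {"$skip": page_size * (page - 1)},
            {"$limit": page_size},
        ],
    }},
]
```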
So it seems like the more optimal way would be:
$match
$sort
$facet (with a count stage and [{$skip: pageSize * (page - 1)}, {$limit: pageSize}])
$match and $sort will benefit from indexes because of the $sort + $match Sequence Optimization,
and then $sort and $limit will benefit from $sort + $limit Coalescence.
What do you think about such adjustments to the original pipeline, which had the $facet stage as the first one?
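The adjusted $match → $sort → $facet pipeline could be sketched like this (Python/PyMongo-style dicts; the filter, sort spec, and field names are illustrative):

```python
def paginated_pipeline(filters, sort_spec, page, page_size):
    """$match -> $sort -> $facet, so the first two stages can use indexes.

    The $facet then computes both the total count and the requested
    page from the already-filtered, already-sorted stream.
    """
    return [
        {"$match": filters},
        {"$sort": sort_spec},
        {"$facet": {
            "totalCount": [{"$count": "count"}],
            "data": [
                {"$skip": page_size * (page - 1)},
                {"$limit": page_size},
            ],
        }},
    ]

# Page 3, 20 items per page, with hypothetical filter and sort fields.
pipeline = paginated_pipeline({"status": "active"}, {"createdAt": -1}, 3, 20)
```

A single round trip still returns one document containing both the page of data and the total count, which is exactly what the frontend needs for classic numbered pagination.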
In practice, the pipeline should follow the steps you described:
$match
$sort
$facet
The $facet stage should be at the end of the pipeline for optimal pagination.
In my article, I focused solely on the pagination logic and kept it simple with just the $facet stage. However, I’ll update the article to include this important optimization for a more complete solution.
Thanks for the great article!
I have a question regarding the alerts I might get from MongoDB Atlas, as I have faced this before: the Query Targeting: Scanned Objects / Returned ratio. I assume this approach will trigger the warning, because I am scanning all the documents I want but returning only one document as a result, which contains the data plus the total count.
Any ideas regarding that?