
Low Paged Query Efficiency in Massive Data? An In-depth Look at AnalyticDB Deep Paging Optimization Solution

This article examines the deep paging optimization solution in AnalyticDB and its key role in improving the efficiency of deep paged queries on massive data.

By Feng Li (Qichuan)

1. Background

Deep paging is a common high-cost query scenario in enterprises. It typically occurs when users need to retrieve data from pages deep in a massive result set (for example, beyond page 1,000), which sharply increases I/O overhead, drives up query latency, and causes other performance issues. For example, in e-commerce product search, results are usually sorted by sales volume or score so that users can quickly find high-quality products. To avoid stuttering caused by loading too much data at once, applications often display results page by page.

A common approach is to sort by the required columns and then use LIMIT + OFFSET to implement paging. Such a query is called a paged query: limit specifies the number of records displayed on each page, and offset indicates the starting offset of that page. For example, if the system displays 100 records per page, the SQL statement for querying page 1 is:

SELECT * FROM t_order ORDER BY id LIMIT 0, 100

The SQL statement for querying page 10001 is:

SELECT * FROM t_order ORDER BY id LIMIT 1000000, 100

1.1 What is the Deep Paging Problem?

When a request pages deep into the result set, the overhead of global sorting and repeated access to the primary table causes a sharp drop in database performance. This is known as the deep paging problem; the preceding query for page 10001 is a typical example.

The situation becomes even more complicated in a distributed database such as AnalyticDB for MySQL. To reduce the amount of data shuffled between nodes, each storage node performs a local topN calculation, and a Gather node aggregates the execution results of all storage nodes to obtain the final result set.

[Figure 1: Distributed execution of a paged query: storage nodes compute local topN results that a Gather node merges]

Therefore, to ensure correctness, the SQL statement needs to be rewritten before being executed by each storage node. For example, when querying page 10001, each storage node receives the following modified query:

SELECT * FROM t_order ORDER BY id LIMIT 0, 1000100

Originally, only 100 records were needed, but in the end a single node has to globally sort 1,000,100 × (number of storage nodes) records. The amount of sorted data grows linearly with the paging depth, causing severe performance degradation. Moreover, the sort is performed entirely in memory without spilling to disk, which greatly increases the risk of out-of-memory (OOM) errors.
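To make this concrete, take the 6-node cluster used in the benchmark in Section 3 as an example topology: to return the 100 records on page 10001, each of the 6 storage nodes sorts and ships its local top 1,000,100 records, and the node performing the global sort must process 6 × 1,000,100 = 6,000,600 records, only to discard all but 100 of them.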

1.2 Why Do We Use Deep Paging in AnalyticDB?

1.2.1 Data export

When you export large amounts of data at once, the client may become unstable, so the system performs paged queries to pull results in batches. This accounts for the vast majority of deep paging use cases in AnalyticDB for MySQL.

In a distributed environment, using the LIMIT OFFSET clause alone does not guarantee that data is returned in a deterministic order, which means different results may be returned each time you query the same page. Therefore, you need to explicitly add an ORDER BY clause to ensure that data is exported without duplication or omission.
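As a minimal sketch of this pattern (reusing the lineitem table and sort keys from the benchmark in Section 3), sorting on a unique key combination makes every page boundary stable across queries:

-- Page 1: ORDER BY on a unique key combination fixes the page boundaries.
SELECT * FROM lineitem ORDER BY l_orderkey, l_linenumber LIMIT 0, 100000

-- Page 2: the same ORDER BY guarantees no duplicates or omissions between pages.
SELECT * FROM lineitem ORDER BY l_orderkey, l_linenumber LIMIT 100000, 100000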

1.2.2 On-demand (paged) display of full results

The full query results are temporarily stored on the AnalyticDB for MySQL server and displayed page by page.

1.2.3 Concurrency control for business reports

If multiple queries are initiated concurrently on the same report, the independent requests increase the workload on the cluster and may return inconsistent data.

This optimization allows all requests within a time window to share a single global query, which significantly improves query performance and cluster stability.

2. AnalyticDB Deep Paging Solution

2.1 Overview of the Solution

Based on the preceding scenarios and user pain points, AnalyticDB for MySQL applies specialized optimization to deep paging to solve its performance problems. The core idea is a snapshot cache: during the initial query, the system generates and caches a snapshot of the result set (produced by the statement with the LIMIT OFFSET clause excluded). Information about the cached snapshot is maintained in a metadata table and identified by an ID, hereinafter referred to as the paging_id.

For each subsequent paging request, the system computes the paging_id from the SQL statement to locate the corresponding snapshot data, enabling batch retrieval of results. Note that this optimization does not guarantee strong consistency between the cached data and the original data.
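For example, a usage sketch based on the hint shown in Section 3: the first hinted query builds the snapshot, and any later query with the same SQL pattern (only the LIMIT OFFSET values differ) maps to the same paging_id and is served from the cache:

-- First request: generates the snapshot and registers it under a paging_id.
/*paging_cache_enabled=true*/
SELECT * FROM t_order ORDER BY id LIMIT 0, 100

-- Later request for page 10001: same pattern, same paging_id, read from cache.
/*paging_cache_enabled=true*/
SELECT * FROM t_order ORDER BY id LIMIT 1000000, 100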

2.2 Key Links

2.2.1 Internal conversion logic

[Figure 2: Internal conversion workflow of deep paging optimization]

a) AnalyticDB for MySQL receives a paged query that carries the deep paging optimization hint, parses the query, and performs the necessary checks (such as whether a LIMIT OFFSET clause is included). If the query does not meet the requirements, an error is reported.

b) The system parses the SQL statement, removes the LIMIT OFFSET clause, and computes a hash of the resulting pattern to generate the paging_id (see the sketch after this list). It then checks the metadata table for an existing paging cache. If no cache is available, it enters Create Paging Cache Mode; otherwise, it proceeds to Query Paging Cache Mode.

c) Create paging cache mode: The server executes the SQL statement with the outer LIMIT OFFSET clause removed and generates a data snapshot for caching. The entire process is executed asynchronously, and the generated cache files are registered in the cache file metadata table.

d) Query paging cache mode: The system looks up the paging cache metadata by paging_id, locates the corresponding cache files and the nodes that pass the validity check, and issues tasks to read the data. If the data is still being prepared, the query blocks until it is ready.
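Putting steps a) to d) together, a sketch of the internal rewrite: because the LIMIT OFFSET clause is stripped before hashing, every page of the same export shares one snapshot:

-- Query received from the client (any page):
/*paging_cache_enabled=true*/
SELECT * FROM t_order ORDER BY id LIMIT 1000000, 100

-- Statement used to build the snapshot and to compute the paging_id
-- (outer LIMIT OFFSET removed):
SELECT * FROM t_order ORDER BY id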

2.2.2 Automatic cache cleanup logic

Eviction based on expiration time: the system tracks the interval since each cache's last access. Any paging cache that is not accessed within the configured time interval (default: 10 minutes) is automatically removed.

2.2.3 Specify a validity period for a cache table

You can set a validity period for a paging cache. After the validity period expires, the cached snapshot becomes invalid; the next paged query that shares the same SQL pattern reads from the database again and refreshes the cache. This mechanism is typically used for concurrency control of business reports.
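As an illustrative sketch only: the paging_cache_ttl hint name below is hypothetical and merely stands in for however your AnalyticDB for MySQL version expresses the validity period:

-- Hypothetical hint: share one report snapshot within a 60-second window.
/*paging_cache_enabled=true paging_cache_ttl=60*/
SELECT * FROM t_order ORDER BY id LIMIT 0, 100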

2.3 Optimization Source Analysis

The cost of deep paging is rooted in the global sorting of the result set, and this is precisely what AnalyticDB for MySQL optimizes. Let's analyze it with a simple example: suppose you want to read pages 1-100, with 100 records per page.

General paged query

[Figure 3: Execution flow of a general paged query]

For each page, every storage node must return its local top (offset + 100) records, after which a single node collects and globally sorts (number of nodes) × (offset + 100) records and takes the final 100 records starting at the offset. In this process, each storage node sorts its locally stored data repeatedly (the red dotted box in the preceding figure), and the amount of data sorted globally grows linearly with the offset (the red solid box in the preceding figure).

Enable deep paging optimization

[Figure 4: Execution flow with deep paging optimization enabled, illustrated with the data export scenario]

In the paging cache creation phase, each node executes the SQL statement without the LIMIT OFFSET clause and caches all of its data. Subsequent paged queries read the cached data in a consistent order based on the limit and offset, as shown in the schematic diagram of the data export scenario above.

Compared with a general paged query, deep paging optimization eliminates the unnecessary ORDER BY operation during data export, greatly reducing execution time and resource consumption. In scenarios where full results are displayed on demand, only a single global sort is performed, and each shard is queried only once.

3. Benefits of Deep Paging Optimization Solution (Partial Results)

Based on the TPC-H 100 GB dataset, we evaluate the optimization benefits in the data export scenario. The test cluster contains 6 storage nodes; the test exports 1 million records with 100,000 records per page (10 pages in total).

-- Use general paging.
select * from lineitem order by l_orderkey,l_linenumber limit 0,100000

-- Enable deep paging optimization.
/*paging_cache_enabled=true*/
select * from lineitem limit 0,100000
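Pages 2 through 10 of the export keep the same SQL pattern and only advance the offset, so they all read from the snapshot created by the first hinted query; a sketch of the remaining pages:

-- Page 2 reuses the cached snapshot; only the offset changes.
/*paging_cache_enabled=true*/
select * from lineitem limit 100000,100000

-- Final page (page 10).
/*paging_cache_enabled=true*/
select * from lineitem limit 900000,100000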

We compare the query results across three dimensions: query RT (response time), CPU utilization, and peak memory usage. Overall:

• For a general paged query, every metric increases linearly with the paging depth because of global sorting: the larger the offset, the more data must be sorted globally.

• With deep paging optimization enabled, RT shows significant latency only in the cache generation phase, and that latency is related to the size of a single cache file. CPU usage and peak memory remain low throughout, as the optimization eliminates the ORDER BY operation.

Query RT

With single-concurrency execution, the average RT of general paged queries over the entire export process is 54,391 ms. After deep paging optimization is enabled, the average RT drops to 525 ms, a 102x performance improvement.

[Figure 5: Query RT comparison]

CPU utilization

The CPU utilization of general paged queries is about 80%. After deep paging optimization is enabled, CPU utilization peaks at about 20% during cache generation and is extremely low during subsequent paging.

Peak Memory Usage

The peak memory usage of general paged queries increases linearly with the paging depth, from 225 MB on page 1 to 14 GB on page 10. After deep paging optimization is enabled, peak memory is greatly reduced: the highest usage, during cache generation, is only 584 MB.

4. Summary

Focusing on user pain points, the AnalyticDB deep paging optimization solution provides the following advantages over general paged queries:

• In the data export scenario, the unnecessary ORDER BY operation is removed, solving the OOM issues that occur when exporting data through deep paging.

• In paging display scenarios, the snapshot cache ensures that only one global sort is performed and each shard is queried only once, dramatically improving query performance.

Additionally, the solution preserves ease of use: users benefit from deep paging optimization without any operational changes. In one Internet finance scenario, a typical slow query was optimized from 30s to 0.5s, a performance improvement of over 60x. We will continue to refine this feature and improve the user experience. Stay tuned for updates!
