Logo
Paging Query

Paging Query #

Performance Bottlenecks #

Paging queries are fully supported by MySQL, PostgreSQL and Oracle, but only partially supported by SQLServer due to the complexity of paging queries. In terms of paging performance, paging with too large query offsets can lead to poor performance of the database fetching data. Take MySQL as an example:

SELECT * FROM t_order ORDER BY id LIMIT 1000000, 10

This SQL will make MySQL skip 1,000,000 rows and then fetch 10 rows without using indexes, so you can imagine the performance. And in the case of splitting the database and table (assuming 2 libraries), the SQL would be rewritten to ensure the correctness of the data as follows:

SELECT * FROM t_order ORDER BY id LIMIT 0, 1000010

This means that all the records before the offset are removed and only the last 10 records after sorting are fetched. This will further aggravate the performance bottleneck when the database is already slow in execution. Because the original SQL only needs to transfer 10 rows to the client, the rewritten SQL will transfer 1,000,010 * 2 rows to the client side.

Optimization of DBPlusEngine #

DBPlusEngine has been optimized in 2 ways.

First, stream processing + merge sorting is used to avoid memory overload. Since SQL rewriting inevitably takes up extra bandwidth, it does not cause memory spikes. Contrary to intuition, most people think that DBPlusEngine will load all 1,000,010 * 2 records into memory, which in turn will take up a lot of memory and cause a memory overflow. However, since the records of each result set are ordered, DBPlusEngine only fetches the current result set record of each slice for each comparison, and the only record that resides in memory is the current cursor pointing to the result set of the currently routed slice. The time complexity of merging m sorted arrays of length n by merging idea is O(mn(log m)), and generally the number of slices m is small enough to be considered as O(n) with little performance loss.

Secondly, DBPlusEngine further optimizes the queries that only fall into a single slice. Requests that fall to a single-slice query do not require SQL rewriting to ensure correctness of records, so in this case, DBPlusEngine does not perform SQL rewriting, thus saving bandwidth.

Paging solution optimization #

Since LIMIT does not query data by index, paging by ID is a better solution if the continuity of IDs can be guaranteed:

SELECT * FROM t_order WHERE id > 100000 AND id <= 100010 ORDER BY id

Or the next page by recording the ID of the last record of the last query result:

SELECT * FROM t_order WHERE id > 100000 LIMIT 10

Paging Subqueries #

Both Oracle and SQLServer paging needs to be handled by subqueries, and DBPlusEngine supports paging-related subqueries.

■ Oracle

Support for paging using rownum:

SELECT * FROM (SELECT row_.*, rownum rownum_ FROM (SELECT o.order_id as order_id FROM t_order o JOIN t_order_item i

ON o.order_id = i.order_id) row_ WHERE rownum <= ?) WHERE rownum > ?

The rownum + BETWEEN paging method is not currently supported.

■ SQLServer

Supports paging with TOP + ROW_NUMBER() OVER:

SELECT * FROM (SELECT TOP (?) ROW_NUMBER() OVER (ORDER BY o.order_id DESC) AS rownum, * FROM t_order o) AS temp

WHERE temp.rownum > ? ORDER BY temp.order_id

Support paging with OFFSET FETCH after SQLServer 2012:

SELECT * FROM t_order o ORDER BY id OFFSET ? ROW FETCH NEXT ? ROWS ONLY

Paging using WITH xxx AS (SELECT ⋯) is not currently supported. Hibernate-based SQLServer paging is not currently supported because the SQLServer paging statement automatically generated by Hibernate uses the WITH statement. Paging using two TOP + subqueries is also not supported at this time.

■ MySQL, PostgreSQL

Both MySQL and PostgreSQL support LIMIT paging without subqueries.

SELECT * FROM t_order o ORDER BY id LIMIT ? OFFSET ?