Cross-shard pagination queries

Problem

  • In sharding cases, how will the following pagination query be executed

SELECT * FROM order_tab ORDER BY id LIMIT 4 OFFSET 2

Global query

Steps

  1. "LIMIT x OFFSET y" gets transformed to "LIMIT x+y OFFSET 0".

  1. Suppose that there are N tables hit. After getting results from N tables, it could be merged together using "merge sort".

Cons - Low performance

High network volume

  • For a query "LIMIT 10 OFFSET 1000"

  • In cross-shard query scenarios, it will be tranformed to "LIMIT 1010 OFFSET 1000".

  • If there are N table hit, in total N * 1010 rows need to be transmitted.

High memory footprint

  • Since query results for N tables need to sit inside memory and then merged, the memory footprint will be high.

Average pagination

Steps

  • Divide the LIMIT and OFFSET by 2

Cons - Accuracy

  • It will get an approximate answer in most cases.

  • The improved version could be an weighted average pagination.

Forbid pagination

  • Cross shard queries are forbidden.

  • Assume that one page data is only in one shard. Then the query could be simplified.

Intermediate table

  • Use an additional table for sorting purpose.

  • Assume that we use update time for ranking purpose

    1. During search, we could first look up inside intermediate table.

    2. Then goes to the target DB, and looks for the message.

Intermediate table

Last updated

Was this helpful?