Unpredictable result sets when chunking queries without an order

Created on 18 August 2022, almost 2 years ago
Updated 12 February 2023, over 1 year ago

Problem/Motivation

When retrieving records, https://git.drupalcode.org/project/rest_oai_pmh/-/blob/dc118ef3b57937243... can return duplicate results, because the limit/offset chunking runs without an ORDER BY clause.

For reference, see https://www.postgresql.org/docs/current/queries-limit.html:

When using LIMIT, it is important to use an ORDER BY clause that constrains the result rows into a unique order. Otherwise you will get an unpredictable subset of the query’s rows. You might be asking for the tenth through twentieth rows, but tenth through twentieth in what ordering? The ordering is unknown, unless you specified ORDER BY.

Although this is PostgreSQL-specific documentation, the same applies to MySQL.
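To make the quoted warning concrete, here is a minimal Python sketch of limit/offset chunking. It is a simulation, not the module's code: `fetch_chunk` and the `scan_order` argument are hypothetical stand-ins for a query and for whatever row order the database happens to pick for that particular query.

```python
from typing import Dict, List, Optional, Sequence

Row = Dict[str, int]


def fetch_chunk(
    rows: Sequence[Row],
    limit: int,
    offset: int,
    scan_order: Sequence[int],
    order_by: Optional[str] = None,
) -> List[Row]:
    """Simulate SELECT ... [ORDER BY order_by] LIMIT limit OFFSET offset."""
    # scan_order is an assumption of this sketch: it models the arbitrary
    # order in which the engine happens to produce rows for one query.
    scanned = [rows[i] for i in scan_order]
    if order_by is not None:
        # With ORDER BY on a unique column the scan order no longer matters.
        scanned = sorted(scanned, key=lambda row: row[order_by])
    return scanned[offset:offset + limit]


rows = [{"id": 1}, {"id": 2}, {"id": 3}, {"id": 4}]
first_scan = [0, 1, 2, 3]   # order used by the first query
second_scan = [3, 2, 1, 0]  # a different order used by the second query

# Without ORDER BY, the second chunk repeats rows from the first chunk.
unordered_page_1 = [r["id"] for r in fetch_chunk(rows, 2, 0, first_scan)]
unordered_page_2 = [r["id"] for r in fetch_chunk(rows, 2, 2, second_scan)]

# With ORDER BY id, the chunks partition the rows cleanly.
ordered_page_1 = [r["id"] for r in fetch_chunk(rows, 2, 0, first_scan, "id")]
ordered_page_2 = [r["id"] for r in fetch_chunk(rows, 2, 2, second_scan, "id")]
```

In this sketch the unordered chunks overlap and records 3 and 4 are never returned, while the ordered chunks cover every record exactly once.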

Steps to reproduce

Ingest enough records that the result set exceeds the configured number of results per resumption token, so that paging is required.
Page through the results using the resumption tokens and notice that records from previous chunks appear in the current result set.
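When checking for the symptom by hand, a small helper can flag identifiers that show up in more than one chunk. This is only a sketch: `repeated_identifiers` is a hypothetical name, the identifiers are made-up examples, and collecting the identifiers from each response is left to the reader.

```python
from collections import Counter
from typing import Iterable, List, Sequence


def repeated_identifiers(pages: Iterable[Sequence[str]]) -> List[str]:
    """Return the identifiers that occur more than once across all pages."""
    counts = Counter(identifier for page in pages for identifier in page)
    return sorted(identifier for identifier, n in counts.items() if n > 1)


# Hypothetical identifiers gathered from two consecutive chunks.
pages = [
    ["oai:example:1", "oai:example:2"],
    ["oai:example:2", "oai:example:3"],
]
duplicates = repeated_identifiers(pages)
```

An empty result means no record was served twice. It does not prove that no record was skipped, so comparing the total against the expected record count is still worthwhile.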

Proposed resolution

Add a default ORDER BY clause so that chunked queries return rows in a deterministic order.
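The idea can be sketched with Python's built-in sqlite3 module. This is not the module's actual patch, and it does not use Drupal's database API. The `records` table, its columns, and `iter_chunks` are assumptions made for illustration.

```python
import sqlite3
from typing import Iterator, List


def iter_chunks(conn: sqlite3.Connection, chunk_size: int) -> Iterator[List[int]]:
    """Yield ids chunk by chunk, ordered by the unique primary key."""
    offset = 0
    while True:
        # ORDER BY on a unique column gives every query the same row order,
        # so consecutive LIMIT/OFFSET windows cannot overlap.
        rows = conn.execute(
            "SELECT id FROM records ORDER BY id LIMIT ? OFFSET ?",
            (chunk_size, offset),
        ).fetchall()
        if not rows:
            return
        yield [row[0] for row in rows]
        offset += chunk_size


conn = sqlite3.connect(":memory:")
# Hypothetical table standing in for the records being exposed.
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO records (id, title) VALUES (?, ?)",
    [(i, f"record {i}") for i in (5, 3, 1, 4, 2)],
)
chunks = list(iter_chunks(conn, 2))
```

Ordering by a unique column matters here: sorting on a column with ties would still leave the relative order of the tied rows unspecified, which is what the PostgreSQL documentation means by a "unique order".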

πŸ› Bug report
Status

Fixed

Version

2.0

Component

Code

Created by

🇨🇦 Canada JordanDukart PEI
