Hey just chiming in to say this is definitely happening, and the solution of adding an extra sort on a unique field does indeed seem to resolve the issue. Thanks!
- πΊπΈUnited States helioha Philadelphia, PA
This is still an issue and was a major headache to fix on a Drupal + GatsbyJS project.
In my case, media files were missing AND getting duplicated.
Adding a sort to JSON API fixed the issue, but I also found something else.
Two identical unordered SQL queries return a result in different order depending on the presence of
limit
parameter.This seems to be a bug with MySQL: LIMIT clause results in duplicate data across pages
People seem to disagree if this is a real bug or not. But long story short, MySQL does require an
order
to be specified for a consistent paginated result with no duplicates or missing records.Here is my problem query without the
limit
parameter and result.MySQL [db]> SELECT media.vid AS vid, media.mid AS mid FROM media media INNER JOIN media_field_data media_field_data ON media_field_data.mid = media.mid WHERE media_field_data.bundle = 'image' GROUP BY media.vid, media.mid; +------+-----+ | vid | mid | +------+-----+ | 159 | 1 | | 2 | 2 | | 160 | 3 | | 4 | 4 | | 5 | 5 | | 6 | 6 | | 7 | 7 | | 8 | 8 | | 9 | 9 | | 161 | 10 | | 11 | 11 | | 12 | 12 | | 13 | 13 | | 14 | 14 | | 15 | 15 | | 16 | 16 | | 17 | 17 | | 18 | 18 | | 19 | 19 | | 20 | 20 | | 21 | 21 | | 22 | 22 | | 151 | 23 | ... 82 rows in set (0.002 sec)
And here is the same query with limit parameter and result.
MySQL [db]> SELECT media.vid AS vid, media.mid AS mid FROM media media INNER JOIN media_field_data media_field_data ON media_field_data.mid = media.mid WHERE media_field_data.bundle = 'image' GROUP BY media.vid, media.mid limit 82; +------+-----+ | vid | mid | +------+-----+ | 2 | 2 | | 4 | 4 | | 5 | 5 | | 6 | 6 | | 7 | 7 | | 8 | 8 | | 9 | 9 | | 11 | 11 | | 12 | 12 | | 13 | 13 | | 14 | 14 | | 15 | 15 | | 16 | 16 | | 17 | 17 | | 18 | 18 | | 19 | 19 | | 20 | 20 | | 21 | 21 | | 22 | 22 | | 24 | 24 | | 25 | 25 | | 26 | 26 | | 46 | 46 | | 47 | 47 | | 48 | 48 | | 49 | 49 | | 50 | 50 | | 51 | 51 | | 65 | 63 | | 66 | 64 | | 67 | 65 | | 68 | 66 | | 69 | 67 | | 70 | 68 | | 93 | 27 | ... ... ... 82 rows in set (0.002 sec)
You can see how the order of the results are different. I'm using MySQL 8.0.
When I paginate the results, just like how JSON API does, without a sort, I do see missing AND repeated items on subsequent queries. Here is page 1.
MySQL [db]> SELECT media.vid AS vid, media.mid AS mid FROM media media INNER JOIN media_field_data media_field_data ON media_field_data.mid = media.mid WHERE media_field_data.bundle = 'image' GROUP BY media.vid, media.mid LIMIT 51 OFFSET 0; +------+-----+ | vid | mid | +------+-----+ | 2 | 2 | | 4 | 4 | | 5 | 5 | | 6 | 6 | | 7 | 7 | | 8 | 8 | | 9 | 9 | | 11 | 11 | | 12 | 12 | | 13 | 13 | | 14 | 14 | | 15 | 15 | | 16 | 16 | | 17 | 17 | | 18 | 18 | | 19 | 19 | | 20 | 20 | | 21 | 21 | | 22 | 22 | | 24 | 24 | | 25 | 25 | | 26 | 26 | | 46 | 46 | | 47 | 47 | | 48 | 48 | | 49 | 49 | | 50 | 50 | | 51 | 51 | | 65 | 63 | | 66 | 64 | | 67 | 65 | | 68 | 66 | | 69 | 67 | | 70 | 68 | | 93 | 27 | | 95 | 56 | | 96 | 43 | | 97 | 84 | | 100 | 52 | | 106 | 87 | | 107 | 85 | | 108 | 83 | | 109 | 82 | | 110 | 81 | | 111 | 80 | | 112 | 79 | | 113 | 78 | | 135 | 93 | | 136 | 95 | | 137 | 96 | | 140 | 98 | +------+-----+ 51 rows in set (0.002 sec)
and here is page 2.
MySQL [db]> SELECT media.vid AS vid, media.mid AS mid FROM media media INNER JOIN media_field_data media_field_data ON media_field_data.mid = media.mid WHERE media_field_data.bundle = 'image' GROUP BY media.vid, media.mid LIMIT 51 OFFSET 50; +------+-----+ | vid | mid | +------+-----+ | 180 | 62 | | 65 | 63 | | 66 | 64 | | 67 | 65 | | 68 | 66 | | 69 | 67 | | 70 | 68 | | 113 | 78 | | 112 | 79 | | 111 | 80 | | 110 | 81 | | 109 | 82 | | 108 | 83 | | 97 | 84 | | 107 | 85 | | 152 | 86 | | 106 | 87 | | 183 | 90 | | 148 | 91 | | 135 | 93 | | 149 | 94 | | 136 | 95 | | 137 | 96 | | 147 | 97 | | 140 | 98 | | 146 | 99 | | 179 | 100 | | 178 | 101 | | 177 | 102 | | 176 | 103 | | 172 | 104 | | 184 | 105 | +------+-----+ 32 rows in set (0.002 sec)
You can see that
mid
3 is missing andmid
s 64-68 are duplicating on second page. - πΊπΈUnited States m.stenta
This seems to be a bug with MySQL: LIMIT clause results in duplicate data across pages
We run Drupal on PostgreSQL and it is also an issue in that context.
We have been consistently hearing about it from API users of farmOS β , and we just tell everyone to add a
?sort=drupal_internal__id
to their queries. We do this automatically in our farmOS.js and farmOS.py API libraries to save our users the headache and head scratching when they can't find records in the API endpoints or find duplicate records.But this really should be fixed in core IMO. This is a major bug with the
jsonapi
module in my opinion, and it is not documented anywhere.