Duplicates with batch export CSV

Created on 18 March 2020, about 5 years ago
Updated 22 February 2023, about 2 years ago

Hi,

I have a view set up with Views Data Export attached to export a CSV of the filtered rows. The number of rows in the export matches the view, but the export contains duplicates and some rows are missing.

I've turned batch export off and everything works ok.

Cheers Dan

🐛 Bug report

Status: Active
Version: 1.0
Component: Code
Created by: 🇬🇧 United Kingdom danharper

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • 🇲🇸 Montserrat eloivaque Catalan Countries

    I have the same problem; I applied the solution from #3 and it works for me.

    I think the problem is caused by the view's sort order combined with the batch size. In my case, I have one view of a content type that references 9 paragraphs, which generates 9 rows.

    Content 1 Paragraph 1
    Content 1 Paragraph 2
    Content 1 Paragraph 3
    Content 1 Paragraph 4
    Content 1 Paragraph 5
    Content 1 Paragraph 6
    Content 1 Paragraph 7
    Content 1 Paragraph 8
    Content 1 Paragraph 9

    The batch size is set to 5, and the view is sorted by the content's creation date.

    I think that when the batch fetches the first 5 elements with its SQL query, the query returns this:

    Content 1 Paragraph 2
    Content 1 Paragraph 4
    Content 1 Paragraph 3
    Content 1 Paragraph 8
    Content 1 Paragraph 6

    The next SQL query then returns the following batch, again sorted only by the content's creation date. Because every row has the same creation date, the rows can come back in a different order, which for me produces duplicate paragraph rows and drops other paragraphs:

    Content 1 Paragraph 2
    Content 1 Paragraph 4
    Content 1 Paragraph 3
    Content 1 Paragraph 8
    Content 1 Paragraph 6
    Content 1 Paragraph 1
    Content 1 Paragraph 2
    Content 1 Paragraph 4
    Content 1 Paragraph 9

    I solved it by sorting the view on the paragraph ID, which is a unique value, and it works for me.
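
The mechanism described in the comment above can be reproduced with a small simulation. This is only a sketch, not the module's code: `run_query`, `export`, the timestamp, and the two scan orders are hypothetical, chosen so that the output matches the rows listed in the comment. It assumes the database may return tied rows in a different order on each query, which SQL permits when the sort key is not unique.

```python
from collections import Counter

CREATED = 1584489600  # hypothetical creation timestamp shared by all nine rows
ALL_IDS = range(1, 10)  # paragraph IDs 1..9

# Hypothetical orders in which the database happens to read the rows on two
# separate queries. Nothing guarantees that they are the same.
SCAN_ORDER_QUERY_1 = [2, 4, 3, 8, 6, 1, 5, 7, 9]
SCAN_ORDER_QUERY_2 = [5, 7, 3, 8, 6, 1, 2, 4, 9]


def run_query(scan_order, sort_key, offset, limit):
    """Emulate SELECT ... ORDER BY sort_key LIMIT limit OFFSET offset."""
    rows = [(CREATED, pid) for pid in scan_order]
    rows.sort(key=sort_key)  # stable sort: tied rows keep their scan order
    return rows[offset:offset + limit]


def export(sort_key, batch_size=5):
    """Run one query per batch and concatenate the pages."""
    exported = []
    for batch, scan in enumerate([SCAN_ORDER_QUERY_1, SCAN_ORDER_QUERY_2]):
        exported += run_query(scan, sort_key, batch * batch_size, batch_size)
    return [pid for _, pid in exported]


def duplicates(ids):
    return sorted(pid for pid, n in Counter(ids).items() if n > 1)


def missing(ids):
    return sorted(set(ALL_IDS) - set(ids))


# Sort on the creation date only: every row ties with every other row.
by_created = export(lambda row: row[0])
# Sort on the creation date, then on the unique paragraph ID.
by_created_then_id = export(lambda row: (row[0], row[1]))

print(by_created, duplicates(by_created), missing(by_created))
# [2, 4, 3, 8, 6, 1, 2, 4, 9] [2, 4] [5, 7]
print(by_created_then_id)
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Both exports contain nine rows, so a row count alone does not reveal the problem; only the export with the unique tie-breaker contains each paragraph exactly once.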

  • Same story here: an additional sort criterion solved the issue; see #3.
    The bug is nasty and hard to grasp: the export works and even the number of records matches. You have to drill into your data to see the problem.

    This bug really needs more love. Also, I was wondering in the first place why we need to batch an export with only 10k rows anyway ...

  • 🇯🇵 Japan hktang

    We have encountered the same issue, and #3 ("Inconsistent results when doing batch exports") fixes our export.

    We had to add two IDs, the Content ID and the Author ID, for the export to work correctly.

  • 🇺🇸 United States DamienMcKenna NH, USA

    We were seeing this problem with an export too, only our export was losing hundreds of records. The table was sorted by creation date, and the problem was that data migrated a number of years back all had the same creation date; because of this, the pagination query would return a different result set over time. I added an extra sort field for the node ID, placed after the primary sort (the previously mentioned creation date), and now it works properly.

    It might be worth adding some official documentation about this, maybe even add something in the display plugin's form, as I suspect there might be lots of sites having problems with this without even knowing it.
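
The fix in the comment above, a node ID sort placed after the creation date sort, can be sketched with paginated SQL. This is an illustration under assumptions, not the query the module generates: the `node` table is a simplified stand-in, and `export_in_batches` and the timestamps are hypothetical. SQLite happens to return tied rows in a repeatable order, so the sketch shows the shape of the corrected query rather than reproducing the failure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in table: a unique node ID and a creation timestamp.
conn.execute(
    "CREATE TABLE node (nid INTEGER PRIMARY KEY, created INTEGER NOT NULL)"
)

MIGRATED = 1262304000  # hypothetical timestamp shared by all migrated rows
rows = [(nid, MIGRATED) for nid in range(1, 13)]
rows += [(13, 1500000000), (14, 1500000100), (15, 1500000200)]
conn.executemany("INSERT INTO node (nid, created) VALUES (?, ?)", rows)


def export_in_batches(order_by, batch_size=5):
    """Fetch pages with LIMIT/OFFSET until a page comes back empty.

    order_by is interpolated into the SQL, so pass trusted constants only.
    """
    exported = []
    offset = 0
    while True:
        page = conn.execute(
            f"SELECT nid FROM node ORDER BY {order_by} LIMIT ? OFFSET ?",
            (batch_size, offset),
        ).fetchall()
        if not page:
            break
        exported.extend(nid for (nid,) in page)
        offset += batch_size
    return exported


# "created" alone leaves the order of the twelve migrated rows unspecified.
# Adding the unique "nid" as a secondary sort makes every page well defined.
stable = export_in_batches("created, nid")
print(stable)
```

With `ORDER BY created, nid` each row has exactly one position in the result, so consecutive `LIMIT`/`OFFSET` pages cannot overlap or skip rows as long as the data does not change between queries.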

  • 🇺🇸 United States DamienMcKenna NH, USA

    Retitling the issue as it's a general problem of inconsistent data, rather than just duplicate records.

  • 🇮🇳 India vishal.kadam Mumbai

    I've identified the source of the batch export limit problem. It has to do with the sequence of query results.

    In order to prevent changes in the query result sequence, add the unique fields to the sort.
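
One way to tell whether a set of sort fields is unique enough is to look for rows that share the same sort key. The helper below is a hypothetical sketch (the name `tied_sort_keys` and the sample rows are invented for illustration) and assumes the rows have been loaded as dictionaries.

```python
from collections import Counter


def tied_sort_keys(rows, sort_fields):
    """Return the sort-key tuples that are shared by more than one row."""
    counts = Counter(tuple(row[field] for field in sort_fields) for row in rows)
    return [key for key, n in counts.items() if n > 1]


# Hypothetical sample data: two rows share the same creation date.
rows = [
    {"nid": 1, "uid": 7, "created": 100},
    {"nid": 2, "uid": 7, "created": 100},
    {"nid": 3, "uid": 8, "created": 200},
]

print(tied_sort_keys(rows, ["created"]))         # [(100,)]
print(tied_sort_keys(rows, ["created", "uid"]))  # [(100, 7)]
print(tied_sort_keys(rows, ["created", "nid"]))  # []
```

An empty result means the combined sort key identifies every row, so the order of the result is fully determined. In this sample, adding `uid` is not enough because both tied rows have the same author, while adding the unique `nid` is.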

  • 🇺🇸 United States jhedstrom Portland, OR

    I updated the project page with a note about this issue. Thanks all for digging in and figuring this out!

  • Merge request !44 "Document unique sorting for batch mode" (Merged) created by jhedstrom
  • Status changed to Needs review 9 months ago
  • 🇺🇸 United States jhedstrom Portland, OR

    This MR updates the README as well.

  • 🇬🇧 United Kingdom scott_euser

    Looks good, this was helpful for me too!

  • 🇬🇧 United Kingdom steven jones

    steven jones made their first commit to this issue's fork.

  • Status changed to Fixed 5 months ago
  • 🇬🇧 United Kingdom steven jones

    Committed the change to the readme.

  • Automatically closed - issue fixed for 2 weeks with no activity.

  • Status changed to Fixed about 1 month ago
  • 🇮🇳 India anmol singh Gurgaon 🌎

    1. Will this solution still work if I already have sort criteria such as changed and published, and I add nid and uid on top of them? Or should nid or uid be the only sort criteria?

    2. Will this also help with a cron job that gives inconsistent results when I use the batch process dataexport::processBatch?
