- ๐ซ๐ทFrance slayne40
For pager type urls, verify if the selector return an url.
- ๐ซ๐ฎFinland kekkis Pirkkala
I believe the patches starting from #112 contain an error when using the 'paginator' type and when the $selector_data is scalar. The comparison operator between $num_items and $selector_data has shifted from '==' to '!=' for an unknown reason. How this manifests is that only the first page ever gets processed.
In this new patch I try to explain the thinking in the comment a bit better, plus of course fix the operator.
Leaving as Needs work since there still are no tests.
- last update
over 1 year ago Patch Failed to Apply - last update
over 1 year ago 185 pass, 2 fail - ๐บ๐ธUnited States sassafrass
I am testing using the latest patch. In my yaml, I am using the paginator type because the urls provided in the JSON endpoints are not valid for my use case.
urls: - http://services.baltimorecountymd.gov/api/hub/pets/pets?status=Adoptable - http://services.baltimorecountymd.gov/api/hub/pets/pets?status=Lost pager: type: paginator selector: /metaData/ paginator_type: page_number default_num_items: 10 page_key: page size_key: recordsPerPage
The paginated urls are generated as expected. However, it never stops generating urls. Urls being generated are valid but have no records. For example see: https://services.baltimorecountymd.gov/api/hub/pets/pets?status=Adoptabl....
My particular use case has non-scalar $selector_data but the JSON is not in format expected by this segment of code, which always evaluates to true.
else { // If we have an array of rows if (count($selector_data) > 0) { $next_urls[] = Url::fromUri($path['path'], [ 'query' => $path['query'], 'fragment' => $path['fragment'], ])->toString(); } } }
- Merge request !81Issue #2640516: Support paging through multiple requests โ (Merged) created by jcandan
- Open on Drupal.org โCore: 9.5.x + Environment: PHP 8.1 & MySQL 8last update
over 1 year ago Not currently mergeable. - last update
over 1 year ago 189 pass, 2 fail - ๐ซ๐ฎFinland kekkis Pirkkala
The return type of
\Drupal\migrate_plus\Plugin\migrate_plus\data_parser\Json::getSourceData
cannot be set toarray
reliably since it might returnNULL
(per its own code) or any scalar (fromjson_decode
). This is my suggestion how to fix it, even though usingmixed
might do as well. - Status changed to Needs review
about 1 year ago 8:27pm 7 November 2023 - last update
about 1 year ago 189 pass, 1 fail - ๐บ๐ธUnited States jacobbell84
Comment 100 introduced a logic change to the getSourceData function in the Json data parser plugin that I believe wasn't intended.
// Otherwise, we're using xpath-like selectors. $selectors = explode('/', trim($item_selector, '/')); $return = $this->sourceData; foreach ($selectors as $selector) { + if (!isset($return[$selector])) { + return NULL; + } if (!empty($selector) || $selector === '0') { $return = $return[$selector]; }
With that change, it's no longer possible to use the Json data parser without setting an item selector. This is because the 'explode' method will return an array with one empty element if an empty item selector is passed to it, which causes the new check to fail 100% of the time and returns null. I've changed that logic to what's below, which I believe solves that problem and keeps the original intended functionality.
// Otherwise, we're using xpath-like selectors. $return = $this->sourceData; if (!empty($item_selector)) { $selectors = explode('/', trim($item_selector, '/')); foreach ($selectors as $selector) { if (!isset($return[$selector])) { return NULL; } $return = $return[$selector]; } } elseif (sizeOf($return) === 0) { return NULL; }
The last submitted patch, 125: migrate_plus-support-paging-2640516-125.patch, failed testing. View results โ
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.- Status changed to Needs work
about 1 year ago 11:04pm 8 November 2023 - ๐บ๐ธUnited States nadavoid
Is there a way to use this patch for the paging scenario I have? The things in particular that I'm not sure how to solve:
- The token for the next page is unpredictable; it needs to be read from the current page
- The identifier for the next page is only a token, not a full URL
My JSON source has this structure:
{ "response": { "docs": [ {"one", "two", "three"} ] }, "count": "329", "nextPageToken": "qwk9" }
The URL of the next page is the same as the current page, with the change of a URL param:
&pageToken=qwk9
I'm thinking that I'll need to update the patch or extend it using custom code, but wanted to check here first. If updating the patch is there an existing pager type that would be best to model this on? These are the pager types that I see recognized in the patch so far:
- urls
- cursor
- page
- paginator
- ๐บ๐ธUnited States nadavoid
Commenting with what I understand to be the purpose of some of the pager options for the JSON plugin.
selector: The path to the JSON item identifying the next page.
type
urls
: selector points to one or more URLs.cursor
: selector is just a portion of the next page URL. Set `key` to the URL parameter key that should be passed.page
: selector is a current page number, and expected to increment: 1, 2, 3, etc.paginator
: not sure
key: Parameter key in the URL, will be given a value that identifies the page to request. If key is set to "page", then the resulting generated URL will contain "?page=[value-from-current-json]"
More parameters to document:
size_key
page_key
selector_max
default_num_items
paginator_type - ๐บ๐ธUnited States jacobbell84
Fixing some incorrect variable names (A couple spots were still using $this->itemSelector instead of $item_selector) and fixing another edge case where the item_selector is blank.
- last update
about 1 year ago 187 pass, 3 fail - ๐ช๐ธSpain guiu.rocafort.ferrer Barcelona
guiu.rocafort.ferrer โ made their first commit to this issueโs fork.
- Open on Drupal.org โCore: 9.5.x + Environment: PHP 8.1 & MySQL 8last update
about 1 year ago Not currently mergeable. - Open on Drupal.org โCore: 9.5.x + Environment: PHP 8.1 & MySQL 8last update
about 1 year ago Not currently mergeable. - Open on Drupal.org โCore: 9.5.x + Environment: PHP 8.1 & MySQL 8last update
about 1 year ago Not currently mergeable. - Open on Drupal.org โCore: 9.5.x + Environment: PHP 8.1 & MySQL 8last update
about 1 year ago Not currently mergeable. - Open on Drupal.org โCore: 9.5.x + Environment: PHP 8.1 & MySQL 8last update
about 1 year ago Not currently mergeable. - last update
about 1 year ago 195 pass - ๐ช๐ธSpain guiu.rocafort.ferrer Barcelona
I updated the issue fork with the latest 6.0.x branch, so it is mergeable now. I also made a few changes so the Gitlab CI is available and runs de tests, and also fixed some of the tests.
Some of the changes i made, which might be arguable, are:
- When the item_selector does not exist, the getSourceData returns now an empty array instead of null. This way, the plugin acts as there is no rows to be imported and does nothing.
- When the item_selector is NULL or it is not defined, it is set to '' by default. In that case i believe the getSourceData method should return the whole sourceData contents.
- One of the tests failed because the $item_selector parameter for getSourceData was set to type string, but it was passing an integer value. Inside the function, it checked if the parameter is_numeric, to apply the backwards compatibility for depth selection. I changed the method definition to accept also integer values.
My plan is to next write some tests for the pager functionality, and go from there to fix potential issues and make sure it works ok, before merging into the main branch.
- last update
about 1 year ago Unable to generate test groups - last update
about 1 year ago Unable to generate test groups - ๐ช๐ธSpain guiu.rocafort.ferrer Barcelona
The last commit adds test cases for the pager types urls, cursor, and page. The tests for paginator are still missing thought.
I also expanded a little bit the documentation on the pager types here: https://www.drupal.org/docs/8/api/migrate-api/migrate-source-plugins/mig... โ
- last update
about 1 year ago CI aborted - last update
about 1 year ago 207 pass - Status changed to Needs review
about 1 year ago 10:21am 28 December 2023 - last update
about 1 year ago Patch Failed to Apply - ๐ช๐ธSpain guiu.rocafort.ferrer Barcelona
I am changing the status to "Needs review", since i wrote tests for all the pager types, and also updated the documentation to reflect the new functionality.
Documentation: https://www.drupal.org/docs/8/api/migrate-api/migrate-source-plugins/mig... โ
- ๐บ๐ธUnited States nadavoid
The issue fork has merge conflicts against the main branch, with the recent update of migrate_plus to 6.0.2. The latest patches also no longer apply.
@guiu.rocafort.ferrer Could you merge the latest into your issue fork and resolve conflicts? I'll be happy to test after that, since I'm actively using this, and will continue to for quite a while.
- ๐ช๐ธSpain guiu.rocafort.ferrer Barcelona
Hi @nadavoid, the merge request should now be good to go.
- Status changed to RTBC
about 1 year ago 4:02pm 19 January 2024 - ๐บ๐ธUnited States nadavoid
Thanks for the quick update @guiu.rocafort.ferrer! I really appreciate it. I've successfully applied the patch from the MR on 6.0.2 and confirmed that the cursor pagination still works in my installation.
The code looks good good to me, and the tests seem comprehensive. So I'm marking this RTBC. Others who have been deeper in this issue are of course also welcome to review. I'm strongly in favor of merging soon, and if adjustments are needed or bugs are found, they can be handled in smaller followup issues.
I'm also uploading the current version of the patch so that people have something stable to use in their builds today.
- ๐ฌ๐งUnited Kingdom longwave UK
I'm using this with the
paginator
pager type and while it works well, it is repeating HTTP requests - and my source is relatively slow to respond, so this is slowing down my migrations.If I add logging to
Http::getResponse()
I see that requests are made up to three times for the same URL:get https://[redacted]?size=500... done get https://[redacted]?size=500&offset=500... done get https://[redacted]?size=500... done get https://[redacted]?size=500&offset=500... done get https://[redacted]?size=500&offset=1000... done get https://[redacted]?size=500&offset=500... done get https://[redacted]?size=500&offset=1000... done get https://[redacted]?size=500&offset=1500... done get https://[redacted]?size=500&offset=1000... done
I hate to say this at this stage but I wonder if the different pagers need to be their own type of plugin? That would mean they could be more easily extended for different cases.
- ๐ช๐ธSpain guiu.rocafort.ferrer Barcelona
Hi @longwave, I understand your concerns about the performance issues, and the redundant http requests, i do believe too that this is not optimal, and there is room for improve.
This issue have been opened since 2015, and it has been difficult to push it forward and make it to merge with the development branch, so i am worried that addressing those issues would delay even more. I personally have a few sites that make use of the patch for a while now, and i believe some other people might be in the same situation.
So i believe it would be better to merge the issue into development, and then create a follow-up issue to improve the performance situation.
What do the other think about this ? - ๐ฌ๐งUnited Kingdom longwave UK
That decision is up to the Migrate Plus maintainers, I think it would be fine to defer to followups given this fixes some immediate issues and both removing the additional requests and refactoring to use plugins could be done separately.
- ๐ฉ๐ฐDenmark googletorp
I've used this in the past and compared to the alternative (which is to fetch everything in one go) this is a major improvement. This is also an opt-in feature, so while making a lot of requests can cause problems, it's something the developers using this can determine if it's a problem or not. From what I understand if you're not using the feature we won't have any performance change.
Maybe it's better to get it in, solve a lot of people's problems and get real life feedback on which performance issues would make sense to address.
I think this is RTBC.
- heddn Nicaragua
This is likely one of the longest running requests in the migrate plus queue. Let's land it and make incremental improvements on what we have here.
-
heddn โ
committed b79a5735 on 6.0.x authored by
jcandan โ
Issue #2640516 by jcandan, m.lebedev, berenddeboer, scott_euser, segi,...
-
heddn โ
committed b79a5735 on 6.0.x authored by
jcandan โ
- Status changed to Fixed
6 months ago 7:44pm 18 July 2024 Automatically closed - issue fixed for 2 weeks with no activity.
- ๐ฉ๐ฐDenmark ressa Copenhagen
Yes, thanks everyone for landing this feature with great tenacity. This is one of the things which makes so Drupal so great -- while an issue may take some time, many will eventually be resolved, and make Drupal even greater.
- ๐บ๐ฆUkraine init90 Ukraine
I've created an issue to address one of the points from #139 - ๐ Omit an extra request when the pager is used Active