Thanks for the patch @kasperg. I tested locally running both CLI and web requests and it works as expected.
I'm getting "MySQL server has gone away" errors after long-running drush operations like Search API indexing and running migrations with the lateruntime processor enabled. Those operations can put thousands of cache tags in the queue, and it seems like it tries to process many of them at the end of the request, or at the end of batches, which results in a packet size greater than max_allowed_packet. My max_allowed_packet setting is already quite high and I don't have much room to increase it, so I agree with the other comments that it would be better to at least have the option to disable this for CLI. Thanks for the MR @Kasperg
Thanks for creating the module! I added glossary support!
All the tests now seem to pass again except for this failure on the PHPUnit (next major) pipeline:
1) Drupal\Tests\redirect\FunctionalJavascript\RedirectJavascriptTest::testRedirectUI
Behat\Mink\Exception\ExpectationException: The string "Are you sure you want to delete the URL redirect from /web/non-existing?key=value to /web/node?" was not found anywhere in the HTML response of the current page.
However, looking at the pipelines from other commits, it looks like it's unrelated to my code.
Ran the test-only changes to confirm the test catches the bug. Looks like fixing this has caused knock-on effects in other tests, though. I'll look into it.
Closing this as it can be prevented by altering the langcode in hook_pathauto_alias_alter rather than hook_path_alias_presave. Presave happens too late, but the alter hook fires before the alias is saved, so it prevents the duplicate.
Also, I guess the core path_alias module should really be the one preventing duplicates from being saved.
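For reference, a minimal sketch of the alter approach described above (the condition and the target langcode are just examples; see pathauto.api.php for the full $context structure):

<?php

use Drupal\Core\Language\LanguageInterface;

/**
 * Implements hook_pathauto_alias_alter().
 *
 * Sketch only: adjusts the alias language before Pathauto saves it, so the
 * duplicate alias is never created. Adapt the condition and langcode to your
 * own duplicate scenario.
 */
function mymodule_pathauto_alias_alter(&$alias, array &$context) {
  if ($context['module'] === 'node') {
    // Example: save node aliases as language-neutral to avoid per-language
    // duplicates.
    $context['language'] = LanguageInterface::LANGCODE_NOT_SPECIFIED;
  }
}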
Looks like this patch was on the right path in #95 and then veered away from it. I strongly agree with #95 that we should be translating entities rather than changing their langcode, as that is destructive and would confuse editors. I believe the fix in #52 might have been trying to address that, but it may be redundant now. Also, it only prevents deleting entities used by the parent node, which means they could still be deleted if referenced by other entities. This MR is heavily based on #95.
Changes in MR 120:
1) Inline entities are translated rather than having their language changed.
2) Re-attached the submit handler from #95 that was removed and handled the case where the main entity form is submitted but the inline entity form hasn't been opened yet.
3) Removed the deletion code that was trying to prevent entities being removed if referenced by the parent entity (I think that is probably better handled by something like entity_reference_integrity module). See above for reasons.
4) Translations get their own created/updated timestamps when they are created rather than inheriting from the default translation. This helps them show up in the /admin/content view when the parent page is translated.
5) Renamed the testTranslation test to testSymmetricTranslation and added cases for translations and timestamps. Also created a new testAsymmetricTranslation test covering the asymmetric behaviour, which is more thorough than the previous coverage.
Feedback welcome.
Regarding the comment in #127, the line number you indicated for the error doesn't match up with my MR, so I don't think it's related. Anyway, I'm working on MR 120 now.
I thought this was an issue with the module more generally, but it turns out it only happens when using the asymmetric translations patch, so I will close this and look into it over there.
Just to add some further steps to reproduce: you need to uncheck "Include the current page as a segment in the breadcrumb" and "Use the real page title when available", which are turned on by default. Managed to reproduce it on a fresh simplytest.me install.
The latest commit of https://git.drupalcode.org/issue/inline_entity_form-2822764/-/tree/2822764-support-adding-new-entities-when-translating applies cleanly to 3.0.0-rc20 as a patch. I made a small change to remove @label from the translatable markup, since it was never actually getting replaced.
The failing TranslationTest needs to be fixed, and it seems to me that this patch has unintentionally changed the behavior even when "allow asymmetric translations" isn't enabled, as the test fails on line 108, which is well before we enable it:
// Both inline nodes should now be in English.
$first_inline_node = $this->drupalGetNodeByTitle('Kann ein Känguru höher als ein Haus springen?');
$second_inline_node = $this->drupalGetNodeByTitle('Can a kangaroo jump higher than a house?');
$this->assertSame('en', $first_inline_node->get('langcode')->value, 'The first inline entity has the correct langcode.');
$this->assertEquals('en', $second_inline_node->get('langcode')->value, 'The second inline entity has the correct langcode.');
Seeing as this is still occurring, I re-opened the issue and opened an MR. Added the code in the AliasStorageHelper class rather than PathautoGenerator.
I had a look at the linked issues and tried some patches, but they didn't stop this issue from occurring. I also downloaded the latest version of this module and it's still occurring. Where was this fixed? (Using core 10.2)
The layout issue mentioned in #10 is fixed in https://www.drupal.org/project/drupal/issues/3341669 🐛 Dropbuttons and regular buttons are misaligned Fixed
Even though workbench_moderation does cause this with its long button labels, I feel like it should be fixed in core, as I don't see a reason why the list items in the dropbutton shouldn't be at least as wide as the parent item. I also found on mobile that even the main button gets cut off when the text is too long (see screenshot). Shouldn't it wrap?
Hi @mikelutz
For my personal use case I don't actually need to catch an exception when a row is skipped; I'm more interested in when the migration fails. The only reason I included it is that the original issue had it in scope. I would happily reduce the scope and remove the changes to row skipping if that would help simplify things?
Still not sure why it only failed in CI and not locally, but I changed the test to check just the exception name rather than the whole object, since the serialization error was only happening in the test's state recording. Tests are passing now.
Setting back to needs review as I found a small edge case after the type hinting changes when the OS mappings are not set.
Have updated the code based on feedback. Looks like that error was unrelated to my code, but I was able to trace it to the search_api_item table not being installed in the SpellCheckTest, so I added it there.
I took a stab at this with MR #52.
It retrieves the mappings from the OpenSearch server and compares them with the local Drupal mappings after event subscribers have fired.
I don't think comparing the original index fields to the new ones will work, because the modified mappings after event subscribers have fired aren't stored anywhere, so there is nothing to compare against.
The other thing to be aware of is dynamic mappings in OpenSearch. If you don't explicitly provide a type, OpenSearch will guess one based on the data that is indexed. That means that every time you click save, even if nothing has changed, it will clear the index (because there are no local mappings for those fields). This is mostly a problem if you create fields in processors etc., and the fix is to use the IndexParamsEvent to explicitly give your custom fields the correct types. I have also found that if you give a Drupal field the wrong type in Search API (for example a string type for a field that is actually a float), OpenSearch will override it with the float type in the mappings once data has been indexed. That also causes a clear rather than a re-index, so you need to be careful to assign the correct type.
This doesn't tackle settings (analysers etc.), only field mappings, and it doesn't negate the need for a blue/green deployment, but it should definitely help reduce the number of times you need to switch indexes.
Same as #13: when using the patch at #15, which uses getUserInput, the blockForm method is called twice on load. The first time the defaultConfiguration is correct; the second time all the defaultConfiguration values are null. When I switch getUserInput to getValues like #6, it is only called once and the defaultConfiguration is correct.
This is possible using rank_feature queries in OpenSearch, and it is implemented in search_api_opensearch. See https://git.drupalcode.org/project/search_api_opensearch/-/commit/fa23b3e5f6d24d39d126ab0ad40b974997e16ae4
A search_api_boost value is created for each document, and you can use a rank_feature query on that field to boost results.
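For reference, the query side looks roughly like this as a request body built as a PHP array (the match clause and boost value are just examples):

// Example search request body for the opensearch-php client. The rank_feature
// clause boosts documents by their per-document search_api_boost value.
$body = [
  'query' => [
    'bool' => [
      'must' => [
        ['match' => ['title' => 'kangaroo']],
      ],
      'should' => [
        ['rank_feature' => ['field' => 'search_api_boost', 'boost' => 2.0]],
      ],
    ],
  ],
];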
I saw this as well.
Steps to reproduce:
1. Enable PHP8.2 on your environment.
2. Switch to a branch running core 10.2 (where enforceDefaultTranslation is added to ContentEntityBase).
3. Clear cache (no warnings).
4. Switch to a branch running core 10.1 (no enforceDefaultTranslation).
5. Clear cache (warnings).
But I don't think this is an issue with this module.
I've converted those annotations to PHP attributes. Unfortunately, after merging in the latest code from 11.x there is a test failure in testRowSkippedEvent; however, the error message is obscured by the annoying "Exception: Serialization of 'Closure' is not allowed" message, which doesn't actually show what's wrong. When I run MigrateEventsTest.php locally I don't get any errors, so this one is hard to debug. Apparently this is quite a common issue in Migrate tests (see related issue).
I do have a stack trace:
/builds/issue/drupal-2756269/core/lib/Drupal/Core/Cache/MemoryBackend.php:119
/builds/issue/drupal-2756269/core/lib/Drupal/Core/Cache/CacheCollector.php:275
/builds/issue/drupal-2756269/core/lib/Drupal/Core/State/State.php:102
/builds/issue/drupal-2756269/core/modules/migrate/tests/src/Kernel/MigrateEventsTest.php:481
/builds/issue/drupal-2756269/vendor/symfony/event-dispatcher/EventDispatcher.php:206
/builds/issue/drupal-2756269/vendor/symfony/event-dispatcher/EventDispatcher.php:56
/builds/issue/drupal-2756269/core/modules/migrate/src/MigrateExecutable.php:247
/builds/issue/drupal-2756269/core/modules/migrate/tests/src/Kernel/MigrateEventsTest.php:354
That shows that it is trying to serialize the cached state data (which would contain an exception). My guess is that the exception data has a closure in it somewhere, but as I said, it's very hard to debug when I can't reproduce it locally.
Hopefully someone can help pinpoint the issue!
achap → created an issue.
Just to remove any ambiguity, I updated the title to reflect that this is an issue with the 8302 update hook, not 8301. I think that was entered in error.
From looking at the committed code in that other issue, I don't see how it can be related to or fix this issue: the modified code only affects the password_policy_history table, while this primarily affects user__field_pending_expire_sent. Unless the conclusion is that the error occurs because DB vendors don't follow ANSI SQL, as suggested in #16? My website that was affected by this is hosted on Acquia, if that helps.
Kristen Pol → credited achap → .
Just ran into another error code from our remote service (429), so I thought I would attempt to make this a little more flexible rather than just logging specific error codes. Unfortunately, I couldn't find much documentation on OpenSearch error responses, so it's a little hard to test.
This error response from OpenSearch:
Array
(
    [_index] => index_a
    [_id] => entity:node/152691:en-AU
    [status] => 429
    [error] => Array
        (
            [type] => status_exception
            [reason] => Error from remote service: {
                "error": {
                    "message": "You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.",
                    "type": "insufficient_quota",
                    "param": null,
                    "code": "insufficient_quota"
                }
            }
        )
)
Ends up looking like this in the logs:
Error from remote service: { "error": { "message": "Incorrect API key provided: foobar. You can find your API key at https://platform.openai.com/account/api-keys.", "type": "invalid_request_error", "param": null, "code": "invalid_api_key" } } for id: entity:node/189871:en-AU. Status code: 401
I updated the log message to include the status code, and also changed "for index" to "for id", since it is logging the ID of the item, not the index it's in.
I'm getting some composer errors in my pipeline, but I think they're unrelated to this change:
$ composer install $COMPOSER_EXTRA
No composer.lock file present. Updating dependencies to latest instead of installing from lock file. See https://getcomposer.org/install for more information.
Loading composer repositories with package information
Updating dependencies
Your requirements could not be resolved to an installable set of packages.
Problem 1
- drupal/core 10.2.x-dev conflicts with drush/drush <12.4.3.
- drupal/core-recommended 10.2.x-dev requires drupal/core 10.2.x-dev -> satisfiable by drupal/core[10.2.x-dev].
- Root composer.json requires drupal/core-recommended 10.2.x-dev -> satisfiable by drupal/core-recommended[10.2.x-dev].
- Root composer.json requires drush/drush ^11.0 -> satisfiable by drush/drush[11.0.0-rc1, ..., 11.x-dev].
@jweowu this occurred for us on 9.5. I have run into this issue before, though, and it happened when I was installing field config programmatically in update hooks, which, glancing at the update hook, is what happens inside 8302:
if (!isset($sandbox['progress'])) {
  /** @var \Drupal\Core\Config\ConfigInstallerInterface $configInstaller */
  $configInstaller = \Drupal::service('config.installer');
  $configInstaller->installDefaultConfig('module', 'password_policy');
  $sandbox['progress'] = 0;
  $sandbox['max'] = $db_connection->query('SELECT COUNT(uid) FROM {users}')->fetchField();
  $sandbox['current_user'] = -1;
}
So I think you're right that this might be a core issue... but I can't help with much more than that unfortunately.
However, what has fixed it for me in the past is just resaving the field storage definition without changing it. That makes the error go away:
$update_manager = \Drupal::entityDefinitionUpdateManager();
$field_storage_definition = $update_manager->getFieldStorageDefinition(
  'field_pending_expire_sent',
  'user'
);
$update_manager->updateFieldStorageDefinition($field_storage_definition);
Have added a change record, made some adjustments based on the comments, and made the getRow method's type hints and annotations consistent with the rest of the code. Will wait and see regarding the annotations issue.
Just ran into this as well with queues and the static cache in the State API. A workaround for those who need it is to use the keyvalue factory service directly, which is what the State API is built on top of and which doesn't use a static cache. Agree with #5 that it would be good to have a way of busting the cache for a particular key via State.
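For anyone else hitting this, the workaround looks roughly like this ('mymodule.last_run' is just an example key; 'state' is the key/value collection core's State API stores its values in):

// State API: statically cached per process, so a long-running worker can keep
// seeing a stale value.
$last_run = \Drupal::state()->get('mymodule.last_run');

// Workaround: read the same value through the key/value factory, which State
// is built on and which is not statically cached.
$last_run = \Drupal::service('keyvalue')->get('state')->get('mymodule.last_run');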
6 years later and a need for this came up again.
I think I have addressed most of the issues from #19. I added additional tests for everywhere the migration fails and ensured an event is dispatched. In addition, I passed the original exception to the event, since in a few cases MigrateExecutable catches the generic \Exception, which means it will catch all exceptions and give classes further up the stack no chance to react. In my case, a source plugin can throw a SuspendQueueException, but this was being caught by MigrateExecutable rather than the queue runner, so I needed to re-throw it in my event subscriber.
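To illustrate the re-throwing, here is roughly what my subscriber does. The event name and the getException() accessor are what this MR introduces, so treat them as provisional, not an existing core API:

<?php

namespace Drupal\mymodule\EventSubscriber;

use Drupal\Core\Queue\SuspendQueueException;
use Symfony\Component\EventDispatcher\EventSubscriberInterface;

/**
 * Re-throws SuspendQueueException so the queue runner can react to it.
 *
 * Sketch only: 'migrate.migration_failed' and getException() are provisional
 * names from this MR.
 */
class MigrationFailureSubscriber implements EventSubscriberInterface {

  public static function getSubscribedEvents(): array {
    return ['migrate.migration_failed' => 'onMigrationFailed'];
  }

  public function onMigrationFailed($event): void {
    $exception = $event->getException();
    if ($exception instanceof SuspendQueueException) {
      // Don't let MigrateExecutable swallow this; the queue runner needs it.
      throw $exception;
    }
  }

}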
Removed the additional ideas from the issue summary as they are outside the scope of the core issue. I guess they can be opened in new issues if they're still wanted.
Thanks for the MR. Checking whether fields have changed does seem like a good idea, but I think a clear rather than a re-index will be necessary, as alluded to in previous comments. From the OpenSearch docs:
If you want to create or add mappings and fields to an index, you can use the put mapping API operation. For an existing mapping, this operation updates the mapping.
You can’t use this operation to update mappings that already map to existing data in the index. You must first create a new index with your desired mappings, and then use the reindex API operation to map all the documents from your old index to the new index. If you don’t want any downtime while you re-index your indexes, you can use aliases.
See docs: https://opensearch.org/docs/latest/api-reference/index-apis/put-mapping/
So if we could detect that a field doesn't exist yet, we could use the re-index flag, and if it does already exist, we would need to clear. However, that could get complicated if one field is added and another changed, etc.
So just checking which fields exist in the new index vs the original, and only clearing when needed, already seems like an improvement. However, I think the clear would need to come before the calls to updateSettings and updateFieldMapping.
This also only occurred for me once I installed webprofiler; hook_event_dispatcher was working fine before then. Given that this is the case for others as well, maybe this issue should be closed in favour of the related one in tracer?
Also, I'm not entirely sure whether the test failures are due to my new code, because the tests were already failing on the upstream branch.
Based on my understanding, that hook is called after all the batches have been queued for processing, so it's too late for my use case. I want to prevent them from being included in the queue in the first place, as in my case removing them gets rid of about 20,000 nodes that would need to be processed.
Although dispatching the event in EntityUpdateManager is probably not necessary, as in that case the event would already have been fired if we are coming from a batch process, or we are just saving an individual node. So in the latter case I could just use the hook.
I think the module will still have issues with subdomain languages as it just removes the www during comparison but this should fix the language prefix issue.
Just want to clarify my comments from #7: it should work if you pass the --uri parameter when running via drush. Cron I'm less certain about, but I imagine you can set the correct value somewhere.
achap → created an issue.
Discovered another issue with the derivative approach: it creates a mapping table for each derivative migration, so I think that also won't work, as I would end up with thousands of new tables...
So I quickly realized that this wasn't going to work, or would at least be very difficult, and the way I was going about it was sending me down a rabbit hole. Stepping back, I realized that I can achieve the same result by using the deriver key in my migration and passing the product IDs I need to generate migration plugins for via the MemoryCache service. That way, a migration plugin is created for each product ID, which can be added to a queue. The queue can then process migrations concurrently!
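Roughly what the deriver looks like (module name, source key, and the hard-coded IDs are placeholders; in practice the IDs come from whatever shared storage you use, e.g. a memory cache):

<?php

namespace Drupal\mymodule\Plugin\Deriver;

use Drupal\Component\Plugin\Derivative\DeriverBase;

/**
 * Creates one migration plugin derivative per product ID.
 *
 * Sketch only: the IDs are hard-coded here to show the shape.
 */
class ProductMigrationDeriver extends DeriverBase {

  public function getDerivativeDefinitions($base_plugin_definition) {
    $product_ids = [101, 102, 103];
    foreach ($product_ids as $product_id) {
      $definition = $base_plugin_definition;
      // Limit the source to a single product so each derivative can be queued
      // and run independently. 'product_id' is a hypothetical source key.
      $definition['source']['product_id'] = $product_id;
      $this->derivatives[$product_id] = $definition;
    }
    return $this->derivatives;
  }

}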
While this approach allows migrations to run concurrently, the status becomes useless, because the value is stored in key/value storage with the ID of the migration as the key. So even if you instantiated a migration for each individual item, the status would just keep being overwritten and not be very helpful. I wonder if there is a way to have a UUID for each instance returned by createInstance?
In the Migration plugin class these two methods use the plugin ID:
/**
 * {@inheritdoc}
 */
public function setStatus($status) {
  \Drupal::keyValue('migrate_status')->set($this->id(), $status);
}

/**
 * {@inheritdoc}
 */
public function getStatus() {
  return \Drupal::keyValue('migrate_status')->get($this->id(), static::STATUS_IDLE);
}
Just wanted to say I had the exact same issue as tallytarik. For the most part, everything being an array did not affect anything, apart from when implementing a knn_vector field: it throws the same error for a multi-value field. I used the IndexParamsEvent to alter the field to be single-value, and it works.
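Roughly what that subscriber looks like. The getParams()/setParams() accessors, the request body layout, and the field name are assumptions on my part, so check the actual IndexParamsEvent class before copying this:

<?php

namespace Drupal\mymodule\EventSubscriber;

use Drupal\search_api_opensearch\Event\IndexParamsEvent;
use Symfony\Component\EventDispatcher\EventSubscriberInterface;

/**
 * Flattens a multi-value vector field to a single value before indexing.
 *
 * Sketch only: the accessors and the params layout are assumptions.
 */
class SingleValueVectorSubscriber implements EventSubscriberInterface {

  public static function getSubscribedEvents(): array {
    return [IndexParamsEvent::class => 'onIndexParams'];
  }

  public function onIndexParams(IndexParamsEvent $event): void {
    // Assumed accessor.
    $params = $event->getParams();
    // Keep only the first value of the (hypothetical) vector field for each
    // document in the request body.
    foreach ($params['body'] as &$doc) {
      if (is_array($doc) && isset($doc['my_vector_field'][0]) && is_array($doc['my_vector_field'][0])) {
        $doc['my_vector_field'] = $doc['my_vector_field'][0];
      }
    }
    unset($doc);
    // Assumed accessor.
    $event->setParams($params);
  }

}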
Our workaround for the above was to re-architect our migration using the Queue API to process one item at a time, and to give our queue worker a cron lease time of one hour (the same as our cron run interval). This way the buffer is emptied at least once per hour and doesn't overwhelm the purge queue. Hope that helps someone.
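The relevant piece is just the queue worker's cron time; a minimal sketch (plugin ID, class, and namespace are placeholders):

<?php

namespace Drupal\mymodule\Plugin\QueueWorker;

use Drupal\Core\Queue\QueueWorkerBase;

/**
 * Processes a single migration item per claim.
 *
 * The cron "time" below is what gives the worker its one-hour window, so the
 * purge buffer gets flushed (via the destructor) at least once per cron run.
 *
 * @QueueWorker(
 *   id = "mymodule_migration_item",
 *   title = @Translation("Migration item worker"),
 *   cron = {"time" = 3600}
 * )
 */
class MigrationItemWorker extends QueueWorkerBase {

  public function processItem($data) {
    // Import a single source row here (sketch only).
  }

}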
Just want to chime in and say this issue has affected me too when running migrations. We often run migrations that take a few hours or more. I stepped through the code and think I discovered what's happening: in Drupal\purge\Plugin\Purge\Queue\QueueService::add, invalidation tags are not added to the queue straight away but rather to an internal buffer. Then at the end of the request (for example a long-running cron job or drush script), in Drupal\purge\Plugin\Purge\Queue\QueueService::destruct, the items from the buffer are finally committed to the queue.
The problem is that during the whole time the migration is running, none of the invalidation tags it generates can be processed by any of the purge processors. They are all dumped at once at the end of the migration, which usually results in going over the 100k limit.
Not sure what the fix is, but that seems to be the root cause of the issue, at least for us.
No worries I will move this patch into our own codebase :)
Sorry for not replying, got a bit sidetracked :D I've been using this patch in production without issues for a while now. In terms of which one should be the default, I guess something to consider is index size and performance. I don't have any hard data to back this up, but I'd guess tokenizing every character is a lot more expensive than every word. So maybe because of that, and to preserve backwards compatibility, it makes sense to keep the filter as the default and add the tokenizer as a new plugin?
Adds some basic validation to check whether the value is a Point. Not sure this handles all cases, but it does fix the use case I came across.
Was missing closing parentheses...
It's possible that items were removed even if the queue stayed the same size or new items were added. It would probably be better to compare the original items with the new items: dispatch a removed event with any items that are in the old list but not the new one, and an added event with any items that are in the new list but not the old one.
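Something along these lines is what I mean (the event classes are placeholders, not an existing API):

<?php

use Symfony\Contracts\EventDispatcher\EventDispatcherInterface;

/**
 * Dispatches removed/added events by diffing the old and new item lists.
 *
 * Sketch only: ItemsRemovedEvent and ItemsAddedEvent are placeholder classes.
 */
function dispatch_queue_changes(array $old_items, array $new_items, EventDispatcherInterface $dispatcher): void {
  // Compare by serialized value so arrays and objects diff predictably.
  $compare = fn ($a, $b) => strcmp(serialize($a), serialize($b));

  $removed = array_udiff($old_items, $new_items, $compare);
  $added = array_udiff($new_items, $old_items, $compare);

  if ($removed) {
    $dispatcher->dispatch(new ItemsRemovedEvent(array_values($removed)));
  }
  if ($added) {
    $dispatcher->dispatch(new ItemsAddedEvent(array_values($added)));
  }
}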
I see what you mean looking at the code now. I guess the same would be true for the pushImport feed handler, as it uses the Queue API and would have the same lease times. Does that mean I should remove the changes above for cronImport and pushImport, or should they be left as is?
I'm only using the import (via drush) and startBatchImport methods in my project, so I haven't tested the other methods.
I'm back on the project I had this issue on again (yearly festival site) and have solved my own problem using the Feeds parse event to filter out missing resources. See: https://www.drupal.org/docs/contributed-modules/feeds/feeds-howtos/altering-feed-data →
Regarding #26, happy to re-open the issue, but I still think my solution isn't the best one because it's either on or off. So it would solve your issue, but it would stop entity references that get imported after the main feed from ever getting referenced. So I think we can do better. With that said, it is opt-in...
Would be interested to know how the Migrate API approaches this.
I have previously played around with those settings on the edge n-gram field before using a custom tokenizer and they didn't appear to do anything, but I haven't had a chance to try them for search_as_you_type yet. I imagine it's caused by the same issue, i.e. search_as_you_type is probably using the standard tokenizer, which splits tokens on word boundaries rather than individual characters.
This SO question appears to solve it in the same way for the search_as_you_type implementation (implementing an edge n-gram tokenizer) https://stackoverflow.com/questions/59677406/how-do-i-get-elasticsearch-to-highlight-a-partial-word-from-a-search-as-you-type
The Elasticsearch tokenizer documentation (https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-tokenizers.html#analysis-tokenizers) says that a tokenizer is, among other things, responsible for recording:
- Order or position of each term (used for phrase and word proximity queries)
- Start and end character offsets of the original word which the term represents (used for highlighting search snippets).
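For context, these token dumps look like _analyze API output; through the PHP client such a call looks roughly like this (index and field names are examples, and the exact client method may differ):

// Sketch of an _analyze call via the opensearch-php client; $client is an
// already-built \OpenSearch\Client. Index and field names are examples.
$response = $client->indices()->analyze([
  'index' => 'my_index',
  'body' => [
    'field' => 'title',
    'text' => 'This is a title',
  ],
]);
print_r($response['tokens']);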
If I analyze a title field that is using the custom edge ngram tokenizer I get the following token information for the sentence "This is a title":
{
"tokens" : [
{
"token" : "t",
"start_offset" : 0,
"end_offset" : 1,
"type" : "word",
"position" : 0
},
{
"token" : "th",
"start_offset" : 0,
"end_offset" : 2,
"type" : "word",
"position" : 1
},
{
"token" : "thi",
"start_offset" : 0,
"end_offset" : 3,
"type" : "word",
"position" : 2
},
{
"token" : "this",
"start_offset" : 0,
"end_offset" : 4,
"type" : "word",
"position" : 3
},
{
"token" : "i",
"start_offset" : 5,
"end_offset" : 6,
"type" : "word",
"position" : 4
},
{
"token" : "is",
"start_offset" : 5,
"end_offset" : 7,
"type" : "word",
"position" : 5
},
{
"token" : "a",
"start_offset" : 8,
"end_offset" : 9,
"type" : "word",
"position" : 6
},
{
"token" : "t",
"start_offset" : 10,
"end_offset" : 11,
"type" : "word",
"position" : 7
},
{
"token" : "ti",
"start_offset" : 10,
"end_offset" : 12,
"type" : "word",
"position" : 8
},
{
"token" : "tit",
"start_offset" : 10,
"end_offset" : 13,
"type" : "word",
"position" : 9
},
{
"token" : "titl",
"start_offset" : 10,
"end_offset" : 14,
"type" : "word",
"position" : 10
},
{
"token" : "title",
"start_offset" : 10,
"end_offset" : 15,
"type" : "word",
"position" : 11
}
]
}
If I analyze a search_as_you_type field I get the following information:
{
"tokens" : [
{
"token" : "this",
"start_offset" : 0,
"end_offset" : 4,
"type" : "<ALPHANUM>",
"position" : 0
},
{
"token" : "is",
"start_offset" : 5,
"end_offset" : 7,
"type" : "<ALPHANUM>",
"position" : 1
},
{
"token" : "a",
"start_offset" : 8,
"end_offset" : 9,
"type" : "<ALPHANUM>",
"position" : 2
},
{
"token" : "title",
"start_offset" : 10,
"end_offset" : 15,
"type" : "<ALPHANUM>",
"position" : 3
}
]
}
So if the offset information is used for highlighting that explains why only the edge_ngram_tokenizer is working as expected.
Switching from filter to tokenizer is working for me with Edge N-gram filters. I guess the two plugins can co-exist?
Thanks for putting that together. From what I'm seeing, it actually has the same issue as the original edge n-gram implementation, i.e. it's highlighting the entire word rather than the n-grams themselves. Not sure why that is, based on the docs: https://opensearch.org/docs/latest/search-plugins/searching-data/highlight/
One workaround is to create your own plugin, similar to the Session one, that returns NULL from the getLangcode method if it's an admin path and then falls back to either the URL plugin or the Content Language plugin. This at least makes nodes editable in other languages. But the Session plugin should handle this on its own.
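A rough sketch of that workaround plugin (plugin ID, weight, and name are placeholders; the admin-path check uses core's router.admin_context service):

<?php

namespace Drupal\mymodule\Plugin\LanguageNegotiation;

use Drupal\language\Plugin\LanguageNegotiation\LanguageNegotiationSession;
use Symfony\Component\HttpFoundation\Request;

/**
 * Session negotiation that backs off on admin paths.
 *
 * Sketch only: returning NULL lets the next negotiation method (URL or
 * Content Language) provide the langcode instead.
 *
 * @LanguageNegotiation(
 *   id = "mymodule_session_no_admin",
 *   weight = -6,
 *   name = @Translation("Session (except admin paths)")
 * )
 */
class LanguageNegotiationSessionNoAdmin extends LanguageNegotiationSession {

  public function getLangcode(Request $request = NULL) {
    // Skip session-based negotiation on admin routes so content in other
    // languages stays editable.
    if (\Drupal::service('router.admin_context')->isAdminRoute()) {
      return NULL;
    }
    return parent::getLangcode($request);
  }

}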
For the comments in #40, #42 and #67: I agree that the original idea of the issue has been forgotten, so I created a new issue to deal specifically with the fact that you can't edit a node if only session negotiation is enabled: https://www.drupal.org/project/drupal/issues/3349579 🐛 Using only Session language negotiation method makes it impossible to edit content other than the current session language Active
achap → created an issue.
Yeah I can do it when I get some free time.
Thanks! I just tried to pull the latest 2.x, but the code from the issue's MR isn't in there.
Updated based on your feedback