After deleting a translated article, search wants to reindex it

Created on 11 April 2021, almost 4 years ago
Updated 22 July 2024, 6 months ago

Problem/Motivation

If you create a translation of an article in Ukrainian, for example, and then delete it, then on page /admin/config/search/pages we see the following: "There is 1 item left to index."

Running the cron does not solve the problem.

Result:
On page /admin/config/search/pages we see the following: "There is 1 item left to index."
If this is repeated for 100 articles, the indexing of the site will stop.

There is such an entry in the database:
SELECT * FROM `d8_search_dataset`

  • sid = 1
  • langcode = uk
  • type = node_search
  • data = "some text"
  • reindex = 1618155506

Steps to reproduce

1. Install Drupal 8.9.13 (Latest version for now)
2. Install modules (from core): Configuration Translation, Content Translation, Interface Translation,
Language
3. /admin/config/regional/language - Add the Ukrainian language, for example
4. /admin/structure/types/manage/article - "Enable translation"
5. create an article and add a translation to it.
6. Runs Cron
7. Delete article translation.
8. Runs Cron

Proposed resolution

In Node::postSave() execture a new private method nodeSearchRemoveDeletedTranslations that will remove deleted translations from the search index.

Remaining tasks

User interface changes

API changes

Data model changes

Release notes snippet

Problem/Motivation

๐Ÿ› Bug report
Status

Fixed

Version

10.4 โœจ

Component
Searchย  โ†’

Last updated 3 days ago

  • Maintained by
  • ๐Ÿ‡บ๐Ÿ‡ธUnited States @pwolanin
Created by

๐Ÿ‡บ๐Ÿ‡ฆUkraine orb

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • ๐Ÿ‡ณ๐Ÿ‡ฟNew Zealand quietone

    I tested on Drupal 11.x and, using the steps in the Issue Summary, I can confirm the problem still exists.

  • ๐Ÿ‡ท๐Ÿ‡บRussia zniki.ru

    zniki.ru โ†’ made their first commit to this issueโ€™s fork.

  • Merge request !5357Draft: Implement test case to show problem โ†’ (Open) created by zniki.ru
  • Pipeline finished with Failed
    about 1 year ago
    Total: 583s
    #48188
  • Status changed to Needs work about 1 year ago
  • ๐Ÿ‡ท๐Ÿ‡บRussia zniki.ru

    Provided tests to display described issue.
    I created draft MR.

  • Pipeline finished with Failed
    9 months ago
    Total: 175s
    #158491
  • First commit to issue fork.
  • Pipeline finished with Success
    9 months ago
    Total: 596s
    #158497
  • Status changed to Needs review 9 months ago
  • ๐Ÿ‡ฎ๐Ÿ‡ณIndia sukr_s

    I'm not seeing an option to remove the draft status from MR.

  • Merge request !780411.x โ†’ (Open) created by sukr_s
  • Pipeline finished with Failed
    9 months ago
    Total: 205s
    #158680
  • ๐Ÿ‡ฎ๐Ÿ‡ณIndia sukr_s

    sukr_s โ†’ changed the visibility of the branch 3208247-after-deleting-a to hidden.

  • Pipeline finished with Failed
    9 months ago
    Total: 176s
    #158685
  • Pipeline finished with Success
    9 months ago
    Total: 586s
    #158682
  • ๐Ÿ‡ฎ๐Ÿ‡ณIndia sukr_s

    unable to change the draft status in MR so reverted the changes and created a new mr.

  • Status changed to Needs work 9 months ago
  • ๐Ÿ‡บ๐Ÿ‡ธUnited States smustgrave

    Not sure a hook is the best place for this. There's NodeSearch.php shouldn't that leveraged to handle a deletion. Or does content_translation need a search plugin as well?

  • ๐Ÿ‡บ๐Ÿ‡ธUnited States smustgrave
  • Status changed to Needs review 9 months ago
  • ๐Ÿ‡ฎ๐Ÿ‡ณIndia sukr_s

    When a node is deleted, it's removed from the search index in the node::preDelete event. It's not handled by NodeSearch or content_moderation. On the same lines when a translation is deleted, it must be removed from the search index using the translation_delete hook.

  • Status changed to Needs work 9 months ago
  • ๐Ÿ‡บ๐Ÿ‡ธUnited States smustgrave

    Still think there could be a better way to go about it. But at a minimum believe the change and test should belong in the search module. Not all sites use the search module and even though this hook wouldnโ€™t do anything it will still get called every translation deletion.

  • Status changed to Needs review 9 months ago
  • ๐Ÿ‡ฎ๐Ÿ‡ณIndia sukr_s

    w.r.t to alternate approach to solve this, it would be far more expensive to find what was deleted in a cron run e.g. run though list of languages configured for the site, then the list of languages currently available to for the node and then to clear the index for missing languages. So I believe it's best done at the time of deletion.

    I also think that this should be in search module. However this would result in hard coding the type node_search (used in clear method) in search module which would not be right since search module does not own the 'node_search' plugin. I could not find a method to dynamically determine the search plugin type based on entity type. Therefore chose to place it in the node module.

    is it possible to find search plugin type based on entity type or is it ok to hard code the type in search module?

  • ๐Ÿ‡บ๐Ÿ‡ธUnited States smustgrave

    Will see if the sub maintainer takes a look in a few weeks.

  • ๐Ÿ‡บ๐Ÿ‡ธUnited States pwolanin

    Node module is also responding to various changes in hooks (e.g. comment update) and calling node_reindex_node_search()

    This feels like a similar case, so I think the approach in the MR is reasonable given the limitations of the system.

  • Status changed to RTBC 9 months ago
  • ๐Ÿ‡บ๐Ÿ‡ธUnited States smustgrave

    Thanks @pwolanin for taking a look!

    https://git.drupalcode.org/issue/drupal-3208247/-/jobs/1453892 shows the test coverage

    Applied 1 more nitpicky change but this got sign off from the sub-maintainer so seems fine solution that addresses the problem.

  • Pipeline finished with Failed
    9 months ago
    Total: 4334s
    #167569
  • Status changed to Needs work 9 months ago
  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom alexpott ๐Ÿ‡ช๐Ÿ‡บ๐ŸŒ

    Seems tests are failing.

  • Pipeline finished with Failed
    9 months ago
    Total: 561s
    #167637
  • Merge request !7966Resolve #3208247 "New branch" โ†’ (Closed) created by smustgrave
  • Pipeline finished with Canceled
    9 months ago
    Total: 76s
    #167665
  • ๐Ÿ‡บ๐Ÿ‡ธUnited States smustgrave

    smustgrave โ†’ changed the visibility of the branch 11.x to hidden.

  • Pipeline finished with Success
    9 months ago
    Total: 572s
    #167667
  • Status changed to RTBC 9 months ago
  • ๐Ÿ‡บ๐Ÿ‡ธUnited States smustgrave

    I have 0 idea how the suggestion of :void broke so many tests. But started a new MR as the branch name 11.x wasn't good for local (since that's the main branch). Applied the suggestion by @alexpott and tests are green again.

    Still very weird.

  • ๐Ÿ‡บ๐Ÿ‡ธUnited States smustgrave

    Sorry @sukr_s @alexpott for all the noise

  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom alexpott ๐Ÿ‡ช๐Ÿ‡บ๐ŸŒ

    I think we should open a follow-up to add a post update to clean up the incorrect data. This issue stops the rot but the bogus data hangs about.

  • Pipeline finished with Success
    9 months ago
    Total: 558s
    #169554
  • Status changed to Needs work 9 months ago
  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch

    One question on the MR.

  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch

    Actually two questions.

    Nodes are marked for reindex in Node::postSave()

       // Reindex the node when it is updated. The node is automatically indexed
        // when it is added, simply by being added to the node table.
        if ($update) {
          node_reindex_node_search($this->id());
        }
    

    If we could detect that a translation is being deleted there (can we?) then we would not need to mark the node for indexing, and then delete it again in the hook implementation added here.

    Or... is there somewhere else that has this information instead?

  • Pipeline finished with Success
    9 months ago
    Total: 664s
    #171316
  • Status changed to RTBC 9 months ago
  • ๐Ÿ‡ฎ๐Ÿ‡ณIndia sukr_s

    changed to postsave and invoking node_reindex_node_search($this->id()); only if the translation was not deleted.

  • Status changed to Needs work 9 months ago
  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom alexpott ๐Ÿ‡ช๐Ÿ‡บ๐ŸŒ

    Left a comment on the MR. I think we need to always mark for reindex. The reindexing is not language specific but doing this is not going to recreate any row.

  • Pipeline finished with Success
    9 months ago
    Total: 663s
    #171447
  • Status changed to RTBC 9 months ago
  • ๐Ÿ‡ฎ๐Ÿ‡ณIndia sukr_s
  • Pipeline finished with Failed
    8 months ago
    Total: 579s
    #180230
    • catch โ†’ committed 7d50211a on 10.3.x
      Issue #3208247 by sukr_s, smustgrave, alexpott, Nikolay Shapovalov, orb...
    • catch โ†’ committed 179e7370 on 10.4.x
      Issue #3208247 by sukr_s, smustgrave, alexpott, Nikolay Shapovalov, orb...
    • catch โ†’ committed f90e6a2a on 11.0.x
      Issue #3208247 by sukr_s, smustgrave, alexpott, Nikolay Shapovalov, orb...
    • catch โ†’ committed 249039ad on 11.x
      Issue #3208247 by sukr_s, smustgrave, alexpott, Nikolay Shapovalov, orb...
  • Status changed to Fixed 8 months ago
  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch

    Committed/pushed to 11.x and cherry-picked back through to 10.3.x, thanks!

    • catch โ†’ committed 48c9bf36 on 10.3.x
      Revert "Issue #3208247 by sukr_s, smustgrave, alexpott, Nikolay...
    • catch โ†’ committed 5ab56052 on 10.4.x
      Revert "Issue #3208247 by sukr_s, smustgrave, alexpott, Nikolay...
    • catch โ†’ committed b696c550 on 11.0.x
      Revert "Issue #3208247 by sukr_s, smustgrave, alexpott, Nikolay...
    • catch โ†’ committed 6980ddf0 on 11.x
      Revert "Issue #3208247 by sukr_s, smustgrave, alexpott, Nikolay...
  • Status changed to Needs work 8 months ago
  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch

    I missed a test failure in the last pipeline here, which then broke HEAD, reverted from all four branches.

  • Merge request !8211Resolve #3208247 "New branch" โ†’ (Closed) created by sukr_s
  • Pipeline finished with Success
    8 months ago
    Total: 508s
    #184806
  • Status changed to Needs review 8 months ago
  • ๐Ÿ‡ฎ๐Ÿ‡ณIndia sukr_s

    "commit db83ea46 - Better names and no need to pass translations around" broke the fix, which has now been resolved by reverting the relevant changes.

  • Status changed to RTBC 8 months ago
  • ๐Ÿ‡บ๐Ÿ‡ธUnited States smustgrave

    Edit.

    Tests appear green again.

  • ๐Ÿ‡ณ๐Ÿ‡ฟNew Zealand quietone

    I read the issue summary, comments and the MR. The found two things that need attention

    2) This needs a follow up for https://git.drupalcode.org/project/drupal/-/merge_requests/7966#note_310136

  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom alexpott ๐Ÿ‡ช๐Ÿ‡บ๐ŸŒ

    @quietone the approach changed after that comment and is no longer relevant.

  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom alexpott ๐Ÿ‡ช๐Ÿ‡บ๐ŸŒ

    I've moved the private function to be inline - keeping old translation objects around after \Drupal\Core\Entity\ContentEntityBase::postSave feels wrong.

  • Status changed to Needs review 8 months ago
  • ๐Ÿ‡ฆ๐Ÿ‡บAustralia larowlan ๐Ÿ‡ฆ๐Ÿ‡บ๐Ÿ.au GMT+10

    Left a comment on the MR - is there really no way to do this from search module or similar, putting it in a specific entity class feels wrong - what about other translatable nodes?

  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch

    @larowlan all the search integration for nodes is in node module, like the NodeSearch plugin. I agree it would be much nicer if search module had entity support which was then enabled for nodes by default, but for now this should stay with the rest of the integration.

  • Status changed to RTBC 7 months ago
  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom alexpott ๐Ÿ‡ช๐Ÿ‡บ๐ŸŒ

    I think catch's answer in #57 menas this can go back to rtbc.

  • ๐Ÿ‡ณ๐Ÿ‡ฟNew Zealand quietone

    I tested this on 11.x today and found that this does remove the deleted item from the search index, but only after a cron run. It still adds it to the search index, so that the search page shows 98% of the site has been indexed. There is 1 item left to index. after the deletion. That is incorrect. And not what I expected from reading the issue title and summary. I expected search to not want to re-index it.

    There are the steps I used with the diff applied.

    1. admin/config/search/pages - 100% of the site has been indexed. There are 0 items left to index.
    2. Create article
    3. admin/config/search/pages - 98% of the site has been indexed. There is 1 item left to index.
    4. Run cron
    5. admin/config/search/pages - 100% of the site has been indexed. There are 0 items left to index.
    6. Add translation
    7. admin/config/search/pages - 98% of the site has been indexed. There is 1 item left to index.
    8. Run cron
    9. admin/config/search/pages - 100% of the site has been indexed. There are 0 items left to index.
    10. Delete translation
    11. admin/config/search/pages - 98% of the site has been indexed. There is 1 item left to index.
    12. Run cron
    13. admin/config/search/pages - 100% of the site has been indexed. There are 0 items left to index.

    If my testing is confirmed correct then it looks like a follow up is needed for correcting the information on /admin/config/search/pages. Or have I misunderstood something again on this issue?

  • ๐Ÿ‡ฎ๐Ÿ‡ณIndia sukr_s

    When any translation is removed, the original content is also added for index. This behaviour has not changed. Pervious to this fix, after the cron run the original article would get reindexed but the removed translation would be left hanging and still gets reported as "1 item requires indexing".
    After this fix, the translation is removed from the index therefore before the cron run it shows 1 item which is the original article and after the cron run, it's zero items.

    • catch โ†’ committed 9db42462 on 10.4.x
      Issue #3208247 by sukr_s, smustgrave, alexpott, Nikolay Shapovalov,...
    • catch โ†’ committed 9ed95566 on 11.x
      Issue #3208247 by sukr_s, smustgrave, alexpott, Nikolay Shapovalov,...
  • Status changed to Fixed 7 months ago
  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch

    #60 makes sense to me, the entity will be re-indexed but the translation won't.

    Since this caused a regression before, committed to 11.1.x and cherry-picked to 10.4.x, thanks!

  • Automatically closed - issue fixed for 2 weeks with no activity.

Production build 0.71.5 2024