Upgrade to 2.2.1 crashes site

Created on 21 July 2023, over 1 year ago
Updated 25 August 2023, about 1 year ago

Problem/Motivation

Upgrading Onomasticon from 2.1.0 to 2.2.1 causes our Drupal 9.5 site (PHP 8.1) to hang. I tried switching to a simpler theme to rule out a theme issue, but that made no difference. The only things that work are reverting to version 2.1.0 of the module or deleting all taxonomy terms. This is critical for us now that 2.1.0 is marked as unsupported.

Steps to reproduce

This may be an issue with how term descriptions are handled during Onomasticon processing. I wrote a script to set the description of all my terms to a simple string. When that string does not match a term name, the site loads. When it matches a term name, the site does not load.

🐛 Bug report
Status

Fixed

Version

2.2

Component

Code

Created by

🇺🇸 United States kkaya

Comments & Activities

  • Issue created by @kkaya
  • 🇺🇸 United States kkaya

    The issue seems to occur when a term description contains a string that is also a term name. It is reproducible if I set all ~70 term descriptions to a term name, but not if I change only a couple of them. I am not sure where the threshold is or whether nesting is the cause; the terms have up to four levels of hierarchy.

  • 🇺🇸 United States TolstoyDotCom L.A.

    Does the PHP error log say it ran out of RAM or timed out?

    If you follow https://www.drupal.org/docs/develop/development-tools/disable-caching and https://www.drupal.org/docs/develop/development-tools/enable-verbose-err... and then reload the page, is there an error message in the browser?

  • 🇺🇸 United States kkaya

    Thanks @TolstoyDotCom,

    After disabling caching and enabling verbose logging, the messages are the same. The apache log has multiple "[core:notice] [pid ##] AH00051: child pid ## exit signal Segmentation fault (11), possible coredump in /etc/apache2" before timing out and the backtrace repeats this loop:

    #0  0x00007f6007fb1146 in pcre2_match_8 () from /lib/x86_64-linux-gnu/libpcre2-8.so.0
    #1  0x00007f6008165e7e in php_pcre_replace_impl () from /usr/lib/apache2/modules/libphp8.1.so
    #2  0x00007f60081663b6 in php_pcre_replace () from /usr/lib/apache2/modules/libphp8.1.so
    #3  0x00007f60081667f1 in ?? () from /usr/lib/apache2/modules/libphp8.1.so
    #4  0x00007f600833c9c4 in execute_ex () from /usr/lib/apache2/modules/libphp8.1.so
    #5  0x00007f60082c0f04 in zend_call_function () from /usr/lib/apache2/modules/libphp8.1.so
    #6  0x00007f60081f9564 in ?? () from /usr/lib/apache2/modules/libphp8.1.so
    #7  0x00007f600833c9c4 in execute_ex () from /usr/lib/apache2/modules/libphp8.1.so
    #8  0x00007f60082c0f04 in zend_call_function () from /usr/lib/apache2/modules/libphp8.1.so
    #9  0x00007f60081fd003 in ?? () from /usr/lib/apache2/modules/libphp8.1.so
    ...
    
  • 🇺🇸 United States TolstoyDotCom L.A.

    It looks like PHP crashes due to a complex regex.

    I'd greatly prefer not to try to duplicate this, so what you can do is set this up: https://www.drupal.org/project/rawdebug

    Then, edit FilterOnomasticon.php and (starting at the end of the file) add output before each regex function:

    before line 369 add this:
    rawdebug('LINE 369');

    before line 347 add this:
    rawdebug('LINE 347');

    before line 217 add this:
    rawdebug('LINE 217');

    before line 180 add this:
    rawdebug('LINE 180');

    before line 150 add this:
    rawdebug('LINE 150');

    Then, do what you do to cause the crash and look in the rawdebug log to see which regex caused the problem.

  • 🇺🇸 United States kkaya

    Thank you for your help with debugging. I appreciate your time and this module.

    After configuring the rawdebug module, I first get:
    ParseError: syntax error, unexpected token ";", expecting ")" in Composer\Autoload\includeFile() (line 219 of onomasticon/src/Plugin/Filter/FilterOnomasticon.php)

    Then when I move the rawdebug('LINE 217') out of the array map like below, I get a log.log file with "LINE 217" printed 56,100 times. Should I be printing some variable as well?

            rawdebug('LINE 217');
            $disabledTags = array_map(
              function($tag) { return preg_replace("/[^a-z1-6]*/", "", strtolower(trim($tag))); },
              $disabledTags
            );

    I also tried commenting out the preg_replace and preg_split lines under the rawdebug lines to see if the site would load, but it did not.
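
The ParseError above comes from placing a statement between the arguments of array_map(). A minimal sketch of logging from inside the closure instead, which also records the value being processed: `$debug` and `$log` are hypothetical stand-ins for rawdebug() so the snippet runs outside Drupal, and the sample tags are made up.

```php
<?php
// Stand-in for rawdebug(): collects messages in memory (hypothetical helper).
$log = [];
$debug = function (string $message) use (&$log): void {
  $log[] = $message;
};

// Made-up sample input; the real list comes from the filter settings.
$disabledTags = [' H1 ', 'Pre', 'a!'];

$disabledTags = array_map(
  function ($tag) use ($debug) {
    // The debug call is a statement inside the closure body, which is valid PHP.
    $debug('LINE 217 input: ' . var_export($tag, true));
    return preg_replace("/[^a-z1-6]*/", "", strtolower(trim($tag)));
  },
  $disabledTags
);
```

Logging the input this way shows what each call received, rather than only how often the line was reached.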

  • 🇺🇸 United States TolstoyDotCom L.A.

    What happens if you remove all of this (together with the rawdebug call right above it)?

            $disabledTags = array_map(
              function($tag) { return preg_replace("/[^a-z1-6]*/", "", strtolower(trim($tag))); },
              $disabledTags
            );
    

    If that doesn't solve it, I'll duplicate your setup. If you can post step-by-step instructions to do that together with the script you referred to that would be helpful.

  • 🇺🇸 United States kkaya

    Removing those lines didn't resolve the issue. I found that I can reproduce the issue with only two taxonomy terms though, so we can see if you are able to reproduce.

    Create two terms, one called "Test" with the definition "contains onomasticon word" and the other called "Onomasticon" with the definition "also has onomasticon in it". Then run "drush cr" to clear the cache. After that, editing either of those term definitions (or page content) times out for me. I think web pages containing the words "test" or "onomasticon" would also time out.
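
A setup like this can be checked offline before touching the site. The following is a rough sketch in plain PHP, not module code: `find_circular_terms()` and its `$caseSensitive` parameter are hypothetical, and whole-word matching is an assumption about how term names are detected.

```php
<?php
/**
 * Returns the term names whose descriptions lead back to the term itself,
 * directly or through other terms' descriptions.
 * $glossary maps term name => description (hypothetical data shape).
 */
function find_circular_terms(array $glossary, bool $caseSensitive = true): array {
  // Record which term names are mentioned in each description.
  $mentions = [];
  foreach ($glossary as $name => $description) {
    $mentions[$name] = [];
    foreach (array_keys($glossary) as $other) {
      $pattern = '/\b' . preg_quote($other, '/') . '\b/' . ($caseSensitive ? '' : 'i');
      if (preg_match($pattern, $description) === 1) {
        $mentions[$name][] = $other;
      }
    }
  }
  // A term is circular if following the mentions from it reaches it again.
  $circular = [];
  foreach (array_keys($glossary) as $start) {
    $stack = $mentions[$start];
    $seen = [];
    while ($stack !== []) {
      $current = array_pop($stack);
      if ($current === $start) {
        $circular[] = $start;
        break;
      }
      if (isset($seen[$current])) {
        continue;
      }
      $seen[$current] = true;
      foreach ($mentions[$current] as $next) {
        $stack[] = $next;
      }
    }
  }
  return $circular;
}

$glossary = [
  'Test' => 'contains onomasticon word',
  'Onomasticon' => 'also has onomasticon in it',
];
// With case-insensitive matching this returns ['Onomasticon'].
$circular = find_circular_terms($glossary, false);
```

With the default case-sensitive matching the same data yields an empty list, so the result depends on how the filter is configured to match term names.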

  • 🇺🇸 United States TolstoyDotCom L.A.

    I see the problem. All I have to do is create a term named 'Test' with a description 'Onomasticon', and a term named 'Onomasticon' with a description 'Test'. (The case is important unless you change a setting).

    Viewing either of those terms sends the module into an infinite loop where it tries to replace 'Onomasticon' with 'Test', which tries to replace 'Test' with 'Onomasticon', etc etc etc.

    For now, avoid such recursive situations. I'll try to make a patch within the next week.
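
The loop described above can be sketched in plain PHP. This is an illustration under stated assumptions, not the module's code: `expand_terms()`, the bracket notation for a tooltip, and the `$maxDepth` guard are all hypothetical.

```php
<?php
// Hypothetical glossary (term name => description), mirroring the two terms above.
$glossary = [
  'Test' => 'Onomasticon',
  'Onomasticon' => 'Test',
];

/**
 * Annotates each glossary term in $text with its description, filtering the
 * description the same way. $maxDepth is an illustrative guard, not a module setting.
 */
function expand_terms(string $text, array $glossary, int $depth = 0, int $maxDepth = 3): string {
  if ($depth >= $maxDepth || $glossary === []) {
    // Without this check, 'Test' and 'Onomasticon' would expand each other forever.
    return $text;
  }
  $names = array_map(fn($name) => preg_quote($name, '/'), array_keys($glossary));
  $pattern = '/\b(' . implode('|', $names) . ')\b/';
  $result = preg_replace_callback(
    $pattern,
    function (array $match) use ($glossary, $depth, $maxDepth): string {
      // The description is itself filtered, which is where the recursion comes from.
      $description = expand_terms($glossary[$match[1]], $glossary, $depth + 1, $maxDepth);
      return $match[1] . ' [' . $description . ']';
    },
    $text
  );
  // preg_replace_callback() returns null on a PCRE error; fall back to the input.
  return $result ?? $text;
}

echo expand_terms('Test', $glossary), "\n";
// prints: Test [Onomasticon [Test [Onomasticon]]]
```

Removing the depth check makes the two terms call each other without end, which is consistent with the repeating backtrace and segmentation fault reported earlier in this issue.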

  • 🇩🇪 Germany broon Potsdam

    Such a recursive situation can be avoided if you enable a text filter _without_ Onomasticon for the description field of your glossary taxonomy (recommended). I am not sure if this module can actually solve such a situation on its own as it is a text filter.

  • 🇺🇸 United States kkaya

    Thanks for the reply. We use an HTML text filter for our taxonomy descriptions, for links and special characters, as taxonomy is used in various ways on the site. The 2.1.0 version of the Onomasticon module was able to handle this without the recursion error, though. It looks like we'll need to decide whether we can move to a filter without Onomasticon going forward.

  • 🇺🇸 United States TolstoyDotCom L.A.

    I didn't see right away how to stop the recursion, but I did a diff between 2.2.1 and 2.1.0 and the only thing that leaps out is the new foreach ($needles as $n => $needle) { line. The problem might be from something else, but most of the other changes appear to be stylistic.

  • 🇩🇪 Germany broon Potsdam

    This new foreach loop just accommodates the different versions of the same glossary term (like synonyms or a capitalized term name at the beginning of a sentence). There is no replacement happening at that point. I am unsure what exactly is causing this if it wasn't happening before; maybe it was an uncaught (but welcome) error before.

    As said above, the text filter operates on the actual text of a text field. The filter is unaware of the entity that contains the text field. That's why it is not recommended to use the Onomasticon filter on the glossary terms, as it might result in circular references and ultimately in this issue's problem.

  • 🇦🇺 Australia cafuego

    Getting the same thing here on a glossary page; my work-around is to use an input format without Onomasticon enabled for the vocab descriptions.

  • Hello,
    The problem comes from the check_markup function used in FilterOnomasticon. It creates a render array each time a glossary term is found and renders the description field as a processed_text element, whose text format can apply the Onomasticon filter again, and so on.
    In version 2.1.0, the filter applied only the PHP strip_tags function to the description field value.

    • broon committed 2a7574fe on 2.x
      Issue #3376245: Upgrade to 2.2.1 crashes site. Add new option for...
  • Status changed to Needs review about 1 year ago
  • 🇩🇪 Germany broon Potsdam

    Hey Elena, thanks a lot for the hint, that slipped my attention. I've already made a commit that adds a new option to the filter's settings (disabled by default). When enabled, check_markup will be run.

    Please review if the new 2.x-dev version (otherwise identical to 2.2.0) works for you and solves the problem.
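
The shape of such an opt-in setting can be sketched in plain PHP. This is not the committed code: `build_tooltip()` and `$runFormat` are hypothetical names, `$applyFormat` stands in for the text-format pipeline (check_markup in Drupal), and falling back to strip_tags when the option is off is an assumption based on the 2.1.0 behavior described above.

```php
<?php
/**
 * Builds tooltip text for a glossary term description.
 * $runFormat mirrors the idea of an opt-in filter setting (hypothetical name).
 * $applyFormat stands in for the text-format pipeline, so the sketch runs outside Drupal.
 */
function build_tooltip(string $description, bool $runFormat, callable $applyFormat): string {
  if ($runFormat) {
    // Opt-in path: the description goes through the text format, which may
    // include the glossary filter again and can therefore recurse.
    return $applyFormat($description);
  }
  // Assumed default path: plain text only, so no filter runs on the description.
  return strip_tags($description);
}

// With the option off, markup is dropped and the pipeline is never called.
$plain = build_tooltip('<p>See <em>Test</em></p>', false, 'strtoupper');
// $plain is 'See Test'
```

Keeping the option disabled by default means existing sites get the non-recursive path unless they explicitly ask for formatted descriptions.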

  • 🇺🇸 United States kkaya

    Thanks @broon and @elena.ortegacollado! The 2.x-dev version with new commit that @broon made today worked for me without having to change my taxonomy text filter.

  • 🇺🇸 United States david.mcmeans

    Version 2.x-dev fixed my "hanging" glossary view without any changes to the text format that uses the Onomasticon filter. Prior to this version, the view would time out.

  • Status changed to RTBC about 1 year ago
  • 🇩🇪 Germany broon Potsdam

    Alright, I'll take that as RTBC and will publish 2.2.2 soon. Thanks for your feedback!

  • Status changed to Fixed about 1 year ago
  • Automatically closed - issue fixed for 2 weeks with no activity.
