- Issue created by @kkaya
- 🇺🇸 United States kkaya
The issue seems to occur when the term description contains a string that is a term name. It is reproducible if I set all ~70 term descriptions to be a term name, but not just a couple of them. Not sure where the threshold or nesting cause is - the terms have up to four levels of hierarchy.
- 🇺🇸 United States TolstoyDotCom L.A.
Does the PHP error log say it ran out of RAM or timed out?
If you do https://www.drupal.org/docs/develop/development-tools/disable-caching and https://www.drupal.org/docs/develop/development-tools/enable-verbose-err... and reload the page, is there an error message in the browser?
- 🇺🇸 United States kkaya
Thanks @TolstoyDotCom,
After disabling caching and enabling verbose logging, the messages are the same. The apache log has multiple "[core:notice] [pid ##] AH00051: child pid ## exit signal Segmentation fault (11), possible coredump in /etc/apache2" before timing out and the backtrace repeats this loop:
#0 0x00007f6007fb1146 in pcre2_match_8 () from /lib/x86_64-linux-gnu/libpcre2-8.so.0
#1 0x00007f6008165e7e in php_pcre_replace_impl () from /usr/lib/apache2/modules/libphp8.1.so
#2 0x00007f60081663b6 in php_pcre_replace () from /usr/lib/apache2/modules/libphp8.1.so
#3 0x00007f60081667f1 in ?? () from /usr/lib/apache2/modules/libphp8.1.so
#4 0x00007f600833c9c4 in execute_ex () from /usr/lib/apache2/modules/libphp8.1.so
#5 0x00007f60082c0f04 in zend_call_function () from /usr/lib/apache2/modules/libphp8.1.so
#6 0x00007f60081f9564 in ?? () from /usr/lib/apache2/modules/libphp8.1.so
#7 0x00007f600833c9c4 in execute_ex () from /usr/lib/apache2/modules/libphp8.1.so
#8 0x00007f60082c0f04 in zend_call_function () from /usr/lib/apache2/modules/libphp8.1.so
#9 0x00007f60081fd003 in ?? () from /usr/lib/apache2/modules/libphp8.1.so
...
- 🇺🇸 United States TolstoyDotCom L.A.
It looks like PHP crashes due to a complex regex.
I'd greatly prefer not to try to duplicate this, so what you can do is set this up: https://www.drupal.org/project/rawdebug
Then, edit FilterOnomasticon.php and (starting at the end of the file) add output before each regex function:
before line 369 add this:
rawdebug('LINE 369');
before line 347 add this:
rawdebug('LINE 347');
before line 217 add this:
rawdebug('LINE 217');
before line 180 add this:
rawdebug('LINE 180');
before line 150 add this:
rawdebug('LINE 150');
Then, do what you do to cause the crash and look in the rawdebug log to see which regex caused the problem.
- 🇺🇸 United States kkaya
Thank you for your help with debugging. I appreciate your time and this module.
After configuring the rawdebug module, I first get:
ParseError: syntax error, unexpected token ";", expecting ")" in Composer\Autoload\includeFile() (line 219 of onomasticon/src/Plugin/Filter/FilterOnomasticon.php)
Then when I move the rawdebug('LINE 217') out of the array map like below, I get a log.log file with "LINE 217" printed 56,100 times. Should I be printing some variable as well?
rawdebug('LINE 217');
$disabledTags = array_map(
  function($tag) {
    return preg_replace("/[^a-z1-6]*/", "", strtolower(trim($tag)));
  },
  $disabledTags
);
I also tried commenting out the preg_replace and preg_split lines under the rawdebug lines to see if the site would load, but it did not.
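On the question of printing a variable as well: a bare marker only shows that a line was reached, not what it was processing. The following is a minimal sketch, not code from the module. It assumes PHP's built-in error_log() in place of rawdebug() (the thread only shows rawdebug() called with a single string), and debug_preg_replace is a hypothetical helper name. It logs the pattern and input size before the call, so the last log entry identifies the input if PHP crashes, and logs the PCRE status if the call fails without crashing.

```php
<?php
// Hypothetical debugging helper; not part of Onomasticon or rawdebug.
function debug_preg_replace(string $label, string $pattern, string $replacement, string $subject): ?string {
  // Log before the call: if PHP segfaults inside PCRE, this is the last entry written.
  error_log(sprintf('%s: pattern=%s subject length=%d', $label, $pattern, strlen($subject)));

  $result = preg_replace($pattern, $replacement, $subject);

  if ($result === NULL) {
    // preg_last_error_msg() is available from PHP 8.0 onward.
    error_log(sprintf('%s: PCRE error: %s', $label, preg_last_error_msg()));
  }
  return $result;
}

// Example call using the pattern from line 217.
echo debug_preg_replace('LINE 217', "/[^a-z1-6]*/", "", '<h2>'), PHP_EOL;
```

A segmentation fault will not produce a PCRE error message, so in that case only the "before" entry helps.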
- 🇺🇸 United States TolstoyDotCom L.A.
What happens if you remove all of this (together with the rawdebug call right above it)?
$disabledTags = array_map(
  function($tag) {
    return preg_replace("/[^a-z1-6]*/", "", strtolower(trim($tag)));
  },
  $disabledTags
);
If that doesn't solve it, I'll duplicate your setup. If you can post step-by-step instructions to do that together with the script you referred to that would be helpful.
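For reference, the block in question only normalizes a list of tag names. Here is a standalone sketch of what it does, using the same callback wrapped in a hypothetical normalize_tags() function with made-up input values:

```php
<?php
// Same callback as the quoted snippet; the wrapper function name is hypothetical.
function normalize_tags(array $tags): array {
  return array_map(
    function ($tag) {
      // Trim, lowercase, then drop every character that is not a-z or 1-6.
      return preg_replace("/[^a-z1-6]*/", "", strtolower(trim($tag)));
    },
    $tags
  );
}

// Made-up sample input for illustration.
print_r(normalize_tags([' <H2> ', 'Strong', 'a']));
```

The pattern is a single character class applied to short strings, so each call does very little work on its own.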
- 🇺🇸 United States kkaya
Removing those lines didn't resolve the issue. I found that I can reproduce the issue with only two taxonomy terms though, so we can see if you are able to reproduce.
Create two terms, one called "Test" with definition "contains onomasticon word" and the other called "Onomasticon" with the definition "also has onomasticon in it". Then run "drush cr" to clear the cache. After that, editing either of those term definitions (or page content) times out for me. I think web pages containing the words "test" or "onomasticon" would also time out.
- 🇺🇸 United States TolstoyDotCom L.A.
I see the problem. All I have to do is create a term named 'Test' with a description 'Onomasticon', and a term named 'Onomasticon' with a description 'Test'. (The case is important unless you change a setting).
Viewing either of those terms sends the module into an infinite loop where it tries to replace 'Onomasticon' with 'Test', which tries to replace 'Test' with 'Onomasticon', etc etc etc.
For now, avoid such recursive situations. I'll try to make a patch within the next week.
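The loop described above can be reproduced in a toy model. This is a sketch under assumptions, not the module's code: annotate() and its bracket output format are hypothetical, and the $maxDepth guard is added only so that the example terminates. It is not a fix proposed in this thread.

```php
<?php
// Toy model: each matched term's description is itself run through the same
// glossary, as happens when the description's text format also has the filter on.
function annotate(string $text, array $glossary, int $depth, int $maxDepth): string {
  if ($depth >= $maxDepth) {
    // Hypothetical guard. Without it, the two terms below recurse without end.
    return $text;
  }
  $pairs = [];
  foreach ($glossary as $name => $description) {
    // Case-sensitive match, mirroring the default behaviour mentioned above.
    if (strpos($text, $name) !== FALSE) {
      $pairs[$name] = $name . ' [' . annotate($description, $glossary, $depth + 1, $maxDepth) . ']';
    }
  }
  // strtr() with an array does not rescan text it has already replaced.
  return $pairs ? strtr($text, $pairs) : $text;
}

$glossary = ['Test' => 'Onomasticon', 'Onomasticon' => 'Test'];
echo annotate('Test', $glossary, 0, 3), PHP_EOL;
```

Raising $maxDepth only makes the nesting deeper; the two terms never reach a description that contains no term name.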
- 🇩🇪 Germany broon Potsdam
Such a recursive situation can be avoided if you enable a text filter _without_ Onomasticon for the description field of your glossary taxonomy (recommended). I am not sure if this module can actually solve such a situation on its own as it is a text filter.
- 🇺🇸 United States kkaya
Thanks for the reply. We use an HTML text filter for our taxonomy description for links and special characters, as taxonomy is used in various ways on the site. The 2.1.0 version of the Onomasticon module was able to handle this without the recursion error, though. Looks like we'll need to decide whether we can move to a filter without Onomasticon going forward.
- 🇺🇸 United States TolstoyDotCom L.A.
I didn't see right away how to stop the recursion, but I did a diff between 2.2.1 and 2.1.0 and the only thing that leaps out is the new
foreach ($needles as $n => $needle) {
line. The problem might be from something else, but most of the other changes appear to be stylistic.
- 🇩🇪 Germany broon Potsdam
This new foreach loop just accommodates the different versions of the same glossary term (like synonyms or a capitalized term name at the beginning of a sentence). There is no replacement happening at that point. I am unsure what exactly is causing this if it wasn't happening before; maybe it was an uncaught (but welcome) error before.
As said above, the text filter operates on the actual text of a text field. The filter is unaware of the entity that contains the text field. That's why it is not recommended to use the Onomasticon filter on the glossary terms, as it might result in circular references and ultimately in this issue's problem.
- 🇦🇺 Australia cafuego
Getting the same thing here on a glossary page; my work-around is to use an input format without Onomasticon enabled for the vocab descriptions.
Hello,
The problem comes from the check_markup function used in FilterOnomasticon. This creates a render array each time a glossary term is found and renders the description field as a processed_text element, which uses a format that can apply the Onomasticon filter again, ...
In version 2.1.0, the filter applied only the PHP strip_tags function to the description field value.
- Status changed to Needs review
over 1 year ago 4:25pm 24 August 2023
- 🇩🇪 Germany broon Potsdam
Hey Elena, thanks a lot for the hint, that did slip my attention. I've already made a commit that adds a new option in the filter's settings (disabled by default). When enabled, check_markup will be run.
Please review if the new 2.x-dev version (otherwise identical to 2.2.0) works for you and solves the problem.
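The difference between the two behaviours discussed here can be sketched as follows. This is only an illustration under assumptions: render_description() and its parameters are hypothetical, the real setting name and code in 2.x-dev are not shown in this thread, and a plain callable stands in for Drupal's check_markup() so the snippet runs outside Drupal.

```php
<?php
// Hypothetical stand-in for how the filter could treat a term description.
function render_description(string $description, bool $runFullFormat, callable $fullFormat): string {
  if ($runFullFormat) {
    // Opt-in path: the description goes through a full text format, which can
    // re-enter the glossary filter if that format has it enabled.
    return $fullFormat($description);
  }
  // Default path, matching what is described for 2.1.0: strip tags only,
  // so no nested filtering can be triggered.
  return strip_tags($description);
}

// strtoupper is just a placeholder for "some full text format".
echo render_description('<em>also has onomasticon in it</em>', FALSE, 'strtoupper'), PHP_EOL;
echo render_description('<em>also has onomasticon in it</em>', TRUE, 'strtoupper'), PHP_EOL;
```

With the option off, the description never passes through a text format, so a term name inside it cannot start another round of replacement.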
- 🇺🇸 United States kkaya
Thanks @broon and @elena.ortegacollado! The 2.x-dev version with new commit that @broon made today worked for me without having to change my taxonomy text filter.
- 🇺🇸 United States david.mcmeans
Version 2.x-dev fixed my "hanging" glossary view without any changes to the text filter using the Onomasticon filter. Prior to this version, the view would time out.
- Status changed to RTBC
over 1 year ago 8:30am 25 August 2023
- 🇩🇪 Germany broon Potsdam
Alright, I'll take that as RTBC and will publish 2.2.2 soon. Thanks for your feedback!
- Status changed to Fixed
over 1 year ago 8:43am 25 August 2023
Automatically closed - issue fixed for 2 weeks with no activity.