Segmentation fault during cron

Created on 20 November 2024, 1 day ago

Problem/Motivation

Google Analytics Counter seems to be causing an error during cron on our site. I'm not able to reproduce it locally; it seems to be a problem only on our Acquia-hosted sites, so I assume it is likely a memory-related issue.

drush php-eval 'module_load_include("module", "google_analytics_counter"); google_analytics_counter_cron();'
Segmentation fault

I tried lowering the number of items to fetch from 10,000 to 1,000 and setting "Update pageviews for the last X content" to 500 (previously unset), but cron still does not run with those settings.

Here is a debug log of my cron run; memory usage does seem to increase:

[info] Starting execution of google_analytics_counter_cron() took 983.3ms. [4.08 sec, 27.57 MB]
[info] Retrieved 7663 items from Google Analytics data for paths 1 - 7663. [34.14 sec, 110.51 MB]
[info] Merged 7662 paths from Google Analytics into the database. [37.6 sec, 110.85 MB]

Here is our status page to gauge the size of our analytics processing

Steps to reproduce

I have not been able to reproduce this locally yet, so I know people won't be able to help much, but I'm mainly creating this issue for others to stumble upon if they are in the same situation.

Proposed resolution

Remaining tasks

User interface changes

API changes

Data model changes

πŸ› Bug report
Status

Active

Version

4.0

Component

Code

Created by

🇺🇸 United States NicholasS

Merge Requests

Comments & Activities

  • Issue created by @NicholasS
  • 🇺🇸 United States NicholasS

    I think this could be the code at fault, where it's working with a large array in memory.
    https://git.drupalcode.org/project/google_analytics_counter/-/blob/4.0.x...

  • 🇸🇰 Slovakia kaszarobert

    I see in the screenshot that it downloaded quite a few paths at some point. Did it work on the same production environment up until now? Does the crash happen constantly or just randomly? Indeed, it seems to be a memory-related issue, but the cause is going to be hard to pin down. Here are a few ideas I'd check now:

    Some limit is not enough in php.ini?
    Out of physical memory?
    One of the installed PHP extensions?
    Or a weird APCu bug?
    Could the google/analytics-data library which this module uses cause some trouble with certain server configurations?

    The debug cron logging pasted here shows that it used 110.85 MB of RAM at some point, which I don't think is that large considering how much the cron job downloads and writes to the database. I'd look at phpinfo on the web server and compare the php.ini settings with the other (probably dev) environment where you said the problem does not occur, to see whether there are major differences in memory or cache limits.
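
To make that comparison easier, here is a minimal sketch of a snapshot that could be run on each environment and then diffed. The function name `gac_environment_snapshot()` is hypothetical and not part of the module, and the extension list (grpc, protobuf, apcu) is only my guess at what is relevant here.

```php
<?php
// Hypothetical helper (not part of google_analytics_counter): collects the
// settings worth diffing between a working and a crashing environment.
function gac_environment_snapshot(): array {
  $extensions = [];
  // Assumed list of extensions worth checking; adjust as needed.
  foreach (['grpc', 'protobuf', 'apcu'] as $name) {
    // phpversion() returns false when the extension is not loaded.
    $extensions[$name] = phpversion($name) ?: 'not loaded';
  }
  return [
    'php_version' => PHP_VERSION,
    'sapi' => PHP_SAPI,
    'memory_limit' => ini_get('memory_limit'),
    'max_execution_time' => ini_get('max_execution_time'),
    'extensions' => $extensions,
  ];
}

// Print the snapshot as JSON so two environments can be compared side by side.
print json_encode(gac_environment_snapshot(), JSON_PRETTY_PRINT) . PHP_EOL;
```

It should be run with the same PHP binary and SAPI that cron uses, since CLI and web requests often load different ini files.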

  • 🇺🇸 United States NicholasS

    An update: our host's PHP memory limit is 128 MB.

    @kaszarobert Thanks for the ideas; I will use them to investigate more. It stopped working across all our Acquia environments, even though they had different code deployed, so my current theory is that the issue is with the amount of data being retrieved from Google. As far as I have found so far, nothing on the servers changed at the point in time when things broke.

    Will research more and update tomorrow.

  • Merge request !21 WIP so I can patch for testing → (Open) created by NicholasS
  • Pipeline finished with Success
    1 day ago
    Total: 183s
    #345217
  • 🇺🇸 United States NicholasS

    Still debugging. I added a bunch of logging to the current cron logic, and it looks like the problem lies in queryTotalPaths():

    [info] Starting execution of google_analytics_counter_cron(), execution of file_cron() took 4.12ms. [19.61 sec, 64.75 MB]
     [notice] Starting cron with memory: 71827456 [19.62 sec, 64.82 MB]
     [notice] Before node count query [19.62 sec, 64.83 MB]
     [notice] Total nodes: 16481. Memory: 71827456 [19.63 sec, 64.92 MB]
     [notice] Starting main processing. Memory: 71827456 [19.63 sec, 64.92 MB]
     [notice] Before database truncate. Memory: 71827456 [19.64 sec, 64.92 MB]
     [notice] After database truncate. Memory: 71827456 [20.23 sec, 64.99 MB]
     [notice] Before queryTotalPaths. Memory: 71827456 [20.23 sec, 64.99 MB]
    Segmentation fault
    
  • 🇺🇸 United States NicholasS

    The problem seems to lie with the serialize() call in the buildQuery method. I'm not really sure where to go from here to debug further.

     [info] Starting execution of google_analytics_counter_cron(), execution of file_cron() took 2.59ms. [19.88 sec, 64.77 MB]
     [notice] Starting cron with memory: 71827456 [19.89 sec, 64.86 MB]
     [notice] Before node count query [19.89 sec, 64.86 MB]
     [notice] Total nodes: 16480. Memory: 71827456 [19.9 sec, 64.95 MB]
     [notice] Starting main processing. Memory: 71827456 [19.9 sec, 64.95 MB]
     [notice] Before database truncate. Memory: 71827456 [19.91 sec, 64.95 MB]
     [notice] After database truncate. Memory: 71827456 [20.52 sec, 65.02 MB]
     [notice] Before queryTotalPaths. Memory: 71827456 [20.52 sec, 65.02 MB]
     [notice] Starting queryTotalPaths [20.52 sec, 65.02 MB]
     [notice] About to buildQuery [20.53 sec, 65.02 MB]
     [notice] Starting buildQuery [20.53 sec, 65.02 MB]
     [notice] Got config [20.53 sec, 65.02 MB]
     [notice] Chunk: 1, Pointer: 0 [20.54 sec, 65.02 MB]
     [notice] Built query dates: Array
    (
        [start] => 2024-10-22
        [end] => 2024-11-20
    )
     [20.54 sec, 65.02 MB]
     [notice] Property ID: 371690843 [20.54 sec, 65.02 MB]
     [notice] Starting cache options creation [20.55 sec, 65.15 MB]
     [notice] Params structure: Array
    (
        [property] => properties/371XXXX43
        [dateRanges] => Array
            (
                [0] => Google\Analytics\Data\V1beta\DateRange Object
    
            )
    
        [dimensions] => Array
            (
                [0] => Google\Analytics\Data\V1beta\Dimension Object
    
            )
    
        [metrics] => Array
            (
                [0] => Google\Analytics\Data\V1beta\Metric Object
    
            )
    
        [offset] => 0
        [limit] => 1
    )
     [20.55 sec, 65.15 MB]
     [notice] Before serialize [20.55 sec, 65.15 MB]
    Segmentation fault
    

  • 🇺🇸 United States NicholasS

    PROBLEM:

    Had a segmentation fault when trying to serialize() Google Analytics API objects (DateRange, Dimension, Metric)
    These objects can't be safely serialized in PHP, causing the crash

    SOLUTION:

    Create two separate parameter arrays:

    $parameters: Contains proper Google Analytics objects needed for the API call
    $cache_params: A simple array with basic data types that can be safely serialized

    WHY:

    We only need serialization to create a unique cache ID (md5 hash)
    The actual API call requires proper Google Analytics objects
    By separating these concerns, we avoid the segfault while maintaining functionality

    So I'm going to make that proposed change. Claude.ai helped debug the issue.
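
The idea of hashing only plain data can be sketched roughly as follows. This is not the module's actual buildQuery() code: the function name `gac_build_cache_id()`, the cache ID prefix, and the example property, dimension, and metric values are all assumptions made for illustration.

```php
<?php
// Hypothetical helper, not the module's actual buildQuery(): derives a cache
// ID from plain scalars so serialize() never sees Google Analytics API objects.
function gac_build_cache_id(string $property, string $start, string $end, array $dimensions, array $metrics, int $offset, int $limit): string {
  // Only strings, ints, and arrays of them: safe to serialize.
  $cache_params = [
    'property' => $property,
    'dateRanges' => [['start' => $start, 'end' => $end]],
    'dimensions' => array_values($dimensions),
    'metrics' => array_values($metrics),
    'offset' => $offset,
    'limit' => $limit,
  ];
  // The prefix is an assumed naming convention, not taken from the module.
  return 'google_analytics_counter:' . md5(serialize($cache_params));
}

// Example values; the property ID, dimension, and metric names are placeholders.
$cid = gac_build_cache_id('properties/000000000', '2024-10-22', '2024-11-20', ['pagePath'], ['screenPageViews'], 0, 1);
print $cid . PHP_EOL;
```

The separate $parameters array holding the DateRange, Dimension, and Metric objects would still be built as before and passed to the API call; only the cache ID is computed from the scalar copy.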

  • Pipeline finished with Success
    about 8 hours ago
    Total: 285s
    #346309
  • 🇺🇸 United States NicholasS

    UGH now getting

     [info] Starting execution of google_analytics_counter_cron(), execution of file_cron() took 3.68ms. [21.11 sec, 64.77 MB]
    WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
    I0000 00:00:1732220684.750398     339 call_credentials.c:168] GRPC_PHP: call credentials plugin function - begin
    I0000 00:00:1732220684.753480     339 call_credentials.c:171] GRPC_PHP: call credentials plugin function - end
    Segmentation fault
    