Queries on search admin and node indexing are slow for many-node sites

Created on 23 September 2008, about 16 years ago
Updated 5 August 2024, 3 months ago

Problem/Motivation

I've been looking into using Drupal with millions of nodes. One problem is that the admin/settings/search page is very slow. This tends to cause CPU quota errors on shared hosting.

The query causing the problem is this one:

SELECT COUNT(*) FROM node n LEFT JOIN search_dataset d ON d.type = 'node' AND d.sid = n.nid WHERE n.status = 1 AND (d.sid IS NULL OR d.reindex <> 0)

which has roughly the same problems as the query in node_update_index(), and could be fixed with a similar approach.

Proposed resolution

When a node is added to Drupal, add an entry to the search dataset so that we can avoid this join and the NULL thing and just look for d.reindex <> 0 to find the status.

Also may need to set the d.reindex value to something that will trigger new nodes to be indexed first in NodeSearch::updateIndex, like a value of 1, and the code in that function can also be simplified so it does not have to consider NULL entries.

Remaining tasks

Make a patch.

User interface changes

Faster cron and search status page loading.

API changes

Not really.

πŸ“Œ Task
Status

Needs work

Version

11.0 πŸ”₯

Component
SearchΒ  β†’

Last updated about 20 hours ago

  • Maintained by
  • πŸ‡ΊπŸ‡ΈUnited States @pwolanin
Created by

πŸ‡¨πŸ‡³China Wesley Tanaka

Live updates comments and jobs are added and updated live.
  • Performance

    It affects performance. It is often combined with the Needs profiling tag.

Sign in to follow issues

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

Production build 0.71.5 2024