[meta] Add database driver for MongoDB to Core as experimental

Created on 27 June 2024, 2 months ago
Updated 26 August 2024, 13 days ago

Problem/Motivation

The main reason Drupal is slow is that getting entity data out of the database is slow. The data for a single entity instance is spread across many different database tables, so getting a single entity instance out of the database results in a complicated join query. With the relational database storage used by Drupal, getting entity data out of the database will always be slow.

Conditions and sorts also often have to be run against multiple different tables, which makes those queries hard for the database to optimize. If we were to try to store more fields as multiple columns in a single table, we'd also run into index limitations due to sheer index length.
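To make the shape of the problem concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (node, node__title, node__body, node__rating) are simplified, hypothetical stand-ins and not Drupal's actual schema. It only shows how a field-per-table layout turns one entity load into a multi-table join, and how a listing query ends up with its condition and its sort in different tables.

```python
import sqlite3

# Hypothetical, simplified schema: one base table plus one table per field.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node (nid INTEGER PRIMARY KEY, type TEXT);
    CREATE TABLE node__title (entity_id INTEGER, value TEXT);
    CREATE TABLE node__body (entity_id INTEGER, value TEXT);
    CREATE TABLE node__rating (entity_id INTEGER, value INTEGER);

    INSERT INTO node VALUES (1, 'article'), (2, 'article'), (3, 'page');
    INSERT INTO node__title VALUES (1, 'Beta'), (2, 'Alpha'), (3, 'Gamma');
    INSERT INTO node__body VALUES (1, 'First body'), (2, 'Second body'), (3, 'Third body');
    INSERT INTO node__rating VALUES (1, 5), (2, 4), (3, 2);
""")


def load_entity(nid):
    """Load one entity: every additional field adds another join."""
    row = conn.execute(
        """
        SELECT n.nid, n.type, t.value, b.value, r.value
        FROM node n
        JOIN node__title t ON t.entity_id = n.nid
        JOIN node__body b ON b.entity_id = n.nid
        JOIN node__rating r ON r.entity_id = n.nid
        WHERE n.nid = ?
        """,
        (nid,),
    ).fetchone()
    return dict(zip(("nid", "type", "title", "body", "rating"), row))


def list_ids(min_rating):
    """Listing query: the condition (rating) and the sort (title) live in different tables."""
    rows = conn.execute(
        """
        SELECT n.nid
        FROM node n
        JOIN node__rating r ON r.entity_id = n.nid
        JOIN node__title t ON t.entity_id = n.nid
        WHERE r.value >= ?
        ORDER BY t.value
        """,
        (min_rating,),
    ).fetchall()
    return [nid for (nid,) in rows]
```

Because the filter column and the sort column sit in separate tables, no single index can cover both, which is the limitation described above.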

Proposed resolution

Add the database driver for MongoDB to Drupal Core as experimental. All data for an entity instance is stored in a single JSON document, so getting a single entity instance out of the database is always a very simple query: a single row from a single database table. That is the same as a key-value store and just as fast. The driver is a complete database driver: Drupal uses only a MongoDB database, and no other database, relational or otherwise, is necessary.
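As a rough illustration of the single-document idea, the sketch below uses a plain Python dict as a hypothetical stand-in for a MongoDB collection. The document shape and the names node_collection, save_entity_document and load_entity_document are assumptions made for this example; they are not the driver's actual storage format or API.

```python
import json

# Hypothetical in-memory stand-in for a MongoDB collection, keyed by entity ID.
node_collection = {}


def save_entity_document(entity):
    """Store the whole entity, including all field values, as one JSON document."""
    node_collection[entity["nid"]] = json.dumps(entity)


def load_entity_document(nid):
    """Load an entity with a single key lookup and no joins."""
    return json.loads(node_collection[nid])


# All fields, including a multi-value field, live in the same document.
save_entity_document({
    "nid": 1,
    "type": "article",
    "title": "Beta",
    "body": "First body",
    "rating": 5,
    "tags": ["mongodb", "performance"],
})
```

Against a real MongoDB collection, the equivalent lookup with the PyMongo client would be a single call such as `collection.find_one({"nid": 1})`, assuming the entity ID is stored in a field named nid.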

The database driver is available as a contrib module. The code, with a README on how to install Drupal on MongoDB with DDEV, can be found here

Remaining tasks

Hard requirements:

Not hard requirements:

Minimum requirements for MongoDB

- The minimum required version of MongoDB is 7.0. This is currently the most recent major version of MongoDB.
- A MongoDB replica set is required. A replica set is the minimum deployment that supports transactions; a single standalone MongoDB instance does not. Drupal relies on database transactions.

API changes

None

Data model changes

Entity instances are stored in JSON documents for MongoDB.

Release notes snippet

TBD

Category: Feature request
Status: Active
Version: 11.0
Component: Database


Maintained by: 🇳🇱Netherlands @daffie
Created by: 🇳🇱Netherlands daffie

Issue tags
  • Needs product manager review
  • Needs framework manager review


Comments & Activities

  • Issue created by @daffie
  • 🇬🇧United Kingdom alexpott 🇪🇺🌍
  • 🇫🇮Finland lauriii Finland
  • 🇬🇧United Kingdom longwave UK
  • 🇬🇧United Kingdom catch

    I'm adding a note about listing queries to the issue summary, since that's also a severe performance/scalability issue that MongoDB can significantly help with, especially with large datasets.

    For Views, at Dev Days we discussed adding an entity query Views backend and converting all views shipped with core (and with Starshot, once it has them) to use it. An entity query view will be interoperable between MongoDB and relational database drivers. Some previous work and discussion is in https://www.drupal.org/project/efq_views and in the issue "Entity Query views backend".

    I think we could add MongoDB to core as alpha/beta without resolving the Views issue, but we would need to fix it to make MongoDB stable. However, an entity query backend for Views has a lot of positives in its own right, not just for MongoDB.

  • 🇫🇮Finland lauriii Finland

    I've discussed the idea with @daffie and @catch in detail at DrupalCon Lille and Drupal Dev Days, and on a high level the proposal is fine. Adding a new database driver is not part of the strategic focus for the team at the moment, which may lead to some delays with reviews in cases where there's higher priority work waiting for feedback from committers.

    I'm leaving the tag on because I'd like to be involved in the process again when we have a better sense of what the impact is on UX (such as Views) and the ecosystem.

  • 🇩🇪Germany mxh Offenburg

    The main problem for why Drupal is slow is, because of getting entity data out of the database is slow.

    Is that statement backed by any performance numbers / benchmarks?

    What I've often seen as a performance bottleneck is not the database itself, but the connection to it. When a Drupal site grows (as shown by its number of configuration items), a single request to a Drupal page may take hundreds, sometimes over a thousand, database queries. These come from cache tag lookups, config reads and, finally, queries for content entities. With a local database without any latency, performance is usually not a problem.

  • 🇬🇧United Kingdom catch

    @mxh cache tag lookups can be replaced with Redis so they won't hit the database at all.

    On a site with lots of data, entity listing queries can individually take a very long time - anything from hundreds of milliseconds to dozens of seconds for a single query if it has conditions and sorts on different database tables.

    It's less that it's the main reason that Drupal is slow, since sites with much less data have different performance issues, and more that it's one of the hardest issues to address.

  • 🇩🇪Germany mxh Offenburg

    @mxh cache tag lookups can be replaced with redis so they won't hit the database at all.

    Thanks for the reply and the tip, @catch. Using Redis might actually be an option to try out. My key point in #9 is that it doesn't make sense to replace a Porsche with a Ferrari to go faster when the speed limit is 50 on a one-lane bridge. I'm sorry that I might have brought in something off-topic here; I will move it into a separate issue in case there's more room for discussion on it.
