Fix SqlBase::initializeIterator()'s cross-database joining when using particular DB/table names

Created on 9 August 2021, almost 3 years ago
Updated 31 December 2023, 5 months ago

Problem/Motivation

TL;DR The problem occurs when the mapping table name has either a hyphen in the name or a name that has only digits or a name that has a digit at the start and an "e".

@joachim noticed this problem 2.5 years ago in #2986452-79: Database reserved keywords need to be quoted as per the ANSI standard β†’ .

He explained it in more detail in subsequent comments.

It was suggested in #2986452-83: Database reserved keywords need to be quoted as per the ANSI standard β†’ that the problem was the hyphen in the database name, for which #2767595: Support "-" in database table names β†’ existed. But that's not the case. The problem is simply that the necessary quoting of identifiers is missing.

I traced the problem further over at #3218978-11: MySQL driver allows settings.php to remove ANSI_QUOTES from sql_mode, but doesn't work when it is β†’ , quoting in full here:

Seems like this problem is even worse when database or tables names consists solely of digits:

Certain objects within MySQL, including database, table, index, column, alias, view, stored procedure, partition, tablespace, resource group and other object names are known as identifiers. This section describes the permissible syntax for identifiers in MySQL.

[…]

Identifiers may begin with a digit but unless quoted may not consist solely of digits.

[…]

The identifier quote character is the backtick (`):

[…]

If the ANSI_QUOTES SQL mode is enabled, it is also permissible to quote identifiers within double quotation marks:

β€” https://dev.mysql.com/doc/refman/8.0/en/identifiers.html

IOW: On MySQL, you should always quote identifiers, either using " when ANSI_QUOTES is on, otherwise using `.

That makes this at least major IMHO, bordering on critical. Any site that turns off ANSI_QUOTES which uses a fully numerical database name will get a SQL error, and fail to work.

Worse, counter to the docs, I can confirm that even database names that just start with a digit reproduce this behavior:

  1. SELECT * FROM d8.users u INNER JOIN 101.foo f ON f.id = u.uid
    

    β‡’ You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near '101.foo f ON f.id = u.uid' at line 1

  2. SELECT * FROM d8.users u INNER JOIN 73829e0c4090444f85fbae2e92f8925a.foo f ON f.id = u.uid
    

    β‡’ You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near '73829e0c4090444f85fbae2e92f8925a.foo f ON f.id = u.uid' at line 1

  3. Turns out this is because there's an e (the letter "e") in the database name, which makes MySQL interpret it as a number in exponential notation! πŸ€ͺ🀯🀯🀯🀯🀯🀯🀯🀯🀯🀯 Check yourself:
    SELECT * FROM d8.users u INNER JOIN 1e0.foo f ON f.id = u.uid
    

    β‡’ You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near '1e0.foo f ON f.id = u.uid' at line 1
    … yet 1a0 works fine!

To reproduce:

  1. create three database names above
  2. create an empty foo table like so in each:
    CREATE TABLE `foo` (
      `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
      PRIMARY KEY (`id`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
    

⚠️ Databases named using only digits (unlikely) named with a hash (kinda likely) that happen to contain the letter "e" (unlikely) will fail hard! ⚠️

So this problem is not the hyphen, the problem is much simpler:

  1. all tests in Drupal core only use simple database names that do not use database or table names that need escaping in the first place
  2. the necessary quoting of "identifiers" is obviously missing

This is very clear in @joachim's #2986452-86: Database reserved keywords need to be quoted as per the ANSI standard β†’ , where he cites this:

SELECT m.id AS id, m.title AS title, m.changed AS changed, map.sourceid1 AS migrate_map_sourceid1, map.source_row_status AS migrate_map_source_row_status
FROM
{high_water_node} m
LEFT OUTER JOIN drupalaccounts8.test55734627migrate_map_high_water_test map ON id = map.sourceid1
WHERE (map.sourceid1 IS NULL) OR (map.source_row_status = :db_condition_placeholder_0) OR (changed > :db_condition_placeholder_1)
GROUP BY id, title, changed, map.sourceid1, map.source_row_status
ORDER BY changed ASC

(before that patch)
and

SELECT "m"."id" AS "id", "m"."title" AS "title", "m"."changed" AS "changed", "map"."sourceid1" AS "migrate_map_sourceid1", "map"."source_row_status" AS "migrate_map_source_row_status"
FROM
{high_water_node} "m"
LEFT OUTER JOIN drupalaccounts8.test77520843migrate_map_high_water_test "map" ON id = map.sourceid1
WHERE ("map"."sourceid1" IS NULL) OR ("map"."source_row_status" = :db_condition_placeholder_0) OR ("changed" > :db_condition_placeholder_1)
GROUP BY id, title, changed, map.sourceid1, map.source_row_status
ORDER BY "changed" ASC

A proper follow-up was never filed. This is then that follow-up.

Any migration that stores the mapping table in a database with either a hyphen in the name or a name that has only digits or a name that has a digit at the start and an "e" somewhere will encounter this problem, both before and after #2986452. But since #2986452, the ANSI mode being turned on causes it to fail 100% of the time.

Steps to reproduce

See above. Easier to see:

Do the following on a new DB called d9-destination (dash: not allowed):

-- Simulate users 0 and 1 have been created as part of the default Drupal 9 installation.
CREATE TABLE `users` (
  `uid` int(10) unsigned NOT NULL,
  `uuid` varchar(128) CHARACTER SET ascii NOT NULL,
  `langcode` varchar(12) CHARACTER SET ascii NOT NULL,
  PRIMARY KEY (`uid`),
  UNIQUE KEY `user_field__uuid__value` (`uuid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='The base table for user entities.';
INSERT INTO `users` (`uid`, `uuid`, `langcode`)
VALUES
	(0,'f214b7ad-94fb-4202-bd68-a6d4d1119c27','en'),
	(1,'a500cd13-6aac-485d-8454-14e035b4b61e','en');

-- Simulate the migration system being active; but no migrations having run yet.
CREATE TABLE `migrate_map_d7_users` (
  `source_ids_hash` varchar(64) NOT NULL COMMENT 'Hash of source ids. Used as primary key',
  `sourceid1` int(11) NOT NULL,
  `destid1` int(10) unsigned DEFAULT NULL,
  `source_row_status` tinyint(3) unsigned NOT NULL DEFAULT 0 COMMENT 'Indicates current status of the source row',
  `rollback_action` tinyint(3) unsigned NOT NULL DEFAULT 0 COMMENT 'Flag indicating what to do for this item on rollback',
  `last_imported` int(10) unsigned NOT NULL DEFAULT 0 COMMENT 'UNIX timestamp of the last time this row was imported',
  `hash` varchar(64) DEFAULT NULL COMMENT 'Hash of source row data, for detecting changes',
  PRIMARY KEY (`source_ids_hash`),
  KEY `source` (`sourceid1`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='Mappings from source identifier value(s) to destination…';

-- Match \Drupal\Core\Database\Driver\mysql\Connection::open()
SET SQL_MODE='ANSI,TRADITIONAL';

-- Use the DB query Drupal 9 HEAD generates (but simplified slightly, to focus on the problem at hand).
SELECT "m"."uid" AS "id", "map"."sourceid1" AS "migrate_map_sourceid1", "map"."source_row_status" AS "migrate_map_source_row_status"
FROM
users "m"
LEFT OUTER JOIN d9-destination.migrate_map_d7_users "map" ON m.uid = map.sourceid1
WHERE ("map"."sourceid1" IS NULL) OR ("map"."source_row_status" = 1);

-- Result: ❌ You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near '-destination.migrate_map_d7_users "map" ON m.uid = map.sourceid1
WHERE ("map"."s' at line 4

-- Use the DB query that is generated with the patch in #3 applied
SELECT "m"."uid" AS "id", "map"."sourceid1" AS "migrate_map_sourceid1", "map"."source_row_status" AS "migrate_map_source_row_status"
FROM
users "m"
LEFT OUTER JOIN "d9-destination"."migrate_map_d7_users" "map" ON "m"."uid" = "map"."sourceid1"
WHERE ("map"."sourceid1" IS NULL) OR ("map"."source_row_status" = 1)

-- Result: `1 NULL NULL` and `0 NULL NULL`, which is expected βœ…

Rename the database from d9-destination to 1a1 and update the queries. Both queries will now work.

Rename the database from 1a1 to 1e1 and update the queries. Now once again only the second query will work.

Proposed resolution

Ensure cross-database joins are correctly escaped.

.

Remaining tasks

Review

User interface changes

None.

API changes

TBD

Data model changes

None.

Release notes snippet

TBD

πŸ› Bug report
Status

Needs work

Version

11.0 πŸ”₯

Component
MigrationΒ  β†’

Last updated about 2 hours ago

Created by

πŸ‡§πŸ‡ͺBelgium Wim Leers Ghent πŸ‡§πŸ‡ͺπŸ‡ͺπŸ‡Ί

Live updates comments and jobs are added and updated live.
  • Needs tests

    The change is currently missing an automated test that fails when run with the original code, and succeeds when the bug has been fixed.

Sign in to follow issues

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • πŸ‡¬πŸ‡§United Kingdom joachim

    I tried applying the latest patch to 10.0 and it doesn't fix the problem -- running a migration from D7 in the UI shows lots of:

    > Migration failed with source plugin exception: SQLSTATE[42S02]: Base table or view not found: 1146 Table 'drupalfirecore10.migrate_map_d7_action' doesn't exist: SELECT "a".*, "map"."sourceid1" AS "migrate_map_sourceid1", "map"."source_row_status" AS "migrate_map_source_row_status" FROM "actions" "a" LEFT OUTER JOIN drupalfirecore10.migrate_map_d7_action "map" ON aid = map.sourceid1 WHERE ("map"."sourceid1" IS NULL) OR ("map"."source_row_status" = :db_condition_placeholder_0); Array ( [:db_condition_placeholder_0] => 1 ) in /Users/joachim/Sites/firecore/web/core/modules/mysql/src/Driver/Database/mysql/ExceptionHandler.php line 46

    My DB is called drupal-firecore-10, not drupalfirecore10!

  • πŸ‡¬πŸ‡§United Kingdom joachim

    I've received this suggestion as a fix:

      public function escapeTable($table) {
        if (!isset($this->escapedTables[$table])) {
          $database = $this->getConnectionOptions()['database'];
          if (str_starts_with($table, "$database.")) {
            $to_escape = substr($table, strlen($database) + 1);
            $prefix = $this->quoteIdentifiers("[$database].");
          }
          else {
            $to_escape = $table;
            $prefix = '';
          }
          $this->escapedTables[$table] = $prefix . preg_replace('/[^A-Za-z0-9_.]+/', '', $to_escape);
        }
        return $this->escapedTables[$table];
      }
    
Production build 0.69.0 2024