SQL Server Identity error when updating to Drupal 10.1.0

Created on 29 June 2023, about 1 year ago
Updated 2 January 2024, 6 months ago

Problem/Motivation

When upgrading from Drupal 10.0.8 to 10.1.0, the following error occurred during the Drupal database update:

dblog module
Update #10101
Failed: Drupal\Core\Database\DatabaseExceptionWrapper: SQLSTATE[42000]: [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Multiple identity columns specified for table 'watchdog'. Only one identity column per table is allowed.: ALTER TABLE [watchdog] ADD [wid] bigint IDENTITY; Array ( ) in dblog_update_10101() (line 135 of D:\inetpub\wwwroot\drupal\core\modules\dblog\dblog.install).

The fix in issue 3081144 πŸ› Database primary keys can exceed maximum integer storeable (has actually occurred, in watchdog) Fixed (included in 10.1.0) changes the [watchdog] table [wid] column from int to bigint. I traced that to the changeField function in Schema.php, which renames the column to [wid_old] and tries to create a new column [wid] as bigint, but fails because SQL Server does not allow two identity columns in a table.

It looks like issue 3263493 πŸ› Upgrading to Drupal 9.3.x changes users table and locks users out of the site Needs work may have the same underlying problem and already has a merge request with some changes. Later I'll try to see if that could fix this issue.

Steps to reproduce

Upgrade from Drupal 10.0.x to 10.1.0 running on sqlsrv driver. I am using sqlsrv 4.4.0-beta1 but I don't think the function has ever been designed to handle this situation so the driver version probably doesn't matter.

Proposed resolution

Change function changeField in sqlsrv\src\Driver\Database\sqlsrv\Schema.php to properly handle identity columns. The obvious solution would be to drop IDENTITY from the old field, but it appears that SQL Server does not allow that either, saying the table must be re-created. It did allow me to manually change the field from int to bigint without affecting the IDENTITY by running the following SQL, so the solution may be to change the field type without renaming and copying:

ALTER TABLE [dbo].[watchdog]
ALTER COLUMN [wid_old] bigint NOT NULL

Remaining tasks

Find out if this needs the same fix as issue 3263493.

If needed, write patch.

User interface changes

API changes

Data model changes

πŸ› Bug report
Status

Closed: duplicate

Version

4.4

Component

Code

Created by

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Comments & Activities

  • Issue created by @JerryACSA
  • I tried the Schema.php file from MR 79 in issue 3263493 with the array keyword removed from createTableSql as has been done in a more recent update. The only other changes from 4.4.0-beta1 are related to that fix. I ran the database update again and got a different error. It looks like createPrimaryKey tries to insert self::COMPUTED_PK_COLUMN_NAME between the empty parentheses SQL Server complained about but for whatever reason that's empty. I'll have to look at it after the July 4 holiday.

    dblog module
    Update #10101
    Failed: Drupal\Core\Database\DatabaseExceptionWrapper: SQLSTATE[42000]: [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Incorrect syntax near ')'.: ALTER TABLE [watchdog] ADD CONSTRAINT [watchdog_pkey] PRIMARY KEY CLUSTERED (); Array ( ) in dblog_update_10101() (line 135 of D:\inetpub\wwwroot\drupal\core\modules\dblog\dblog.install).
  • πŸ‡¬πŸ‡§United Kingdom pstewart

    Thanks for reporting this Jerry. I think you're right to link this to #3263493 πŸ› Upgrading to Drupal 9.3.x changes users table and locks users out of the site Needs work ; in that issue the problem arises from a change to the users table where the uid column was changed from a plain integer column to an autoincrementing serial column, while in this one the change is alteration of the type of an existing column, but in both cases the root cause is the same: changeField conducts the change by creating a new field with the new spec, copying the data over from the old field to the new field, and then dropping the old field. Whether we're coming from an existing identity field or a non-identity field we get the same problem: to create a new identity field we need to recreate the table.

    In the watchdog case where we're changing the type of the field another option is possible: with SQL Server 2016 or newer:

    ALTER TABLE watchdog ALTER COLUMN wid bigint WITH(ONLINE=ON);
    

    There are downsides to this though: it can potentially be quite slow, and doesn't solve the index problem either (so doesn't help in most use cases where the identity column is also the PK).

    On the other hand, I can't see the table recreation method being any faster. There's also potential for data loss with a recreated table unless we can guarantee no inserts can happen on the original table between us copying the data over and renaming the old table to take over from the new one. There is a strategy to change a column from int to bigint without downtime where we use triggers to ensure we're catching changes to the original table while we're copying data over.

    I will try and spend some time on this later this week and see what changes we need to make to the table recreation approach in MR79.

  • Issue was unassigned.
  • In these cases of running update scripts after an update, no one else should be accessing the database so any of these methods should be fine and I don't think we need to worry about concurrent access or data loss. However, it's a driver for generic database commands and I don't know if other functions or modules might change an identity field type while other users are actively using the table. It doesn't seem very safe to do that, so maybe it isn't done.

    I'll see what you come up with regarding MR79.

    (Corrected my upgrade/update terminology to be in line with documentation)

  • In these cases of running update scripts after an update, no one else should be accessing the database so any of these methods should be fine and I don't think we need to worry about concurrent access or data loss.

    I disagree. When using a CI/CD script to deploy, maintenance mode is not necessarily used. The site is up continuously during deployment.

  • Also, is there a workaround if I want to upgrade before a fix is committed?

  • Yes, I used a workaround. During the update to 10.1 or higher and before running update.php, manually convert the field using the following SQL (in SQL Server Management Studio or another option):

    ALTER TABLE watchdog ALTER COLUMN wid bigint NOT NULL
    

    You can optionally add WITH(ONLINE=ON) to the end on SQL Server 2016 or newer if you want to use that option. I didn't use it.

    Then, prevent the update function in core\modules\dblog\dblog.install from attempting the change, while allowing it to be marked as complete. Right after the following line (125 in my current copy),

    function dblog_update_10101(&$sandbox = NULL) {
    

    I added:

    return;
    

    Then run update.php and it should be successful. You have manually completed the changes just as if the driver could handle it. No future consideration needs to be given as a result of using the workaround other than the possibility that a change to an identity field in another table could cause the same type of issue if this hasn't been fixed yet.

    I'm not familiar with 9.5 but unless it has the watchdog table upgraded to a bigint type wid column, the issue will appear on an upgrade to 10.1. It's not specifically related to 10.0. I don't see any mention of backporting the 3081144 issue to Drupal 9. If it was, this issue would probably occur during any update that included it on SQL Server. A clean install of 10.1 or higher should work fine since it doesn't need to change an existing table.

  • Is there any way I can help get this made programmatic and automated using an update function? I'm assuming that's what we need to do to fix this issue for everybody. Here are some ideas:

    ALTER TABLE watchdog ALTER COLUMN wid bigint NOT NULL

    This can be performed using a programmatic SQL query.

    Then, prevent the update function in core\modules\dblog\dblog.install from attempting the change, while allowing it to be marked as complete.

    If you look at the dblog_update_10101 function, it calls $schema->changeField(), which will call the changeField function in this module.

    From what I see looking at this function, it seems that this is the cause of the issue in the first place, as it says this

         * IMPORTANT NOTE: To maintain database portability, you have to explicitly
         * recreate all indices and primary keys that are using the changed field.
    

    However, if we have already updated the field, then as you said, it's not necessary. Is there a way to check the value of $spec in this function, and verify that it's actually changing the field and that the spec isn't the same as what was there before?

    Or is there a way to modify this function so that altering a primary key doesn't have any issues like this in the first place?

  • I don't think a manual workaround will be feasible for me if I want to get to Drupal 10.1 in production, because my deploy process is automated, so updates will be run automatically. I need to have a programmatic hook_update_N type of solution.

  • πŸ‡ΊπŸ‡ΈUnited States KenKodz

    When I submitted the changes to create a new column, migrate the data, then update the indexes, I did it with the idea that this is going to keep happening. Now, I am not 100% on my code changes working for every scenario, but it worked for me when updating to 9.3.x. Whatever is done here needs to work for any DB column changes. I'd be happy to help if given some direction on how the code I wrote for 9.3.x needs to be modified to get this working.

  • I agree that column changes are likely to continue happening in the future. However, I don't know enough about SQL from a database management standpoint to know if it's always feasible to use ALTER statements (the solution/workaround suggested in comment #3), or if that should be done for just certain cases.

  • From what I understand so far, the problem with the way it currently works is with the primary key. Adding another primary key is what causes the problem. Would it be possible to add the new column with it not being a primary key, and then once data is copied over, switch which column is the primary key and remove the old column?

  • Also, thanks for your willingness to help. As far as I know, this issue is the only thing preventing me from converting sites to Drupal 10 right now.

  • It's possible to drop and add a primary key on an existing table, but the IDENTITY status (automatic numbering) of a column cannot be changed and only one is allowed per table so switching columns is a problem (although there is a more complex way to change IDENTITY status using ALTER TABLE...SWITCH described here: https://stackoverflow.com/questions/1049210/adding-an-identity-to-an-exi... ). IDENTITY can be changed when copying to a new table, but in this case the existing logic that creates a new column, deletes the old one and renames, would be overkill if copying to a new table and and prevent it from working.

    KenKods' solution (MR 79 mentioned above) should work for this issue too, which creates a new table using the new data type for the column (bigint in this case) and no temporary columns, turns on SET IDENTITY_INSERT, and converts from int to bigint while copying all the data from the old table to the new one. Then it adds indexes and renames the tables without making any further changes to the columns. For some reason I got the error in comment #2 when I tried to use it.

    KenKods, does the error in #2 mean anything to you? I don't know whether self::COMPUTED_PK_COLUMN_NAME shouldn't be blank, or if it should but isn't handled properly. That code isn't part of this change but didn't like something when trying to run this update. I used Schema.php from MR 79 with a few changes to get it up-to-date with the sqlsrv driver since the MR was written, and attached it here. It should be current as of 4.4.0-beta2. I don't have tools to use or create patch files so I've patched files manually. Thanks for looking at this and for your MR!

    Anyone, since the changeField function starts a transaction before it makes changes, does that address the concerns in #3 and #5 about data loss if other changes are being made during the update?

  • A transaction would not resolve the concerns, because that's mostly for if something goes wrong, then the progress can be reverted.

    Comment #3 has a link to a solution for avoiding downtime. Without something like that, it's possible that data gets lost during the move to a new table if an INSERT into the old table happens, unless you lock the old table, then maybe it would be fine? I'm not sure.

    These aren't exactly the same thing, but they show locking a table to prevent INSERTs, which is what you'd need to do to the old table while the new table is being created and populated with data:

    https://stackoverflow.com/questions/30857355/locking-a-sql-server-table-...
    https://dba.stackexchange.com/questions/297544/how-to-lock-table-in-sql-...

  • πŸ‡¬πŸ‡§United Kingdom pstewart

    Thinking out loud here: what happens to a query blocked by a table lock when we eventually drop the locked table in favour of the replacement table?

    (Also apologies for being AWOL - I've had to pivot to other projects and have had very little time to spend on Drupal for the last few months, I'll hopefully be able get some more Drupal time in before the end of the year as I still need to finish my D10 upgrade!)

  • Hmm. I think it would still work, because the new table will have the same name? Maybe we need a lock on the new table as well, prior to dropping the old one and renaming the new? However, I am not a person who knows enough to give any more than a guess here. This answer suggests that having the entire drop and rename operation within a transaction would work, as it would get committed all at once.

    An aside, since ALTER TABLE watchdog ALTER COLUMN wid bigint NOT NULL has been shown to work, is it possible to detect if ALTER can be used for all the necessary changes, and use that if possible? Then, recreating a table won't need to happen as often.

  • In regards to #7 πŸ› SQL Server Identity error when updating to Drupal 10.1.0 RTBC ,

    I don't think you have to modify the code during the workaround if you can avoid running programmatically. The function to avoid running is dblog_update_10101. So we just need to run the prior dblog hook_update_N (dblog_update_10100), and then set the schema to 10101 as if both were run.

    To see what I'm talking about, run

    drush sqlq "SELECT name, value FROM key_value WHERE collection = 'system.schema';"
    

    This should be a complete workaround from CLI:

    First, use composer to update to Drupal 10.1. Then, DO NOT RUN the database updates right away. NOTE: This code has not been fully tested. Take a backup first and run at your own risk.

    Run this:

    # Check the current dblog schema version.
    drush eval "\$update_registry = \Drupal::service('update.update_hook_registry'); \$version = \$update_registry->getInstalledVersion('
    dblog'); if (\$version < 8600 || \$version >= 10100) { die('Schema for dblog must be > 8600 and < 10100 to run this code.'); }" && \
    # Run the first dblog update.
    drush eval "dblog_update_10100(NULL);" && \
    # Run the replacement for dblog_update_10101.
    drush sqlq "ALTER TABLE watchdog ALTER COLUMN wid bigint NOT NULL" && \
    # Set the schema to skip dblog_update_10101.
    drush eval "\$update_registry = \Drupal::service('update.update_hook_registry'); \$update_registry->setInstalledVersion('dblog', 10101);"
    

    (The dollar signs must be escaped in the commands using backslashes to avoid Bash trying to put its variables into the double-quoted string.)

    Then, you can run the database updates with drush updb.

    More info about UpdateHookRegistry:

  • As a new AHA moment for me, I'm only using this module to access another external database, not for Drupal's DB. So it's not actually hindering me from going to Drupal 10. It was quite confusing for me with the ALTER statement never succeeding.

  • πŸ‡¬πŸ‡§United Kingdom pstewart

    I've rebased MR79 to 4.4.x, and opened MR87. I'm not really sure how to link that up to the drupal.org issues, but will be working on it in the corresponding rebased issue branch.

  • Status changed to Needs review 7 months ago
  • πŸ‡¬πŸ‡§United Kingdom pstewart

    After I rebased to 4.4.x I adopted the following testing regime:

    1. Composer install fresh instance of D10.0.x and complete site installation process to a new database. Confirm 10.0.x version of schema in use (i.e. dbo.watchdog.wid has type int).
    2. Update composer.json to target 10.1.x, perform composer update
    3. Perform drush updatedb, ensure all updates complete successfully.
    4. Verify 10.1.x version of schema is now in use (i.e. dbo.watchdog.wid now has type bigint)

    I did my initial test of MR87 after the 4.4.x rebase last week and it worked without issue. I've made a small change to rename the originial table at the start of the database transaction to ensure we get the schema lock on the table before doing the data copy, which I think should be enough to prevent any problems from other site activity making changes on the table in parallel with the update. I've tested that change this morning and it's worked fine: it's behaving exactly as expected (although worth noting that in my testing regime I don't have any other traffic going on when I perform drush updatedb)

    In more general terms I'm happy that the existing work the MR does to recreate the table, copy the data, and recreate the PK and indices is working as expected in this scenario - the code looks good! The attached file shows the before and after of the watchdog table, and we can see that the wid column is recreated as the last column in the table with the bigint type, and the key and indices all match up. Not shown is the data, but I checked that too - the recreated rows in the table
    all look OK.

    If someone with a more real-world setup can give it a try that would be great - if we can get at least one RTBC vote I'm happy to commit this and make a 4.4.x-rc1 release.

  • Updating IS to use formatted links to issues.

  • Status changed to RTBC 7 months ago
  • πŸ‡ΊπŸ‡ΈUnited States glassb

    I was able to apply MR87 to one of our test sites and everything seems to be working as expected. I can confirm the wid column was successfully changed from int to bigint in the database and all of the data seems to have been copied over correctly.

    I also tested how the table lock would handle other traffic during the update by:

    1. Adding a 5 minute delay after the schema lock with $this->connection->queryDirect("WAITFOR DELAY '00:05:00'");.
    2. Running drush updatedb.
    3. Performing operations on the site that would generate logs.
    4. Running a select query in SSMS.
    5. Checking resulting logs after updates were complete.

    I found that while the database updates were running, the site would continually load on any requests that would generate logs. Afterwards, all of the logs generated during that update were shown in the new table. I was also unable to query the table during the updates. This suggests to me that the table lock is functioning as expected and preventing any inserts until after the update is complete.

    Everything seems to be functioning correctly so I believe this is ready to be marked RTBC. Hopefully we can get a release candidate out soon. Thanks to everyone who worked on this issue!

  • Status changed to Needs work 7 months ago
  • I reviewed MR87 and pointed out some things that should be fixed first.

  • Actually, should we then close this issue as a duplicate of πŸ› Upgrading to Drupal 9.3.x changes users table and locks users out of the site Needs work ?

  • Status changed to RTBC 7 months ago
  • πŸ‡¬πŸ‡§United Kingdom pstewart

    @glassb Thank you for doing this testing, it confirms we're getting exactly the behaviour we're aiming for (competing processes blocked during the update but then able to complete there inserts on the recreated table as soon as the transaction is done).

    @solideogloria Thanks for the code review, I've now done the corresponding fixes and run phpcs on the file and picked up a few others too (the only two coding standards violations left are on comments for getPrefixInfo which I introduced on an unrelated issue, so I'll do a separate commit to fix those after MR87 is merged).

    Given that the MR is associated with πŸ› Upgrading to Drupal 9.3.x changes users table and locks users out of the site Needs work I think it does make sense to commit the merge there and mark this issue as duplicate when we close it. I'll make sure everyone who has contributed on this issue is credited on 3263493 when we commit.

    The lint fixes I've committed are trivial with respect to code changes (purely spacing and comment changes), so I'm happy to merge and move forwards with making a 4.4.0-rc1 release without any further testing. I don't have time to do that today so will pick that up in the week).

  • πŸ‡¬πŸ‡§United Kingdom pstewart

    Noting that MR87 is now committed on πŸ› Upgrading to Drupal 9.3.x changes users table and locks users out of the site Needs work , so this issue will be considered as resolved in the rc1 release.

  • Status changed to Closed: duplicate 6 months ago
  • πŸ‡¬πŸ‡§United Kingdom pstewart

    πŸ› Upgrading to Drupal 9.3.x changes users table and locks users out of the site Needs work now fixed and released in 4.4.0.

Production build 0.69.0 2024