Migration to Acquia DAM is too slow to complete and fails with 502 from Widen

Created on 9 May 2025, 7 days ago

Problem/Motivation

Our site has over 27,500 assets synced from Acquia DAM. When running the migration from Media: Acquia DAM to Acquia DAM via Drush, each item takes about a second to queue for migration. If the process were able to complete without timing out, it would take over 7.5 hours.
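
For reference, the estimate above is simple sequential arithmetic. A minimal sketch, where `estimated_hours` is an illustrative helper and not part of either module:

```python
def estimated_hours(item_count: int, seconds_per_item: float) -> float:
    """Return the total processing time in hours for a sequential run."""
    return item_count * seconds_per_item / 3600


# 27,500 assets at roughly one second each.
print(round(estimated_hours(27_500, 1.0), 2))  # prints 7.64
```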

However, instead of completing, it fails with the following error after less than 500 items:

Server error: `GET https://api.widencollective.com/v2/assets/.../versions?expand=asset_properties,embeds,file_properties,metadata,metadata_info,metadata_vocabulary,security,thumbnails` resulted in a `502 Bad Gateway` response:
> Bad Gateway

Steps to reproduce

Attempt to migrate a site with a large number of assets.

Proposed resolution

Fix the server issue that is generating a 502 error (possible rate limit?)
Improve the performance of migrating items.
Queue the items for migration in batches.
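
To illustrate the batching idea, here is a rough, language-agnostic sketch in Python rather than actual module code. It splits the asset IDs into chunks and retries a chunk with exponential backoff when a transient server error occurs. Everything named here (`TransientServerError`, `chunked`, `queue_in_batches`, the `enqueue_batch` callback, the default chunk size of 100) is hypothetical, and it assumes the 502 is transient, which has not been confirmed.

```python
import time
from itertools import islice
from typing import Callable, Iterable, Iterator, List


class TransientServerError(Exception):
    """Hypothetical stand-in for a 5xx response such as 502 Bad Gateway."""


def chunked(ids: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` IDs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(ids)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def queue_in_batches(
    ids: Iterable[str],
    enqueue_batch: Callable[[List[str]], None],
    size: int = 100,
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Queue IDs chunk by chunk, retrying a chunk on transient errors.

    Returns the number of chunks queued. The last failure is re-raised
    once `max_attempts` is exhausted.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    queued = 0
    for batch in chunked(ids, size):
        for attempt in range(max_attempts):
            try:
                enqueue_batch(batch)
                break
            except TransientServerError:
                if attempt == max_attempts - 1:
                    raise
                # Wait 1x, 2x, 4x, ... base_delay before retrying.
                sleep(base_delay * 2 ** attempt)
        queued += 1
    return queued
```

With 27,500 IDs and a chunk size of 100, `enqueue_batch` would be called 275 times instead of once per item. Whether the DAM API offers an endpoint that can serve a whole chunk in one request is a separate question this sketch does not answer.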

Remaining tasks

User interface changes

API changes

Data model changes

🐛 Bug report
Status

Active

Version

2.1

Component

Code

Created by

🇺🇸United States byrond


Comments & Activities

  • Issue created by @byrond
  • 🇺🇸United States byrond

    It looks like it's making a separate API call for every media entity that needs to be migrated? Could we just patch the migration process to skip queueing the items and then run a full sync after the migration completes? It's my understanding that the new connector requires this sync to make sure the version_id and external_id are available: https://git.drupalcode.org/project/media_acquiadam/-/blob/2.x/src/Batch/...

  • 🇺🇸United States byrond

    The "Bad Gateway" response is happening while saving the entity, so skipping queueing is unlikely to change anything.

    >  [error]  TypeError: Drupal\acquia_dam\Client\AcquiaDamClient::getAsset(): Argument #2 ($version_id) must be of type string, null given, called in /mnt/www/html/dhsinternetode10/docroot/modules/contrib/acquia_dam/src/Plugin/media/Source/Asset.php on line 234 in Drupal\acquia_dam\Client\AcquiaDamClient->getAsset() (line 170 of /mnt/www/html/dhsinternetode10/docroot/modules/contrib/acquia_dam/src/Client/AcquiaDamClient.php)
    > #0 /mnt/www/html/dhsinternetode10/docroot/modules/contrib/acquia_dam/src/Plugin/media/Source/Asset.php(234): Drupal\acquia_dam\Client\AcquiaDamClient->getAsset()
    > #1 /mnt/www/html/dhsinternetode10/docroot/core/modules/media/src/Entity/Media.php(438): Drupal\acquia_dam\Plugin\media\Source\Asset->getMetadata()
    > #2 /mnt/www/html/dhsinternetode10/docroot/core/modules/media/src/MediaStorage.php(27): Drupal\media\Entity\Media->prepareSave()
    > #3 /mnt/www/html/dhsinternetode10/docroot/core/lib/Drupal/Core/Entity/EntityBase.php(354): Drupal\media\MediaStorage->save()
    > #4 /mnt/www/html/dhsinternetode10/docroot/modules/contrib/media_acquiadam/src/Batch/MediaTypeProcessBatch.php(475): Drupal\Core\Entity\EntityBase->save()
    > #5 /mnt/www/html/dhsinternetode10/docroot/modules/contrib/media_acquiadam/src/Batch/MediaTypeProcessBatch.php(74): Drupal\media_acquiadam\Batch\MediaTypeProcessBatch::updateMediaItems()
    
  • 🇺🇸United States byrond

    Initially, we were migrating some types using the "embed" method. When we changed them to "sync" (the only method available in the old module), we stopped getting the Bad Gateway errors.

    We did start seeing these errors related to the purger:

    >  [error]  Drupal\purge\Plugin\Purge\Invalidation\Exception\TypeUnsupportedException exception during file invalidation. File id: 232251 . File url: public://acquia_dam_thumbnails/0968372d-a84a-45a5-9a90-c6d3f5cfe4da/1a3ac635-a96f-48a6-87e3-ef0c6faab7ad.png. Error: wildcardurl 
    

    It's possible this is due to our testing on an ODE (development environment) where not all of the files exist.

  • 🇮🇳India chandu7929 Pune

    There's a separate issue for comment #3.

  • 🇺🇸United States byrond

    I suspect the version_id is being returned as null from getFinalizedVersion() because the API call is failing due to the Bad Gateway response.

    From the method:
    "Cannot get the version list from the API for asset of ID %asset_id. The error is: %message"

    Adding the patch you mentioned may allow the processing to continue, but I don't think it would really be successful.

    The big question here is: Why is a separate API call being made for every item that needs to be migrated, and can it be done more efficiently to allow customers with large numbers of assets to complete a migration successfully?

  • 🇮🇳India vipin.mittal18 Greater Noida

    Hello Byron,
    While Media Acquia DAM does not retain asset versions and external IDs, Acquia DAM does. As a result, a cron job has been scheduled to call the API, retrieve the version ID and external ID, and store them in the table.

    Could you please apply the solution suggested by Chandan and confirm if there are any errors preventing you from continuing with the migration?

  • 🇺🇸United States byrond

    Thanks. I have applied the update, refreshed the database from Prod, and started the migration. I was getting errors from Purge, so I disabled it during testing; I assume we can clear all caches manually after the process completes.

    So far, I haven't seen any errors. However, my SSH session timed out after 4 hours during the remote Drush command. The command is still running on the remote environment, as I see items being queued in the log. So far, it has queued about 8500 out of 27,500 items after 13 hours. I'm not sure if Acquia will kill that process after a certain amount of time, but at that rate, it will take over 39 hours just to queue the items for migration. I suspect it will take days to finish processing the queue.

    Is there any way to optimize this? The module documentation recommends against using both Media: Acquia DAM and Acquia DAM at the same time. However, with a migration this slow, it is inevitable that both will be active for a long period of time. What recommendations do you have for managing that? Disable the sync process for one or both modules? Anything else? What did you find in your pre-release testing with a large (but I would think fairly common) number of assets?
