A few months ago, on my project with 120 or so migrations (some with several hundred thousand rows to migrate), I was looking at speeding up the throughput by running migrations in parallel, respecting dependencies. I wrote a drush command (submitted at "Generate graphs of migration dependencies") to graph dependencies to help people work out what can run in parallel with what. Shortly after that, it occurred to me that we don't need people to figure that out. Thus was born:
    $ drush help mimp
    Run all migrations with maximum parallelism.

    Examples:
      migrate:import-parallel --max-processes=5 --status-frequency=5
          Runs up to 5 migrations in parallel as soon as their dependencies are fulfilled, with status of running migrations output every 5 minutes.

    Options:
      --max-processes[=MAX-PROCESSES]          Maximum number of processes to run at once.
      --status-frequency[=STATUS-FREQUENCY]    Number of minutes between status updates.

    Aliases: mimp
It's been working pretty well for us, so I'm (belatedly) posting it here.
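For anyone curious about the scheduling idea itself, here is a minimal Python sketch of "start each migration as soon as its dependencies are fulfilled, with at most N running at once". It is not the code of the command: the function name `run_parallel`, the `dependencies` mapping and the `run_one` callback are all made up for illustration, and threads stand in for the subprocesses that a real runner would spawn.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_parallel(dependencies, run_one, max_processes=5):
    """Run every migration, starting each one once its dependencies are done.

    dependencies: dict mapping a migration id to the ids it requires.
    run_one: hypothetical callback that imports one migration; a real
             runner would launch a drush subprocess here instead.
    Returns the migration ids in the order they finished.
    """
    remaining = {mid: set(deps) for mid, deps in dependencies.items()}
    unknown = set().union(*remaining.values()) - set(remaining)
    if unknown:
        raise ValueError(f"unknown dependencies: {sorted(unknown)}")

    finished = []
    running = {}  # future -> migration id
    with ThreadPoolExecutor(max_workers=max_processes) as pool:
        while remaining or running:
            done = set(finished)
            # Everything whose requirements have all finished may start now.
            ready = sorted(m for m, deps in remaining.items() if deps <= done)
            for mid in ready:
                if len(running) >= max_processes:
                    break
                del remaining[mid]
                running[pool.submit(run_one, mid)] = mid
            if not running:
                # Nothing is running and nothing can start: a dependency cycle.
                raise ValueError(f"circular dependencies: {sorted(remaining)}")
            # Block until at least one running migration completes.
            completed, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in completed:
                future.result()  # re-raise any failure from run_one
                finished.append(running.pop(future))
    return finished


if __name__ == "__main__":
    # Made-up migration ids, purely for illustration.
    example = {
        "users": [],
        "files": [],
        "articles": ["users", "files"],
        "comments": ["articles"],
    }
    print(run_parallel(example, lambda mid: None, max_processes=2))
```

In this example "users" and "files" can run side by side, "articles" waits for both of them, and "comments" waits for "articles". The sketch stops at the first failure; a real runner would more likely want to skip only the dependents of a failed migration and carry on with the rest.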
Ideally, I think rather than a separate command it should be an optional feature of drush migrate-import: if --max-processes is present and greater than one, spawn off the subprocesses, passing on any other options to the subprocesses (adding --skip-progress-bar, because it'll be really messy otherwise).
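As a rough illustration of the option passing described above, a parent process could build each child's command line along these lines. This is a sketch under assumptions, not a patch: the helper `child_command` is hypothetical, and dropping --status-frequency along with --max-processes is my guess at what should stay with the parent.

```python
def child_command(migration_id, options):
    """Build the argument list for importing one migration in a subprocess.

    options: dict of option name -> value as given to the parent command,
             with True meaning a bare flag.
    """
    # Assumption: these two only make sense for the parent process.
    parent_only = {"max-processes", "status-frequency"}
    argv = ["drush", "migrate:import", migration_id]
    for name, value in options.items():
        if name in parent_only or name == "skip-progress-bar":
            continue
        argv.append(f"--{name}" if value is True else f"--{name}={value}")
    # Always added once, so parallel children do not fight over the terminal.
    argv.append("--skip-progress-bar")
    return argv
```

The resulting list can be handed to something like `subprocess.Popen` as-is, which avoids any shell quoting issues with option values.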
I don't have time now to write such a patch (maybe April? - anyone who wants to tackle it before then is welcome to), but in the meantime some people might find this useful.