- Issue created by @monaw
- 🇳🇱Netherlands megachriz
- Active checkbox: you can configure feed types to import sources regularly. This is called "Periodic import" (see image). Only active feeds are used for periodic import. So when you deactivate a feed, it will no longer be imported regularly, only when you click "Import" or "Import in background".
- When an import for a feed starts, the feed gets locked. This prevents another import from running for it. If two imports for the same feed ran at the same time, that could cause issues. For example, previously imported items could get deleted that should not be deleted (if you have configured Feeds to delete previously imported items that are no longer in the feed). You would unlock the feed if you believe the import got stuck. You can then restart the import. When unlocking, Feeds cleans up the metadata for the import that did not finish.
- If you want to start the import over, first unlock the feed. Then you can delete all imported items.
- "Delete items" only deletes items that were created or updated with this feed. It doesn't delete items from other feeds, unless those items were also created or updated with this feed. It is technically possible to configure feed types so that two feeds update the same content. Say feed 1 creates items A, B and C and feed 2 creates items D and E and updates item C. Then "Delete items" on feed 1 would delete items A, B and C and "Delete items" on feed 2 would delete items C, D and E. So in this case they would both delete item C, because they both "touched" item C.
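The overlap in the last point can be modeled with plain sets. This is only an illustrative sketch of the behavior described above, not how Feeds stores this internally; the `touched` mapping and the `delete_items` helper are hypothetical names.

```python
# Hypothetical model: the items each feed has created or updated.
touched = {
    "feed 1": {"A", "B", "C"},  # feed 1 creates A, B and C
    "feed 2": {"C", "D", "E"},  # feed 2 creates D and E and updates C
}


def delete_items(feed: str) -> set[str]:
    """Return the items that "Delete items" on this feed would remove."""
    return set(touched[feed])


print(sorted(delete_items("feed 1")))  # ['A', 'B', 'C']
print(sorted(delete_items("feed 2")))  # ['C', 'D', 'E']

# Item C shows up in both results because both feeds touched it.
print(sorted(delete_items("feed 1") & delete_items("feed 2")))  # ['C']
```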
A tip for importing large files: I think it is better to import these in the background (using the "Import in background" button). The import then runs in chunks during cron runs. The chance that the import hangs is smaller this way, because it doesn't depend on the browser being kept open. It can still hang or get stuck, however, for example when a fatal PHP error occurs during the process or when the server shuts down. Or perhaps when running module updates (because that could cause module files to be removed temporarily, which could cause fatal PHP errors too).
Import in background does require cron to be configured. Per cron run, the import process runs for about a minute. So I can imagine 324000 lines would take quite a large number of cron runs too.
I hope this answers your questions. Feel free to reopen this issue if you have more questions. :)
- 🇳🇱Netherlands megachriz
Ah, I see you just updated the issue summary. Feel free to add to or update the documentation. :)
- 🇳🇱Netherlands megachriz
The "Delete" button on a feed deletes the feed itself, not the imported items.
thank you @megachriz for your helpful info! few more questions:
- what do you recommend as the best way to import 324,000 rows of CSV data?
- why does it hang during import and delete? i'm monitoring the system and nothing else is happening and nobody is logged in so why would it hang?
- if the import hangs, does unlocking the feed and then restarting start from the beginning again or from where it left off?
- if i configure the feed to import in the background and my cron is set to every hour, that should work right? i guess i'm afraid it will still hang...and if that happens, how can i kill the cron import?
- 🇳🇱Netherlands megachriz
- I don't have experience importing a CSV file with that many lines, but I would choose to import in background and let the import be done in chunks using cron.
- I don't know. There could be a bug (either in Feeds or another module) causing the import to stop. If this is the case, you should be able to find something about it in the server logs (the error may not be logged by Drupal). Another possibility is that the server thinks "This process runs for way too long, I'm going to stop it". Or a lack of memory. These are probably reported in the server logs too.
- If you unlock the feed and then restart the import, the import will start from the beginning. If you have configured a CSV column as unique, Feeds would skip items it has already imported, but it will still go through each item in the file in order to check that. Say the import hangs at 2000 items and you restart the import. Then for the first 2000 items of the CSV file, Feeds will check whether they are already imported and see that this is the case (and not import them again). But just doing these checks can also take a long time.
- Yes, running cron once an hour would work, but I would try to run cron more often. If, for example, Feeds managed to import 500 items per cron run, it would take 648 cron runs, so 648 hours at one run per hour, to import all 324000 items. That's almost a month. If you want to stop the cron import, you would unlock the feed.
- An import running in the UI (where you see a progress bar) stops shortly after you close the browser (it only finishes the chunk it was busy with). An import running on cron does not depend on the browser. You can restart the import by unlocking the feed and then starting the import again.
- Yes, an import running on cron does not depend on the browser.
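The duration estimate in the cron answer above is simple arithmetic. A minimal sketch, assuming a constant number of items per cron run (the 500 is the hypothetical figure from the answer, not a measured value); the helper name `estimate_import_time` is made up for illustration:

```python
import math


def estimate_import_time(total_items: int, items_per_run: int,
                         minutes_between_runs: float) -> tuple[int, float]:
    """Return (cron runs needed, rough duration in hours).

    Assumes every cron run imports the same number of items,
    which is a simplification.
    """
    runs = math.ceil(total_items / items_per_run)
    hours = runs * minutes_between_runs / 60
    return runs, hours


# Cron once an hour, 500 items per run (hypothetical throughput).
runs, hours = estimate_import_time(324_000, 500, 60)
print(runs, hours, hours / 24)  # 648 648.0 27.0

# Same throughput, but cron every 5 minutes.
runs, hours = estimate_import_time(324_000, 500, 5)
print(runs, hours)  # 648 54.0
```

Under the same assumed throughput, running cron every 5 minutes instead of every hour brings the estimate down from 27 days to a little over 2 days.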
Since I don't know what makes the import hang, I cannot guarantee that the import of all 324000 items will be successful when imported during cron. If it happens to hang, I would first check the server logs to see if there's any information about what made it hang. Then I would check how many items were already imported, remove that many items from the CSV file and try again.
You can tell that the import hangs if the number of imported items stays the same after a cron run. By installing the Queue UI module, you can inspect/monitor the import tasks that are scheduled to run. If the same task is retried over and over again, then the import hangs. (One day I hope to add functionality to Feeds that would detect that the same task is retried over and over again, so that it can warn the user that something went wrong during the import, or maybe even skip the task so it can continue doing the rest of the import.)
@megachriz, thanks again for your helpful info! one last question...if i don't have any unique columns, when the import hangs, is it best then to unlock the import, delete the imported items, and restart the import again?
- 🇳🇱Netherlands megachriz
If you can figure out why it hangs and then resolve that issue, then you could start the import from scratch. But if it hung once and you don't get that issue resolved, it will likely hang again on a restart of the import. So in that situation I would choose to remove the items that were already imported from the CSV file.
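Removing the already imported items from the CSV file can be scripted. The following is a sketch only, under these assumptions: the file has one header row, each data row is one item, the delimiter is a comma, and the items were imported in file order. The function name and the file names in the usage note are hypothetical.

```python
import csv


def drop_imported_rows(source_path: str, target_path: str,
                       already_imported: int) -> int:
    """Copy a CSV file, keeping the header but skipping the first
    `already_imported` data rows. Returns the number of rows written
    (not counting the header)."""
    written = 0
    with open(source_path, newline="", encoding="utf-8") as src, \
            open(target_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        header = next(reader, None)
        if header is None:
            return 0  # empty source file, nothing to copy
        writer.writerow(header)
        for index, row in enumerate(reader):
            if index >= already_imported:
                writer.writerow(row)
                written += 1
    return written
```

For example, if 2000 items were imported before the import hung, `drop_imported_rows("items.csv", "items_remaining.csv", 2000)` would write a new file that starts at row 2001. It is worth checking the imported item count against the file first, since rows that Feeds skipped or failed on would shift the offset.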
ok, now the files with 10,000 entries seem to be importing without hanging after I uncheck the "Active" option when setting up the feed AND I leave my MacBook plugged in! I checked my MacBook battery setting and turned off Low Power Mode option (see attached image) but that alone didn't seem to fix the hanging...so looks like it might be a combination of feed setting and my computer...interesting!
- 🇳🇱Netherlands megachriz
It's interesting that a MacBook going into sleep mode could cause the import to hang. For imports in the UI I think that makes sense (if the webserver is running on the MacBook), but I would expect that with imports on cron, the import would eventually continue.
I have a Mac too. Would be interesting to run a large import and then at the exact moment that cron is running, put the Mac into sleep mode. And then see if that makes the import hang.
Automatically closed - issue fixed for 2 weeks with no activity.