- Issue created by @ressa
- 🇩🇰Denmark Steven Snedker
This would also benefit the External Link Preview Module immensely.
Yet with all the awful specious invoicing companies out there, local caching could mean a lot of Drupal sites losing a lot of money and a lot of sleep. We're at an impasse at "Make a local image copy for GDPR" (Active; read and shudder).
Wayback Filter could branch out and support way smaller archive sites like Archive.today or Ghost Archive. There may be a few other OKish candidates on the List of web archiving initiatives. Autosubmit the URL where feasible.
But with only 14 users, half of them me, and archive.org in OK health, I haven't spent any time on it. With only one user ever (!), Wayback Submit to Archive.org will never be updated.
But back to the original question: caching sites?
No. Having Drupal sites caching (storing and publishing) third party sites locally is sadly way too risky.
- 🇩🇰Denmark ressa Copenhagen
Great that there are alternatives, though https://archive.today/ seems to be gone and https://ghostarchive.org/ seems somewhat lacking in content ...
I didn't think about the copyright angle ... and that deep link story is both crazy and frightening, at the same time.
A strategy, instead of serving the pages right now, could be to save a copy of all the captures into a private
web/sites/default/files/wayback_captures
folder. The capture copies should simply sit there and wait, ready for the day archive.org vanishes. When that happens, the individual site owners can decide whether or not to resurrect the links by using their own local copies of the captures.
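To make the idea a bit more concrete, here is a rough sketch of the "store now, decide later" step. It is only an illustration in Python, not how Wayback Filter (a PHP module) works or would have to work. The function names `capture_path`, `fetch_bytes` and `store_capture` are made up for this example, as is the choice of naming each file after a SHA-256 hash of the capture URL. The folder is the one suggested above; whether it is actually kept private depends on how the site is configured.

```python
import hashlib
from pathlib import Path
from urllib.request import urlopen

# Folder suggested in this comment; assumed to be shielded from public access.
CAPTURE_DIR = Path("web/sites/default/files/wayback_captures")


def capture_path(capture_url: str, base_dir: Path = CAPTURE_DIR) -> Path:
    """Map a capture URL to a stable local file name (hypothetical scheme)."""
    digest = hashlib.sha256(capture_url.encode("utf-8")).hexdigest()
    return base_dir / f"{digest}.html"


def fetch_bytes(url: str) -> bytes:
    """Download the raw bytes of a capture over HTTP(S)."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def store_capture(capture_url: str, base_dir: Path = CAPTURE_DIR,
                  fetch=fetch_bytes) -> Path:
    """Save one capture locally, skipping the download if it is already there."""
    target = capture_path(capture_url, base_dir)
    if target.exists():
        return target
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(fetch(capture_url))
    return target
```

Nothing here serves the copies to visitors. The files only sit on disk, so the publishing question stays with the site owner.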
But I do understand your decision: All seems well now (until a day when archive.org is suddenly gone ...) but most importantly, the resources, or even the desire, to build such a feature may not be there. I thought about setting to Postponed (in case someone else wanted to run with it) but in case you prefer a "clean" issue queue, I am closing it :)