[policy] Require rsync for automatic updates in Drupal core and punt other syncers to contrib

Created on 23 January 2024, 11 months ago
Updated 27 April 2024, 8 months ago

Problem/Motivation

Package Manager (used by Automatic Updates and Project Browser) uses Composer Stager to stage Composer commands in a separate directory. Once all Composer commands have completed successfully (including validation such as TUF signature checks), it syncs the changes back to the site's active codebase. See "Add php-tuf/composer-stager to core dependencies and governance - for experimental Automatic Updates & Project Browser modules" (Needs work) for more details about committing Composer Stager to Drupal core.

Currently Composer Stager implements two syncers: one that uses rsync on systems that have it, and one that uses only PHP filesystem functions on systems that don't. Rsync is typically installed by default on Linux and Mac, but some server installations may not have it (e.g., slim Docker images or institutional hosting that intentionally or unintentionally opted out of installing the rsync package). Rsync is typically not installed by default on Windows, but it can be installed via WSL, Cygwin, or other means. We do not have data on what percentage of Drupal sites are on hosts without rsync.
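
The choice between the two syncers can be pictured with a minimal sketch. It is written in Python purely for illustration (Composer Stager itself is PHP), and the function name `choose_syncer` and the labels it returns are hypothetical, not Composer Stager's API.

```python
import shutil


def choose_syncer(which=shutil.which):
    """Return "rsync" when an rsync executable is on PATH, else "php".

    The two labels are hypothetical stand-ins for Composer Stager's
    syncers. `which` is injectable so the decision can be exercised on a
    machine with or without rsync.
    """
    return "rsync" if which("rsync") is not None else "php"


if __name__ == "__main__":
    print(choose_syncer())
```

Under the policy proposed in this issue, the "php" branch would no longer exist in core; a host without rsync would get a validation error instead.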

When we initially built Composer Stager we underestimated how cumbersome the PHP syncer would become. Iterating the source directory and creating directories in the destination directory and copying files into them isn't so bad, but complexity increases when you need to deal with symlinks, ensuring permissions are set on the destination directories and files to match what they are in the source directory, and dealing with edge cases like copying files into a read-only directory (which requires making the directory temporarily writable). Rsync is an incredibly mature and robust piece of software. In hindsight, trying to emulate it in PHP was foolish.
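
To make the complexity concrete, here is a rough sketch of a hand-rolled tree copy that handles only the cases named above: symlinks, matching permissions, and read-only destination directories. It is Python rather than PHP, it is not the Composer Stager implementation, and the names `sync_tree` and `_copy_symlink` are hypothetical. It deliberately ignores deletions, exclusions, ownership, type changes (a file replaced by a directory), and Windows.

```python
import os
import shutil
import stat


def _copy_symlink(src, dst):
    """Recreate the symlink `src` at `dst`, replacing an existing entry."""
    if os.path.lexists(dst):
        os.remove(dst)
    os.symlink(os.readlink(src), dst)


def sync_tree(source, destination):
    """Copy `source` into `destination`, keeping symlinks and modes."""
    os.makedirs(destination, exist_ok=True)
    dir_modes = []  # (path, mode) pairs, applied at the very end

    for root, dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        dest_root = os.path.normpath(os.path.join(destination, rel))

        # Make the destination directory writable while it is filled,
        # even if an earlier sync left it read-only.
        current = stat.S_IMODE(os.stat(dest_root).st_mode)
        os.chmod(dest_root, current | stat.S_IRWXU)
        dir_modes.append((dest_root, stat.S_IMODE(os.stat(root).st_mode)))

        for name in list(dirs):
            src = os.path.join(root, name)
            dst = os.path.join(dest_root, name)
            if os.path.islink(src):
                # Symlink to a directory: recreate the link, do not descend.
                _copy_symlink(src, dst)
                dirs.remove(name)
            else:
                os.makedirs(dst, exist_ok=True)

        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(dest_root, name)
            if os.path.islink(src):
                _copy_symlink(src, dst)
            else:
                if os.path.lexists(dst):
                    # A read-only file cannot be overwritten in place.
                    os.remove(dst)
                # copy2 copies contents, mode bits, and timestamps.
                shutil.copy2(src, dst)

    # Restore directory modes children-first, so a restrictive parent
    # mode does not block access to its children.
    for path, mode in reversed(dir_modes):
        os.chmod(path, mode)
```

Even this partial version needs a temporary permission change and a deferred restore pass, which is the kind of bookkeeping rsync already handles.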

Proposed resolution

Drupal core

Remove the PHP syncer from Composer Stager. This means Drupal core will not be responsible for supporting the "rsync not installed on the host" use case. People on such systems will either not be able to use Automatic Updates or will need to install something from contrib, as described below.

Contrib add-ons

We would likely not need all of these contrib solutions (see Remaining tasks). The ones we choose not to create could still be created by others in the community.

  1. Move the PHP syncer to a separate repo and make it a Windows-only syncer, thus avoiding the problems of it having to manage directory and file permissions. This helps the Windows case, where rsync is both not installed by default and where it's hardest to install (e.g., needing to deal with WSL or Cygwin or similar). This PHP syncer would be a package that isn't installed or maintained as part of Drupal core, so Windows users wanting to use it would need to composer require it just like any other contrib project or 3rd party Composer package.
  2. We could also create a (not core) package (GitHub repo) that facilitates installation of Rsync on Linux and Mac systems that don't have it. See "Create a separate rsync shim Composer library for Composer Stager" (Active) for details.
  3. We could also create a (not core) package (GitHub repo) that implements syncers using base OS commands, such as robocopy for Windows, cp for Linux, and ditto for Mac. See https://github.com/php-tuf/composer-stager/issues/322 for details. In the long run, I think these would be far simpler to maintain than the PHP syncer, but they're not written yet. Because how the "Windows-only syncer" mentioned in 1) actually does the copying is an internal concern, its 1.x version could use the already created PHP syncer, and 2.x (or later) could switch to robocopy to reduce technical debt.
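
As a rough illustration of option 3, the sketch below maps each operating system to a native copy command. It is Python for illustration only; the function name `base_copy_command` is hypothetical, and while the robocopy / cp / ditto mapping comes from the list above, the specific flags (`/E`, `-a`) are assumptions rather than anything decided in the linked issue.

```python
import platform


def base_copy_command(source, destination, system=None):
    """Build an argv list for a native recursive copy on this OS.

    `system` defaults to platform.system() and can be passed explicitly
    to inspect the command for another OS.
    """
    system = system or platform.system()
    if system == "Windows":
        # /E copies subdirectories, including empty ones (assumed flag).
        return ["robocopy", source, destination, "/E"]
    if system == "Darwin":
        # ditto copies a directory hierarchy into the destination.
        return ["ditto", source, destination]
    if system == "Linux":
        # -a preserves links, modes, and timestamps (assumed flag);
        # "source/." copies the directory's contents, not the directory.
        return ["cp", "-a", source.rstrip("/") + "/.", destination]
    raise ValueError(f"no base OS copy command known for {system!r}")


if __name__ == "__main__":
    print(base_copy_command("staging", "active"))
```

A real syncer would need more than this, for example exclusions, deleting files that no longer exist in the source, and per-tool exit code handling (robocopy reports success with non-zero codes below 8).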

Remaining tasks

  • Decide if it's okay to remove the PHP syncer from Composer Stager and make all "no rsync" use cases not core's responsibility.
  • Decide which, if any, of the Proposed Resolution's possibilities for contrib need to be in place in order to make the above decision for core. Or, is it enough that those contrib options are possible, and that we don't need any of them implemented yet? E.g., should we first get real-world feedback from people without rsync attempting to use alpha/beta releases of automatic updates in core and use that real-world data to decide which of the above contrib solutions to implement?

User interface changes

API changes

Data model changes

Release notes snippet

Plan
Status

Fixed

Version

11.0

Component
Update

Last updated 3 days ago

  • Maintained by
  • United States @tedbow
  • United States @dww
Created by

United States effulgentsia


Comments & Activities

  • Issue created by @effulgentsia
  • United States effulgentsia
  • United States tedbow Ithaca, NY, USA
  • United Kingdom catch

    This seems like a very good idea to avoid maintenance burden. If it turned out lots of sites that want to use automatic updates don't have rsync installed we could worry about it then, but we won't know that's the case until we start getting reports.

  • United States tedbow Ithaca, NY, USA

    Re the "Remaining Tasks" item

    Decide which, if any, of the Proposed Resolution's possibilities for contrib need to be in place in order to make the above decision for core.

    When determining whether and to what degree we should support hosting that does not have rsync and could not easily install it, we have to consider it in relation to the other requirements of Automatic Updates (actually Package Manager under the hood).

    For example, if we determined that many restricted hosting setups did not have rsync, but also that most of those setups had PHP's proc_open() (needed to invoke Composer) disabled, then providing a non-rsync file syncer alternative would not actually help those users unless we also solved the proc_open() problem for them.

    Short of getting a lot of users to try the contrib module and somehow determining that those users are also the likely core AutoUpdates audience, it will be hard to determine these problems before AutoUpdates is released as an experimental module in Drupal core.

    It would be great to figure out how common rsync is, but I think this is only part of the problem, because one likely scenario could be:

    1. We don't know what % of users will not have rsync
    2. We do some effort to figure out what that % is
    3. We find out it is 20% :scream:
    4. We implement the 3 OS-specific syncers before AutoUpdates is released in core
    5. That 20% now tries AutoUpdates in core
    6. We find out that 95% of the 20% can't use AU anyway because they also don't meet some other requirement of AutoUpdates, for example proc_open()
    7. To use Automatic Updates the 20% has to move to less restricted hosting, which may actually have rsync

    I personally think the best way to save ourselves from a ton of work, and the community from a long-term maintenance nightmare, is:

    1. Release the Core Alpha Experimental module with only rsync support
    2. (optional) also have some sort of Windows-only syncer option, maybe the already created PHP Syncer

    We know that will stop some people from being successful, but it is only Alpha, and at that point we can ask users to provide a screenshot of their validation errors. This will tell us what other requirements they are not meeting. If almost everyone who is missing rsync also has some other basic/unsolvable validation error because they are on restricted hosting, it should give us pause about creating the OS-specific syncers, because they would not actually help that many people.

  • Australia larowlan .au GMT+10

    Yes agree, maintaining a PHP version of rsync doesn't sound like fun

  • United States phenaproxima Massachusetts

    The PHP file syncer has been a fathomless cornucopia of buggy trouble, despite @TravisCarden's monumental efforts to tame it. I'd be happy to see it go away.

    I'm definitely worried about the loss of broader compatibility, but like @tedbow, I have also been mildly frustrated by the utter lack of data. "What set-ups should we try to support?" has largely been guesswork. So I think it makes sense to move forward in core alpha with a hard requirement on rsync, and then add greater compatibility depending on what needs are revealed by Package Manager's use in the wild.

    That said, I don't think we're going to see truly wide adoption of it until we release it stable in core. So we should probably plan for adding broader compatibility later (and I like @effulgentsia's idea of calling out to native OS-specific commands), avoiding taking on extra work and maintenance burden now.

  • United States warped

    Just a few thoughts.
    1. Are you thinking that most sites will use AU? If not, then are the likely users less likely to be using local development to update and test, more likely to set and forget, assume that a core service will just work, and have more difficulty when it doesn't?

    2. Should new users be presented with options that just work, instead of needing to immediately jump into the Drupal contrib system? Core Drupal allows a pretty good basic website, before venturing into contrib.

    3. If cheap hosting doesn't have rsync, will they remove Drupal as a 1-click install option? I have seen cheap hosting with WP, but not Drupal. Since they are often very slow to update Drupal, this seems like a great opportunity to improve using Drupal in those circumstances.

    4. Is there no other tool or method, besides trying to duplicate rsync in PHP, that would require less maintenance?

    5. Are we improving the experience for those new to Drupal, or for those who don't even know what Drupal is and are just trying it out as one of several options?

  • United States warped

    I understand trying to keep core minimal and less stress/work to keep it maintained. Maybe if there is a default "Recipe" that can pull in all the basic contrib needed to create a workable website, one that can conditionally check hosting services, that would make entry level easier to start, while allowing many core modules to be removed to contrib.

  • United States tedbow Ithaca, NY, USA

    Re #8

    @Warped I share some of your concerns; I just don't think we have to solve all these problems in the Alpha version of the experimental module. We can adapt as we get alpha testers, to make the ultimate core stable experience as friction-free as possible. But I think we should be cautious about supporting more than we have to at the Alpha stage, because I think we will have to support whatever we create for a while.

    1. I am not sure what percentage will use AU, but because many sites will have a more complex deployment workflow they will probably not use it.

      .... more likely to set and forget, assume that a core service will just work, and have more difficulty when it doesn't.

      If AU is installed and automated/cron updates are enabled, then we give an error message on almost all admin pages (to privileged users) about any validation error that would cause AU not to work. Not meeting the requirement for rsync, or whatever syncer we end up offering, would fall into this category. We would also email the errors. Since sites will be relying on this for keeping the site secure from the start, we have tried to make sure the admin will also be notified ahead of time if it is not going to work.

      But the system requirements are probably not something that is going to change very often. For instance, if hosting offers rsync, they probably will not remove it without giving a very prominent warning to their customers.

    2. Should new users be presented with options that just work, instead of needing to immediately jump into the Drupal contrib system.

      Yes that is the goal. But it is hard for us to know ahead of time what capabilities hosting companies for sites that use AU will have. But luckily we have the experimental module process in core to figure this out.

      If in the process we find that 30% of sites won't have access to rsync, then I would suspect that we would need to support other syncers in core and not make those users go to contrib. 30% is just an example number; I think it would be a product decision what number would be acceptable. I don't think we could ever get to 100%, as other requirements such as a writable file system (though there are options: "Add Symfony Console command to allow running cron updates via console and by a separate user, for defense-in-depth", Fixed) will eliminate some sites.

    3. I agree. I think we will hopefully get testers in the experimental phase using that type of hosting and try to work during the experimental process to make their experience as good as can be, which might mean supporting more than rsync in core before stable.
    4. Is there no other tool or method, besides trying to duplicate rsync in PHP, that would require less maintenance?

      I think the "syncers using base OS commands" mentioned in the summary will be that, and we should consider them as an option if rsync support in core is not enough.

  • United States warped

    At the Alpha stage I can see wanting to keep parts required in core at a minimum. Just thinking that for general release, it would be good to support beginners by reducing problems they may initially encounter.

    I agree that hosting features are not likely to be removed, but if they aren't there in the first place, they are unlikely to be added by cheap hosters. I had difficulty getting PHP 8 added a year ago by a client's cheap hoster. They said they had no plans for it, and only when I asked if we would need to seek hosting somewhere else did they grudgingly install it in our shared environment. Only later did they install it where it was available in CPanel. And dropping Drupal as a 1-click install sounds like the path of least resistance for them, if customers run into and report any problems with using Drupal that point to them needing to change hosting configuration and requirements.

  • New Zealand quietone

    Yes, this is a good idea to limit to rsync. I'm usually in the camp of start small and build, so this seems like the right way to proceed. Other options, if needed, can mature in contrib.

  • Belgium wim leers Ghent

    rsync is a totally reasonable requirement. It's trivial for any hosting company to install this. Reimplementing something like it, even a tiny subset, would be a colossal waste of resources.

    (I've been advocating for this for almost a year now: "Warn strongly if the rsync file syncer is not in use" (Fixed) + "Use the rsync file syncer by default" (Fixed).)

  • United Kingdom catch

    We also have "[policy, no patch] Drop support for Windows in production" (Needs review) open. That would not cover using Windows for local development, but Windows for local development doesn't necessarily imply the actual local hosting environment is Windows these days either.

  • United Kingdom longwave UK

    +1 to depending on rsync being available, at least to start with; we can always revisit this later, but if this gets AU over the line faster then this makes sense to me.

  • United States tedbow Ithaca, NY, USA

    Some more thoughts on Windows:

    1. @TravisCarden is now working on making the existing Rsync syncer that comes with Composer Stager work with Windows. He is not done yet, but it seems doable. Of course you would still have to install rsync somehow.
    2. With the combo of "[policy, no patch] Drop support for Windows in production" (Needs review) and "Recommend DDEV as the default Drupal local development environment" (Active), it does make me think it is not worth the effort to make a Windows-specific syncer.
    3. I think the only case that leaves out is a Windows user who only has PHP and wants to just use the Quickstart script to try out Drupal. I think with Quickstart, Automatic Updates is probably not much use to them, but in the future Project Browser might be useful for quickly trying out some modules. I think we can deal with that when the time comes and decide at that point if it makes sense to make a Windows-specific syncer, do "Create a separate rsync shim Composer library for Composer Stager" (Active), or recommend DDEV instead.
  • United States tedbow Ithaca, NY, USA
  • Status changed to RTBC 9 months ago
  • United States effulgentsia

    Given #18, this issue is essentially "Fixed", but moving it to RTBC first for visibility. I'll mark it Fixed if/when a few days go by without anyone raising compelling objections.

  • Status changed to Fixed 9 months ago
  • United Kingdom longwave UK

    There has been no dissent, let's mark this as fixed; as previously stated we can always revisit this later and add alternative sync methods if it turns out rsync is a no-go in some important use cases.

  • Status changed to Needs review 9 months ago
  • Italy falcon03

    I think we are not considering shared (cheap) hosting environments. Nowadays you already have to pay higher hosting costs due to the fact that SSH access is required if you do not want Drupal's maintenance to become a pure nightmare; the automatic-upgrades initiative should mitigate this IMHO (see WordPress automagical updates as the reference target). The whole idea of separate environments works great for enterprise use cases, but for simpler websites it is not very sustainable.

    So, are we officially saying we do not care about amateur websites? I have nothing against that vision, but then we should optimize Drupal for that use case.

  • United Kingdom longwave UK

    Is rsync not widely available on most hosting environments? The concern here was largely around Windows; in Linux environments, rsync is usually installed by default in many distributions.

  • United Kingdom catch

    I'm not sure if there's a misunderstanding in #21, we're not requiring rsync access to the server for end users, agreed that would be a huge barrier.

    For this issue the requirement is that the server can rsync files locally between folders, which should be possible on most hosting environments; no actual access is required, rsync just needs to be present on the server.

    If we find out that rsync isn't available on a load of cheap hosting, we'd definitely try to mitigate that, but I don't think we should try to deal with that in advance, just in case, since hosts would have to be actively disabling/excluding rsync and we don't have any indication that they are.

  • Status changed to Fixed 8 months ago
  • United Kingdom catch

    I'm moving this back to fixed given it's been more than a week without a reply to #23. I think that would be a valid concern if we were requiring rsync access to shared hosting environments, but that's not what this issue is proposing. If I've misunderstood the concern entirely, we can always re-open.

  • Automatically closed - issue fixed for 2 weeks with no activity.
