Deleting filter formats may result in data loss

Created on 23 July 2025, 25 days ago

Problem

While there is no UI to delete a filter format, the deletion of a filter format via API, e.g. in update functions, may result in data loss. This is unexpected behavior and can easily happen to unintentional data loss.

This is related to 🐛 Deleting a Media view mode should not result in the deletion of an entire text format Needs work what can easily trigger this problem (and this is how I found out).

Steps to reproduce

* Create content type with body field
* Edit the field. Select filter format "Restricted HTML" under allowed_formats and save it.
* Create a node with content for the body field.
* Execute PHP code

\Drupal\filter\Entity\FilterFormat::load('restricted_html')->delete();

* Check the node, the body field and its content is gone.

Reason is that the field storage has a config dependency on the filter format, when it is configured as allowed-format. Thus, when the filter format is deleted, the field and field storage is deleted as well, even though it has data.

Proposed resolution

Deleting a filter format is problematic generally, but generally the field data is not lost when wrongly deleted. The problematic part is when the field is deleted. Thus, this is where we need to be more careful:

Proposal: Deny deletion of field storage via the config dependency chain when it has some data. (Not sure this is easily doable though)
This would solve the issue for potential other cases where field storage depends on some config that becomes deleted.

📌 Task
Status

Active

Version

11.2 🔥

Component

filter.module

Created by

🇦🇹Austria fago Vienna

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Comments & Activities

  • Issue created by @fago
  • I think this can be done by implementing onDependencyRemoval(array $dependencies) in Drupal\field\Entity\FieldStorageConfig.

    Basically, something like:

      public function onDependencyRemoval(array $dependencies) {
        $changed = parent::onDependencyRemoval($dependencies);
        if ($changed) {
          return TRUE;
        }
    
        if (!isset($dependencies['config'])) {
          return FALSE;
        }
        
        foreach ($dependencies['config'] as $config) {
           // If $config is a filter format, return TRUE.
        }
      }
    
  • 🇦🇹Austria fago Vienna

    With onDependencyRemoval() we could remove the dependency and let the field and data survive, what I think would be acceptable given the deletion.

    I'd prefer to deny the deletion when the field has data though, but that does not seem to be properly doable. The only documented exception for ConfigEntity:delete() is EntityStorageException what seems not suiting.

    The code generating the dependency on filter-formats in text-fields is that method:

      /**
       * {@inheritdoc}
       */
      public static function calculateDependencies(FieldDefinitionInterface $field_definition) {
        // Add explicitly allowed formats as config dependencies.
        $format_dependencies = [];
        $dependencies = parent::calculateDependencies($field_definition);
        if (!is_null($field_definition->getSetting('allowed_formats'))) {
          $format_dependencies = array_map(function (string $format_id) {
            return 'filter.format.' . $format_id;
          }, $field_definition->getSetting('allowed_formats'));
        }
        $config = $dependencies['config'] ?? [];
        $dependencies['config'] = array_merge($config, $format_dependencies);
        return $dependencies;
      }
    

    As seen, it only adds dependencies for the "allowed_formats". This seems bogus, since there is no larger harm when a filter-format this is a allowed-format is removed than when a filter-format that is not referenced is removed. When all filter-formats are allowed, no dependency is added, the filter format might be used, and it can be deleted. What happens is that the field and its data is not removed, it just gets a reference on a selected filter format.
    On the other hand when a filter format is removed that is part of the "allowed-formats", since there is just one entry less that users can select. The more severe data integrity issue is the same as before.

    That said, the right fix here is probably to add a config dependency on all filter-formats when the list of filter-formats is not restricted.

    Then, given that we already have the issue that filter formats leads to data with invalid text-format-references when no "allowed_formats" are set, implementing onDependencyRemoval() to avoid field-data deletion in case of "allowed_formats" usage seems reasonable.

  • @godotislate opened merge request.
  • My suggestion to implement onDependencyRemoval() in #2 was in the wrong place. TextItemBase makes more sense on where to implement it, and that's what is in MR 12823.

    This prevents the data from deleted, but there is a display issue, in that when the format is deleted, existing content with text fields using that filter format can not display the content. However, this can be addressed by editing the entity and setting the filter format to something else.

    Leaving at NW for tests.

  • 🇦🇹Austria fago Vienna

    Thank you. I added the test case, verified it fails first, and then put it on top of your changes, to see it pass! All good, this seems ready then!

  • 🇦🇹Austria fago Vienna

    This actually concerns text fields provided by text module.

  • 🇸🇮Slovenia useernamee Ljubljana
  • 🇺🇸United States xjm

    I haven't read the solution in full yet, but it is extremely important that we block deletion of the filter format when stuff still uses it rather than anything else like altering config, since changing a filter format can have security implications.

    Whatever solution should also have subsystem and/or framework manager review.

  • So FilterFormatAccessControlHandler::checkAccess() has this:

        // We do not allow filter formats to be deleted through the UI, because that
        // would render any content that uses them unusable.
        if ($operation == 'delete') {
          return AccessResult::forbidden();
        }
    

    The reproduction steps in the IS use drush \Drupal\filter\Entity\FilterFormat::load('restricted_html')->delete();, which I guess bypasses checkAccess somehow. I haven't looked into why, and there's a note in the comment about deletion via UI and not CLI.

    Regardless, maybe a better approach is explore denying filter format entity delete access via CLI like this as well?

  • 🇺🇸United States xjm

    This prevents the data from deleted, but there is a display issue, in that when the format is deleted, existing content with text fields using that filter format can not display the content. However, this can be addressed by editing the entity and setting the filter format to something else.

    This is also kinda problematic, replacing data loss with data integrity problems and/or "perceived" data loss.

  • 🇺🇸United States xjm

    Oops, xpost. Reviewing #11 now.

  • 🇺🇸United States xjm

    #11 does indeed also sound like something we should confirm or revisit. It might be worth a git log -L on the comment to see why the issue that added it made the choice to differentiate the UI versus API deletion in this way. (There might be useful discussion on the original issue.)

    Thanks @godotislate!

    This will also eventually need usability review on whatever proposed solution we fix, and should be fixed in a way that's consistent with these other "config dependencies eat my config" issues.

    Also, bumping to critical; nothing should be able to delete a storage in this way without an explicit user decision. When we uninstall a module, for example, the user has to do a bulk operation to remove module content before they can uninstall it.

  • @xjm I think I got confused in #11.

    Looking at the code again now, it doesn't look like calling $entity->delete() invokes entity access. I believe the access check needs to happen before delete() call. Once delete() is called, I don't think there's any way to stop that except by throwing an exception.

  • 🇺🇸United States xjm

    @godotislate Ah, yeah, the internal API call should probably only do what it says on the tin and not vary according to the logged-in user account or anything like that. So changing things earlier in the callstack seems like the right approach. We might not be able to stop people from making bad out-of-context direct API calls with Drush; that is a Drush "feature" I guess.

    In either case, we should manually test it carefully.

  • 🇬🇧United Kingdom longwave UK

    Re #6/#12 this is a security feature, because the filter format may previously have sanitised some HTML, and once that is removed we can no longer safely display the content. I believe this is also why deleting a format from the UI is not allowed either.

  • I guess the question here is, given that filter formats are not intended to be deleted, what handling should there be in the case that a drush command, or contrib/custom code was not sufficiently careful to prevent deletion?

  • 🇦🇹Austria fago Vienna

    Fixing it on entity-access layer makes sense, but it's not solving the issue I reported here. The UI already has no way to delete filter formats, it can only be disabled. This issue is about unexpected behavior when you run the code, with drush, or via an (post)-update hook/function, that does not matter:

    \Drupal\filter\Entity\FilterFormat::load('restricted_html')->delete();
    

    So when the filter format is still referenced in field config, those fields get deleted, including the content. I'd argue this is very un-intuitive behavior, since this is not visible or clear from the code executed.
    Indeed, deleting the filter-format might result in data-integrity issues. But I'd argue that's an intuitive. possible consequence from deleting a filter format manually like this, but loosing not touched field-configuration including the deletion of if its data is not.

    Additionally, you can already delete filter formats that are in use with code snippets like this - so that "data integrity" issue when deleting filter formats via code is already pre-existing. The described problem only applies if the format is referenced as a "allowed-filter-format" in the field. However, if no filter formats are referenced, filter-format usage is not restricted (so it might be in use) and deleting a filter-format would not delete any fields. Thus, this "data-integrity issue" is not new, but the suggested change makes the behavior consistent in both cases.

    I must say, I don't see an issue with the "data-integrity" problem, as mentioned, that's intuitive to me when you delete a filter format like this. However, if we decide it is an issue that shall be tackled instead, we should also cover the case of un-referenced filter-formats which are currently not referenced in field configs at all. That sounds like a quite complicated problem to solve though.

  • Agreed with #19: if a filter format is improperly deleted somehow, this should not result in content data loss. If, for example, the project has the filter format configuration in version control, the format could be restored somewhat easily. But if all the data is gone, the restoration effort would be more difficult and would require database backups to be available.

    What are next steps? Should this be submitted for usability review?

Production build 0.71.5 2024