Migrating "Categories" on Drupal.org Project pages

Created on 23 August 2023, 6 months ago
Updated 22 February 2024, 3 days ago

Problem/Motivation

We need to migrate the current "Categories" associated with module projects on Drupal.org to the new list of Categories, taking into account that we will be limiting the number of categories a project can select to three.

Proposed resolution

With the limited cardinality of three (3) Categories allowed to be selected per module project, we propose simply truncating the existing list to the first three categories selected. (NOTE: ideally, if we can notify maintainers ahead of time to clean up their categories, all the better.)
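
The truncation rule can be sketched as follows. This is a minimal illustration in Python on plain lists, not the Drupal code that would run on the site; the function name `truncate_categories` and the sample category list are hypothetical.

```python
MAX_CATEGORIES = 3  # cardinality limit proposed in this issue


def truncate_categories(categories, limit=MAX_CATEGORIES):
    """Keep only the first `limit` categories, preserving their order."""
    return list(categories)[:limit]


# Hypothetical project that currently has four categories selected.
selected = ["Access Control", "Developer Tools", "Media", "Security"]
print(truncate_categories(selected))
```

Projects with three or fewer categories pass through unchanged, so the rule only affects projects that are over the limit.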

Notes:

  • The “compliance” category has no categories mapping to it. A survey of the top modules that belong in this category shows that they are currently misclassified; they should be re-classified manually after the migration.
  • True “example” modules belong in Developer Tools; however, the bulk of modules in this category are not true developer examples, so we recommend dropping the category and letting true example modules be tagged correctly by hand after the migration.
  • Some of the deleted terms we intend to resurrect as a separate filter called “ecosystem”, which will show modules that depend on and extend another core component or contributed module (e.g. Views or Webform).
  • Categories marked "Delete" are to be deleted outright, because their data isn't migrating anywhere. That is, they could be deleted at any time (e.g. prior to the migration). The ones marked "delete-after-migration" have data in them, but once we migrate them they should have 0 nodes associated with that term, and can then be deleted.

Deployment

Drop categories:

  • Community
  • Content Construction Kit (CCK)
  • Drush
  • Education
  • Examples
  • Features Package
  • Mobile
  • Novelty
  • Organic Groups (OG)
  • Other
  • Project management
  • RDF
  • Theme Enhancements
  • Views

Rename & describe categories:

  • Content Access Control - Rename to "Access Control"
  • Administration - Rename to "Administration Tools"
  • Rules - Rename to "Automation"
  • Content - Rename to "Content Editing Experience"
  • Developer - Rename to "Developer Tools"
  • Import/Export - Rename to "Import and Export"
  • Third-party Integration - Rename to "Integrations"
  • Performance and Scalability - Rename to "Performance"
  • SEO - Rename to "Search Engine Optimization (SEO)"
  • Search - Rename to "Site Search"
  • Fields - Rename to "Site Structure"
  • Evaluation/Rating - Rename to "User Engagement"

Merge categories:

  • User Access & Authentication - Merge into "Access Control"
  • User Management - Merge into "Access Control"
  • Path Management - Merge into "Administration Tools"
  • Paging - Merge into "Content Display"
  • Filters/Editors - Merge into "Content Editing Experience"
  • Database Drivers - Merge into "Developer Tools"
  • JavaScript Utilities - Merge into "Developer Tools"
  • Multisite - Merge into "Developer Tools"
  • Utility - Merge into "Developer Tools"
  • Commerce/Advertising - Merge into "E-Commerce"
  • Migrate - Merge into "Import and Export"
  • Syndication - Merge into "Import and Export"
  • Mail - Merge into "Integrations"
  • Statistics - Merge into "Integrations"
  • Spam Prevention - Merge into "Security"
  • Event - Merge into "Site Structure"
  • Location - Merge into "Site Structure"
  • Site Navigation - Merge into "Site Structure"
  • Taxonomy - Merge into "Site Structure"
  • Games and Amusements - Merge into "User Engagement"
  • File Management - Merge into "Media"
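
The merge step above can be sketched on plain data as follows. This is a minimal Python illustration, not the script run on Drupal.org; `MERGE_MAP` holds only a subset of the list above, and `merge_categories` and the sample project are hypothetical. The point it shows is that when a project already has the target category, the merged category is dropped rather than duplicated.

```python
# Subset of the merge list above, keyed by old category name.
MERGE_MAP = {
    "User Management": "Access Control",
    "User Access & Authentication": "Access Control",
    "File Management": "Media",
}


def merge_categories(categories, merge_map=MERGE_MAP):
    """Replace merged categories with their targets, dropping duplicates."""
    merged = []
    for name in categories:
        # Categories that are not being merged map to themselves.
        target = merge_map.get(name, name)
        if target not in merged:
            merged.append(target)
    return merged


# Hypothetical project that has both an old category and its merge target.
project = ["User Management", "Access Control", "File Management"]
print(merge_categories(project))
```

This sketch keeps the position of the first occurrence of each resulting category; a real migration may order the remaining terms differently.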

(The following categories simply need the description added):

  • Accessibility
  • Content Display
  • Decoupled
  • E-Commerce
  • Media
  • Multilingual
  • Security

Finally, add descriptions to the final set of terms:

  • Access Control - Grant or restrict access to content, assets, or site functionality, or extend the authentication/login process.
  • Accessibility - Enhance the site to provide a great user experience to the broadest range of people or help to audit for compliance with accessibility standards like the Web Content Accessibility Guidelines (WCAG).
  • Administration Tools - Empower site builders and administrators with no-code tools to set up, enhance, configure, or maintain the site.
  • Automation - Enable the site to initiate automated actions from conditions, events, or defined schedules.
  • Content Display - Configure the layout and format of content and data presented to site visitors.
  • Content Editing Experience - Enhance the editorial interface and improve the processes and workflows around creating, editing or removing content.
  • Decoupled - Support the idea of separating front-end and backend concerns by integrating Drupal with external or third-party frameworks for display.
  • Developer Tools - Empower developers with tools that assist with developing and debugging the frontend or backend of the site.
  • E-Commerce - Assist with aspects of running an online store, such as product management and display, shopping carts, inventory management, fulfillment, payments, taxes, and shipping.
  • Import and Export - Help transfer content and data into or out of the site, by migration, backup, or exposing data to external systems.
  • Integrations - Use a third-party CSS or JS Framework, a self-hosted service like a CRM, or a third-party service with the site.
  • Legal Compliance - Help protect users' privacy by anonymizing or encrypting data, or ensuring compliance with local laws and regulations, such as GDPR or Terms & Conditions.
  • Media - Enhance functionality related to media, or expand media resource types, such as images, videos, audio files, or documents.
  • Multilingual - Provide tools for translation and display of text in multiple languages and support for regionalization/localization for dates, numbers, currency, measurement, or other local contexts.
  • Performance - Improve the real or perceived speed of the site, or monitor performance metrics.
  • Search Engine Optimization (SEO) - Manage or improve the site's search engine ranking by running audits, assessing metrics, or making the site's content and data more digestible by search engines.
  • Security - Help protect the website from attackers or bad actors, by identifying, preventing, or mitigating security vulnerabilities.
  • Site Search - Enhance functionality relating to the search of content and data on the site.
  • Site Structure - Extend the structure of the site by way of content models, data storage, field types, and navigation, so it is more understandable to users.
  • User Engagement - Enhance the site so that visitors can directly interact with it or among each other, enabling things like user-generated content, comments, voting, chat, or forms for data collection and interaction.
🌱 Plan
Status

Fixed

Version

3.0

Component

Code

Created by

🇺🇸United States leslieg

Comments & Activities

  • Issue created by @leslieg
  • 🇪🇸Spain fjgarlin

    What's the difference between "delete" and "delete after migrate"?

  • 🇪🇸Spain fjgarlin

    Also, as per @drumm via slack

    Please be sure to move the issue, not make a new one. And in a format we can copy/paste/manipulate from

  • 🇺🇸United States drumm NY, US

    Are there descriptions for the categories that are being kept?

  • 🇺🇸United States drumm NY, US

    I updated the issue summary with stubs of the ideal format. For implementing this, it will be 3 phases, delete old categories, update categories being kept, merge categories that will be merged. It would be good to have the lists in that format, including the category descriptions.

  • 🇺🇸United States chrisfromredfin Portland, Maine

    I have updated the summary with nearly all of the information, but we are still getting consensus from the group on whether to keep decoupled-type things in import/export or make them a separate term. If separate, we still need a definition/description for that one, and need to update the import/export one.

  • 🇺🇸United States leslieg

    The consensus for descriptions for Decoupled and "Import and export" is as follows:

    Decoupled - Support the idea of separating front-end and backend concerns by integrating Drupal to external or third-party frameworks for display

    Import and Export - Help transfer content and data into or out of the site, by migration, backup, or exposing data to external systems.

    @drumm, I'll update the lists to be in the format you recommended above, with the descriptions included.

  • 🇺🇸United States leslieg

    Updated descriptions for Decoupled and Import and Export in the Issue Summary.

  • 🇺🇸United States drumm NY, US

    Are all changes documented and ready here?

  • Status changed to Needs review 20 days ago
  • 🇺🇸United States chrisfromredfin Portland, Maine

    Good question; but now looking and seeing what we did with Decoupled, I can confirm that YES, we're ready on this ticket. :)

  • Assigned to drumm
  • 🇺🇸United States drumm NY, US

    Attached is a dry run of the removals, which will serve as a backup. Script used is:

    $terms_to_remove = [
      56 => 'Community',
      88 => 'Content Construction Kit (CCK)',
      4654 => 'Drush',
      19440 => 'Examples',
      11478 => 'Features Package',
      7404 => 'Mobile',
      16190 => 'Novelty',
      90 => 'Organic Groups (OG)',
      51425 => 'Other',
      19984 => 'Project management',
      116 => 'RDF',
      73 => 'Theme Enhancements',
      89 => 'Views',
    ];
    $tids_to_remove = array_keys($terms_to_remove);
    $result = (new EntityFieldQuery())
      ->entityCondition('entity_type', 'node')
      ->propertyCondition('type', 'project_module')
      ->fieldCondition('taxonomy_vocabulary_3', 'tid', $tids_to_remove)
      ->execute();
    foreach (array_chunk(array_keys($result['node']), 50) as $nids) {
      foreach (node_load_multiple($nids) as $node) {
        print $node->nid . ' ' . $node->title . ': removing';
        $removed = [];
        foreach ($node->taxonomy_vocabulary_3[LANGUAGE_NONE] as $delta => $item) {
          if (in_array($item['tid'], $tids_to_remove)) {
            print ' ' . $item['tid'] . ':' . $terms_to_remove[$item['tid']];
            $removed[] = $terms_to_remove[$item['tid']];
            unset($node->taxonomy_vocabulary_3[LANGUAGE_NONE][$delta]);
          }
        }
        print PHP_EOL;
        $node->revision = TRUE;
        $node->log = t('Removing module categories, @removed, which are being discontinued, see https://www.drupal.org/project/drupalorg/issues/3383004', [
          '@removed' => implode(', ', $removed)
        ]);
        node_save($node);
      }
    }
  • 🇺🇸United States drumm NY, US

    Content - Rename to "Content Editor Tools"

    But the description is

    Content Editing Experience - Enhance the editorial interface and improve the processes and workflows around creating, editing or removing content.

    Please update the issue summary to correct one of those

  • 🇺🇸United States drumm NY, US

    Moving file management into merges, since there is already another media category.

  • 🇺🇸United States drumm NY, US

    The description for Education is missing

  • 🇺🇸United States chrisfromredfin Portland, Maine

    Updated summary:

    1. Good catch. "Content Editor Tools" should be "Content Editor Experience" wherever that appears; have updated.
    2. Education - mis-transcribed from the spreadsheet. That should just be a drop. Updated summary.
  • 🇺🇸United States TR Cascadia

    Question about the ecosystem filter plan from the issue summary:

    Is that something that is going to be rolled out imminently, or do we just lose functionality for some unknown (possibly VERY lengthy) period of time?

    Specifically, the "Rules" category used to be only for the Rules ecosystem. This is now renamed to "Automation" which means that a lot of modules and submodules unrelated to the Rules module are now lumped into the same category.

    Automation now has 426 (!!) modules. Most of which are not part of the Rules ecosystem.

    With no "ecosystem" filter, extensions to the Rules module now become a lot harder to find and you have to wade through hundreds of modules to find them.

    This is a major step backwards. At least from the Rules standpoint. I feel like I need to now do all this searching myself then go and modify the Rules homepage to list all the compatible modules, like we used to have to do back in the Drupal 5 days, in order to help users find ecosystem modules.

  • 🇺🇸United States drumm NY, US

    Education is now being removed, the log is attached.

    The "ecosystem" functionality is already on Drupal.org. On https://www.drupal.org/project/rules , click "Projects that extend this" in the right sidebar. Each project can be updated to say it extends Rules. ("Ecosystem" is a bit of an abstract term, we should try to keep it out of UIs and elsewhere.)

    We might want to pause and identify categories that should be migrated to ecosystems as well. We could update every project in the Automation, formerly Rules, category to say they extend rules.

  • 🇺🇸United States drumm NY, US

    Automation now has 426 (!!) modules. Most of which are not part of the Rules ecosystem.

    To be clear, no merges have happened yet, so those 426 modules are the ones that were in the Rules category. We probably should not migrate them to the ecosystem field, since most of them are not related to Rules.

  • 🇺🇸United States drumm NY, US

    I didn’t see a category with good data that might be a candidate for migrating to ecosystems. As with the Rules/Automation example, category use is just not restricted enough to match up to a single module.

    The merges are now running, log of a dry run is attached. Using code:

    $terms_to_merge = [
      74 => 13434,
      76 => 13434,
      8818 => 53,
      68 => 58,
      63 => 57,
      13158 => 59,
      101 => 59,
      26738 => 59,
      75 => 59,
      55 => 104,
      196950 => 64,
      70 => 64,
      66 => 52,
      119 => 52,
      7266 => 69,
      61 => 20224,
      65 => 20224,
      124 => 20224,
      71 => 20224,
      122 => 60,
      62 => 67,
    ];
    $terms = taxonomy_term_load_multiple(array_merge(array_keys($terms_to_merge), $terms_to_merge));
    foreach ($terms_to_merge as $key => $value) {
      print $terms[$key]->name . ' → ' . $terms[$value]->name . PHP_EOL;
    }
    $tids_to_merge = array_keys($terms_to_merge);
    $result = (new EntityFieldQuery())
      ->entityCondition('entity_type', 'node')
      ->propertyCondition('type', 'project_module')
      ->fieldCondition('taxonomy_vocabulary_3', 'tid', $tids_to_merge)
      ->execute();
    foreach (array_chunk(array_keys($result['node']), 50) as $nids) {
      foreach (node_load_multiple($nids) as $node) {
        print $node->nid . ' ' . $node->title . ': merging';
        $merged = [];
        $existing_terms = array_column($node->taxonomy_vocabulary_3[LANGUAGE_NONE], 'tid');
        foreach ($node->taxonomy_vocabulary_3[LANGUAGE_NONE] as $delta => $item) {
          if (in_array($item['tid'], $tids_to_merge)) {
            print ' ' . $item['tid'] . ':' . $terms[$item['tid']]->name . '→' . $terms[$terms_to_merge[$item['tid']]]->name . '-';
            if (in_array($terms_to_merge[$item['tid']], $existing_terms)) {
              print 'exists';
              unset($node->taxonomy_vocabulary_3[LANGUAGE_NONE][$delta]);
            }
            else {
              print 'update';
              $node->taxonomy_vocabulary_3[LANGUAGE_NONE][$delta]['tid'] = $terms_to_merge[$item['tid']];
              $existing_terms = array_column($node->taxonomy_vocabulary_3[LANGUAGE_NONE], 'tid');
            }
            $merged[] = $terms[$item['tid']]->name . '→' . $terms[$terms_to_merge[$item['tid']]]->name;
          }
        }
        print PHP_EOL;
        $node->revision = TRUE;
        $node->log = t('Updating module categories, @merged, see https://www.drupal.org/project/drupalorg/issues/3383004', [
          '@merged' => implode(', ', $merged)
        ]);
        node_save($node);
      }
    }
  • Status changed to Fixed 11 days ago
  • 🇺🇸United States drumm NY, US

    This is now all done. Search indexing is still catching up; I'll continue to monitor it and motivate it if needed tomorrow.

    Thanks everyone!

  • 🇺🇸United States TR Cascadia

    The "ecosystem" functionality is already on Drupal.org. On https://www.drupal.org/project/rules , click "Projects that extend this" in the right sidebar. Each project can be updated to say it extends Rules. ("Ecosystem" is a bit of an abstract term, we should try to keep it out of UIs and elsewhere.)

    I have to disagree - that is fundamentally different. (BTW, these arguments apply to any ecosystem, not just Rules.)

    "Projects that extend this" has been around for a while and is not very useful because it is an opt-in categorization made and controlled by other modules. That doesn't help when it comes to building ecosystems or finding modules that are part of an ecosystem.

    In the case of Rules, Rules can't simply designate other modules that depend on Rules, and drupal.org does not automatically check dependencies to figure out which modules depend on Rules.

    Regardless, with the Plugin system, modules that support Rules DO NOT have a dependency on Rules in order to provide Rules functionality. The same is true of Drush commands and a large part of Drupal 8+ functionality.

    The problem remains: There is NO LONGER a way to find modules that work with Rules. And there's a similar problem (to a greater or lesser extent) with every other ecosystem.

    I have nothing against the new categories, but I do have a problem with throwing away functionality that we've depended on for at least 10 years.

  • 🇺🇸United States chrisfromredfin Portland, Maine

    Hi TR, since we're looking at ecosystem as a separate thing, we'd welcome your constructive feedback on that related issue, so we can let this one close.

    We'll have an opportunity to look at how to truly implement/re-implement ecosystem in an even better way!

    https://www.drupal.org/project/project_browser/issues/3381187 - [META] Plan/Proposal to Implement Ecosystem (Active)

    (And, related more specifically to PB) -
    https://www.drupal.org/project/project_browser/issues/3241544 - Add filter by project dependencies (ecosystem) (Active)
    https://www.drupal.org/project/project_browser/issues/3365593 - Define how to implement Ecosystem for Project Browser (Active)

  • 🇺🇸United States drumm NY, US

    Both categories and ecosystem are updated by the downstream modules by editing their project page. The only difference is ecosystem is an autocomplete for the project, like rules, vs selecting a category from a list.

    Before re-arranging categories, the former rules category did have 426 modules, many not related to or integrating with Rules module. The ecosystem selection has the potential to be more accurate, since maintainers aren’t looking at a list of categories, often not knowing what “rules” might mean as a category.
