Prevent flooding of Planet Drupal with old blog posts

Created on 5 November 2023, over 1 year ago

Problem/Motivation

Every so often, a feed republishes a batch of old items on Planet Drupal, flooding it with up to 50 posts at once.

  • Nov 5, 2023: 10 posts, 3C Web Services
  • Nov 1, 2023: 20 posts, Lullabot
  • Oct 24, 2023: 9 posts, ADCI Solutions
  • Oct 16, 2023: 6 posts, Dropsolid Experience Agency
  • Oct 6, 2023: 50 (!) posts, Electric Citizen

I seem to remember someone mentioning that this happened to them after updating from Drupal 9 to Drupal 10.

Steps to reproduce

Visit https://www.drupal.org/planet for the latest news on Drupal, and get outdated articles from 2017.

Proposed resolution

  • Limit feed items to 10, to mitigate the damage when it happens? This could be added to the Planet Drupal guidelines.
  • Update the aggregator module to prevent this?
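
One way to read the first option, on the consuming side, is a cap on how many items a single feed refresh may contribute. This is only a minimal sketch: `cap_feed_items()` is a hypothetical helper, not part of the Aggregator module, and the `title` and `timestamp` keys are assumptions about how a parsed item might look.

```php
<?php

/**
 * Keeps at most $limit items from one feed refresh, newest first.
 *
 * Hypothetical helper for illustration; not an Aggregator API.
 */
function cap_feed_items(array $items, int $limit = 10): array {
  // Sort newest first so that the cap drops the oldest items.
  usort($items, fn(array $a, array $b): int => $b['timestamp'] <=> $a['timestamp']);
  return array_slice($items, 0, $limit);
}

// Simulate a feed that suddenly republishes 50 items.
$items = [];
for ($i = 1; $i <= 50; $i++) {
  $items[] = ['title' => "Post $i", 'timestamp' => 1700000000 + $i];
}

$kept = cap_feed_items($items);
echo count($kept), ' items kept, newest: ', $kept[0]['title'], "\n";
```

A cap like this would not stop old posts from appearing, but it would bound how many can land on the page in one go.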

Remaining tasks

User interface changes

API changes

Data model changes

🐛 Bug report
Status

Active

Version

3.0

Component
Views 

Last updated about 1 hour ago

Created by

🇩🇰Denmark ressa Copenhagen


Comments & Activities

  • Issue created by @ressa
  • 🇩🇰Denmark ressa Copenhagen
  • Status changed to Postponed: needs info over 1 year ago
  • 🇺🇸United States drumm NY, US

    "Update the aggregator module to prevent this?"

    This is something that should be fixed in Drupal core. What are the steps to reproduce the issue? What exactly in the feed's XML is changing for every item?

  • 🇩🇰Denmark ressa Copenhagen

    It happened again today, 10 old posts from TEN7 at the top.

    Maybe @apaderno has an opinion about this, since he often takes care of Planet Drupal applications?

  • 🇩🇰Denmark ressa Copenhagen

    It happened again.

  • Status changed to Active over 1 year ago
  • 🇩🇰Denmark ressa Copenhagen

    ... and again. The entire first page of Drupal Planet is taken over by Gbyte with old posts.

    Setting to "Active" since I don't know how to debug this, but it's important to get it fixed, and leaving it as Postponed makes it easy to overlook.

  • 🇩🇪Germany gbyte Berlin

    I did update gbyte.dev from D9 to D10 yesterday so this is definitely a good hunch. I'll take a look as soon as I find the time. I take it there is no need to take any immediate action from my side?

  • 🇩🇰Denmark ressa Copenhagen

    Thanks for the fast reply and for confirming the "after updating to D10" hunch. I have created 📌 Clean up Drupal Planet for old posts (Active), so that part is taken care of.

    If you have any idea how to fix this in Aggregator for a permanent fix, that would be nice :)

  • 🇺🇸United States dave reid Nebraska USA

    This is a core bug in Drupal 10, let me find it.

  • 🇺🇸United States dave reid Nebraska USA

    🐛 Views RSS Feed Fields adds <time> tag (Active) is the core bug causing this.

  • 🇩🇰Denmark ressa Copenhagen

    Thanks for the link @Dave Reid. Do you think Aggregator in Drupal 7 should be updated in this issue to handle malformed tags, or should the problem be fixed in that other issue?

  • 🇩🇰Denmark ressa Copenhagen

    Another one.

  • 🇩🇰Denmark ressa Copenhagen
  • 🇸🇰Slovakia poker10

    If the problem is that the pubDate element contains additional markup that violates the RSS specification:

    <pubDate><time datetime="2023-08-10T13:43:13-07:00" class="datetime">Thu, 10 Aug 2023 13:43:13 -0700</time></pubDate>
    

    Then I think this needs to be fixed on the Views side, not in the Aggregator module. The Aggregator module has a fallback that uses the current date when no valid date is provided:

        // Try to resolve and parse the item's publication date.
        $date = '';
        foreach (array('pubdate', 'dc:date', 'dcterms:issued', 'dcterms:created', 'dcterms:modified', 'issued', 'created', 'modified', 'published', 'updated') as $key) {
          if (!empty($item[$key])) {
            $date = $item[$key];
            break;
          }
        }
    
        $item['timestamp'] = strtotime($date);
    

    I do not think it would be good to change this logic in this D7 phase.
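
To make the effect of that fallback concrete, here is a minimal sketch. `resolve_timestamp()` is a hypothetical helper, and its fallback to the current time is an assumption based on the description above, not a copy of the module's code. It suggests why an old post whose pubDate is wrapped in markup would be stamped with the time of import and therefore appear as new.

```php
<?php

/**
 * Resolves an item's timestamp roughly as described in the comment above.
 *
 * Hypothetical helper: the fallback to $now is an assumption, not the
 * Aggregator module's actual implementation.
 */
function resolve_timestamp(string $pubdate, int $now): int {
  $timestamp = strtotime($pubdate);
  // strtotime() returns FALSE for values it cannot parse.
  return $timestamp === FALSE ? $now : $timestamp;
}

$now = time();
$valid = 'Thu, 10 Aug 2023 13:43:13 -0700';
$wrapped = '<time datetime="2023-08-10T13:43:13-07:00" class="datetime">' . $valid . '</time>';

// A plain RFC 822 date keeps its original publication time.
echo 'plain:   ', gmdate('Y-m-d', resolve_timestamp($valid, $now)), "\n";
// The wrapped value cannot be parsed, so the old post is dated "now".
echo 'wrapped: ', gmdate('Y-m-d', resolve_timestamp($wrapped, $now)), "\n";
```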

  • 🇺🇸United States drumm NY, US

    Link to the spec - https://www.rssboard.org/rss-specification#ltpubdategtSubelementOfLtitemgt

    So it looks like the <time> element was introduced at some point. Following the git blame from https://git.drupalcode.org/project/drupal/-/blob/7cb99a11aca57643e8d66fc... it looks like that code has been functionally unchanged for many years. The date field rendering it uses likely introduced the extra element.
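
As a rough illustration of the spec point, the sketch below checks a pubDate value against PHP's built-in `DATE_RSS` format. `is_valid_pubdate()` is a hypothetical helper used only for this example. It also shows that the text content inside the `<time>` element is itself a valid date, which suggests that only the wrapping element is at fault.

```php
<?php

/**
 * Checks whether a pubDate value parses as an RSS (RFC 822 style) date.
 *
 * Hypothetical helper for illustration only.
 */
function is_valid_pubdate(string $value): bool {
  // DATE_RSS is PHP's format string for RSS dates: "D, d M Y H:i:s O".
  return DateTime::createFromFormat(DATE_RSS, trim($value)) !== FALSE;
}

$wrapped = '<time datetime="2023-08-10T13:43:13-07:00" class="datetime">Thu, 10 Aug 2023 13:43:13 -0700</time>';

// The markup makes the value unparseable as a date.
var_dump(is_valid_pubdate($wrapped));
// With the tags stripped, the remaining text parses fine.
var_dump(is_valid_pubdate(strip_tags($wrapped)));
```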

  • 🇩🇰Denmark ressa Copenhagen

    Ten more today, added to the list. Also, making it a child issue of 🐛 Views RSS Feed Fields adds <time> tag (Active), since fixing that would also stop Drupal Planet from getting flooded.

  • Status changed to Closed: duplicate over 1 year ago
  • 🇩🇰Denmark ressa Copenhagen
  • 🇩🇰Denmark ressa Copenhagen

    I had a look at the Aggregator module, and ordering the Planet Drupal views list by feed item post date descending might fix this problem, so adding this as a possible solution in the Issue Summary.

  • Status changed to Active about 2 months ago
  • 🇩🇰Denmark ressa Copenhagen
  • 🇮🇹Italy apaderno Brescia, 🇮🇹

    On /admin/config/services/aggregator/settings, there are no settings for ordering the posted articles.

    Since /planet is a path alias for /aggregator/categories/2, the output of that page is produced by the Aggregator module. There is no way to change how the articles listed there are ordered.
    I cannot say for certain whether the output on that page is altered in some way. I suspect it is, given the Blog text that appears at the top of the page. (The Aggregator module does not seem to categorize its output in such a way.)

  • 🇩🇰Denmark ressa Copenhagen

    Forgive me for leaving this important bit out, but my example was based on a quick View I threw together in Drupal 11, using the contrib Aggregator module (I updated the Issue Summary).

    I added the most important parts, and re-created the /planet content structure. Here is a sample, using three random feeds as sources:

    19 January 2025 | philipnorton42

    Creating queues using the core queue classes in Drupal is fairly straightforward. You just need…

    16 January 2025 | Irina Khramtsova

    Drupal CMS is up and running. Built through community collaboration, it’s now more accessible…

    6 January 2025 | p.johnson

    Experience the much anticipated new Drupal CMS by testing the Release Candidate now - whether…

    5 January 2025 | philipnorton42

    When writing data to the queue database system Drupal will convert the information to a string…

    22 December 2024 | philipnorton42

    I've talked a lot about the Batch API in Drupal recently, and I've mentioned that it is built…

    9 December 2024 | Irina Khramtsova

    Endorsed by Dries Buytaert in his keynote at DrupalCon Singapore, the Search Track is driving…

    21 November 2024 | Irina Khramtsova

    Learn how the Event Platform module simplified building the DrupalCamp Berlin 2024 website, its…

    18 November 2024 | breidert

    Explore how the integration of Drupal Recipes is transforming Drupal development by simplifying…

    15 November 2024 | Irina Khramtsova

    10 years after DrupalCity Berlin 2014 the community kicked-off another DrupalCamp in the heart…

    12 November 2024 | Irina Khramtsova

    DrupalCamp Berlin 2024 united 150+ Drupal enthusiasts, showcasing AI innovations, inspiring…

  • 🇩🇰Denmark ressa Copenhagen
  • 🇮🇹Italy apaderno Brescia, 🇮🇹

    If you are proposing to use a view to show the articles added to Planet Drupal (and the regional variants), this is something that needs to be proposed for the Drupal.org customizations project. None of the Aggregator module's settings allow a view to be used for showing the added articles.

  • 🇩🇰Denmark ressa Copenhagen

    Thanks for the tip, I have updated the "Project" value.

  • 🇮🇹Italy apaderno Brescia, 🇮🇹

    Using a view, it would also be possible to remove duplicates, I guess, or articles that do not match some criteria. It would resolve some issues that Planet Drupal occasionally has.
    It would no longer be possible to re-categorize the articles added to Planet Drupal, which is what the Aggregator module allows, and which makes it possible to move an article from Planet Drupal to Planeta Latinoamericano, for example.

  • 🇩🇰Denmark ressa Copenhagen

    You're right, it would give us a lot of flexibility, and new tools.

    For a start, it would prevent the flooding of the front page with up to 50 old posts, since these articles would simply be placed at the very end. And I guess it would also prevent duplicates, as you write?

    Interesting that it has been possible to move or recategorize posts ... Maybe the "flooding articles" could have been taken care of that way? Anyway, this will probably not become relevant again, since if a View is used, old posts are sorted by their actual publishing date, and placed very last.

    I look forward to drupal.org in Drupal 10, so that we can make this happen :)
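
The ordering and de-duplication discussed in the last few comments can be sketched in plain PHP, outside of Views. `order_planet_items()` is a hypothetical helper, and the `link` and `timestamp` keys are assumptions about the item structure. Note that this only helps if the stored timestamp reflects the real publishing date rather than the time of import.

```php
<?php

/**
 * Orders items newest first and drops repeated links.
 *
 * Hypothetical helper; the 'link' and 'timestamp' keys are assumptions.
 */
function order_planet_items(array $items): array {
  // Keep only the first item seen for each link.
  $unique = [];
  foreach ($items as $item) {
    $unique[$item['link']] ??= $item;
  }
  $unique = array_values($unique);
  // Newest first, so re-imported old posts sink to the end of the list.
  usort($unique, fn(array $a, array $b): int => $b['timestamp'] <=> $a['timestamp']);
  return $unique;
}

$items = [
  ['link' => 'https://example.com/old', 'timestamp' => strtotime('2017-03-01')],
  ['link' => 'https://example.com/new', 'timestamp' => strtotime('2025-01-19')],
  ['link' => 'https://example.com/old', 'timestamp' => strtotime('2017-03-01')],
];

$ordered = order_planet_items($items);
foreach ($ordered as $item) {
  echo date('j F Y', $item['timestamp']), ' | ', $item['link'], "\n";
}
```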
