- last update
over 1 year ago 139 pass, 2 fail - Status changed to Needs work
over 1 year ago 5:10am 27 June 2023 - last update
over 1 year ago 140 pass, 2 fail - πΊπΈUnited States dcam
As I dug into this the first thing that I noticed is that my site is importing authors. The difference is that the items it's importing have the
<dc:creator>
tag instead of<author>
. My own test feed is generated by Drupal 10's default frontpage view. The RSS version is 2.0. But the one that's being tested by the module is version 0.91. It seems like we aren't compatible with this older version of RSS.I think we could keep the scope of this issue as it is, but this makes me think that we may need to spin off another issue to test different RSS versions separately.
- πΊπΈUnited States dcam
I did further research. Essentially, laminas-feed has strict requirements for the contents of the
<author>
field. I believe they are following the guidance from https://www.rssboard.org/rss-2-0-1:<author>
is an optional sub-element of<item>
.It's the email address of the author of the item. For newspapers and magazines syndicating via RSS, the author is the person who wrote the article that the
<item>
describes. For collaborative weblogs, the author of the item might be different from the managing editor or webmaster. For a weblog authored by a single individual it would make sense to omit the<author>
element.<author>lawyer@boyer.net (Lawyer Boyer)</author>
See the relevant lines of code from laminas-feed at https://github.com/laminas/laminas-feed/blob/4efdf913e7bb93310f9afd5d426.... They just try to do a best effort attempt to parse
<author>
fields, expecting that it will contain an email address and possibly a person's name in parenthesis. This matches the syntax example shown above. They start by checking for an '@', assuming that it is an email address. If they don't find one, then they don't bother doing anything else with the data. It's effectively discarded.There's a further problem in how Aggregator uses the author data. Any email address is stored in an
email
key. If there's a name in parenthesis, that gets stored in thename
key. We only use thename
key from the results. So even if there's data in theemail
key, we ignore it. This may have been done because the laminas-feed docs only demonstrate using thename
key.Based on this, it doesn't seem to be an RSS version-specific problem. laminas-feed simply doesn't support random data in an
<author>
field. It's probably fine in<dc:creator>
fields because laminas-feed hands off that processing to a separate library. So I guess it's our move to decide how we want to handle this. - πΊπΈUnited States dcam
I expanded the test from #8 to check for authors that are compatible with laminas-feed. So my new assertion should pass. It does locally.
- last update
over 1 year ago 139 pass, 4 fail - πΊπΈUnited States dcam
I updated DeleteFeedItemTest to account for the new item.
- last update
over 1 year ago 140 pass, 2 fail - πΊπΈUnited States dcam
I'm open to suggestions about what to do about this. On one hand this is probably undesirable behavior. On the other hand we would have to do extra work outside of what laminas-feed does for us in order to import arbitrary
<author>
data. Do we just document the problem and say we don't support arbitrary<author>
data? Or do we try to fix the problem? - πΊπΈUnited States dcam
Based on these findings I don't think this qualifies as being Major.
- Status changed to Needs review
about 1 year ago 6:01am 1 October 2023 - last update
about 1 year ago 153 pass, 2 fail - πΊπΈUnited States dcam
I'm uploading my work to make sure it's saved. I know it won't pass the new test.
I decided to go beyond what laminas-feed does and try to import any data that's in the
<author>
, if it's populated. To do this I've created an extension for laminas Entry parsing.Unfortunately, it isn't working and I don't know why. The extension and function for getting the author are called. But there's something wrong with my implementation or the underlying code. The base
AbstractEntry
class contains anentryKey
property that tells any extension which<item>
from the feed is being parsed. For some reason, in this new extension theentryKey
is 1 (technically a value of 0), either because that's the default or because of some bug. But we're trying to parse number 4. So the extension gets the XML for entry 1, tries to parse the author from it (it has none), and fails the test. The last submitted patch, 17: 3037567-17.patch, failed testing. View results β
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.- Status changed to Needs work
about 1 year ago 6:16am 1 October 2023 - πΊπΈUnited States dcam
Inspiration for this came from the official docs, https://docs.laminas.dev/laminas-feed/reader/#writing-laminasfeedreader-..., and from the other extensions. Unfortunately, I don't see where we're doing anything different. I'm starting to think that it's because the RSS Entry class only sets the
entryKey
for extensions that it expects out-of-the-box. We may have to set the key ourselves. I'd already thought of that, but dismissed it as a workaround hack. But that may be the only way to get the job done. - Status changed to Needs review
about 1 year ago 3:45pm 1 October 2023 - last update
about 1 year ago 158 pass - πΊπΈUnited States dcam
It just needed one small change to the check for existing author data.
- πΊπΈUnited States dcam
And the moral of the story is: don't debug late at night when you're tired.
- Status changed to RTBC
about 1 year ago 10:54pm 3 October 2023 - π¦πΊAustralia larowlan π¦πΊπ.au GMT+10
Looks good to me, now I need to finish reading the great gatsby for I too want to get out and walk eastward toward the park through the soft twilight
- last update
about 1 year ago 158 pass - Status changed to Fixed
about 1 year ago 11:27pm 3 October 2023 Automatically closed - issue fixed for 2 weeks with no activity.