Descriptions with newline characters cause invalid ics files to be generated

Created on 27 September 2024, 5 months ago

Problem/Motivation

My university uses Drupal 10 for their website. They have an "add to calendar" button on each event's page that downloads an ICS file. I noticed that both Thunderbird and Google Calendar did not like how this file was formatted.

I identified multiple issues with it, some of which have since been fixed. (I believe adding the "Z" to datetime values in version 4.0.7, commit 820c86d8115873df01f3999121443fc38d8c79ab, was one of the fixes, so consider that a lower bound on the version in use.)

However, I still believe there is an issue with the description field. It seems the event description is being provided to date_ical as text containing multiple lines, and these are being inserted into the ICS in a way that doesn't quite follow the specification (the relevant section of the technical documentation is https://www.rfc-editor.org/rfc/rfc5545#section-3.1).

Essentially, if the description is more than a single line long, the second and any subsequent lines should be inserted into the ICS file beginning with a single space (if the line was broken in the middle of a word), or potentially two spaces (if the description contains a space at the point where it was split).
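For reference, RFC 5545 treats two concerns separately: a newline inside a TEXT value such as DESCRIPTION is written as the two-character escape \n (section 3.3.11), and a content line longer than 75 octets is folded by inserting CRLF followed by a single space or tab (section 3.1). The following is a minimal PHP sketch of both steps, assuming valid UTF-8 input. The function names ics_escape_text and ics_fold_line are hypothetical and are not part of Date iCal or iCalcreator.

```php
<?php
// Hypothetical helpers for illustration only; not Date iCal or iCalcreator API.

/** Escape a plain-text value as an RFC 5545 TEXT value (section 3.3.11). */
function ics_escape_text(string $text): string {
  // Normalise line endings so every newline becomes exactly one "\n" escape.
  $text = str_replace(["\r\n", "\r"], "\n", $text);
  // The backslash is replaced first so the backslashes added afterwards
  // are not escaped a second time.
  return str_replace(["\\", ";", ",", "\n"], ["\\\\", "\\;", "\\,", "\\n"], $text);
}

/** Fold one content line so no physical line exceeds 75 octets (section 3.1). */
function ics_fold_line(string $line, int $limit = 75): string {
  $out = '';
  $current = '';
  // Split into UTF-8 characters so a multi-byte character is never cut in half.
  foreach (preg_split('//u', $line, -1, PREG_SPLIT_NO_EMPTY) as $char) {
    if (strlen($current) + strlen($char) > $limit) {
      $out .= $current . "\r\n";
      // A continuation line starts with one space, which counts towards the limit.
      $current = ' ';
    }
    $current .= $char;
  }
  return $out . $current;
}

// Usage: escape first, then fold the complete "NAME:value" line.
$line = ics_fold_line('DESCRIPTION:' . ics_escape_text("First line\nSecond line"));
```

A reader reverses the fold by deleting each CRLF together with the single whitespace character that follows it, so a fold in the middle of a word does not change the value.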

Steps to reproduce

Unfortunately I don't have admin permissions, so I cannot test these steps myself.

1. Create an event with a description that contains at least one newline
2. Try to export it to ICS
3. Try to import the ICS into a calendar program

Expected result: ICS file correctly handles the newline in the description

Likely actual result: The file doesn't contain the leading spaces required by the spec and fails to import
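One way to check a downloaded file for this symptom without admin access is to scan it for physical lines that are neither a property line nor a folded continuation. The sketch below is a rough heuristic under that assumption; the function name ics_find_stray_lines is hypothetical, and a stray line that happens to begin with a word followed by a colon or semicolon would not be caught.

```php
<?php
// Hypothetical diagnostic helper; not part of Date iCal or iCalcreator.

/**
 * Return the 1-based numbers of physical lines that look like neither
 * "NAME:..." / "NAME;param..." nor a continuation line (leading space or tab).
 */
function ics_find_stray_lines(string $ics): array {
  $bad = [];
  // Accept both CRLF and bare LF, since saved files often lose the CR.
  foreach (preg_split('/\r?\n/', rtrim($ics, "\r\n")) as $i => $line) {
    $isContinuation = $line !== '' && ($line[0] === ' ' || $line[0] === "\t");
    $isProperty = preg_match('/^[A-Za-z0-9-]+[;:]/', $line) === 1;
    if (!$isContinuation && !$isProperty) {
      $bad[] = $i + 1;
    }
  }
  return $bad;
}

// A raw newline in the description leaves "Second line" as a stray line.
$broken = "BEGIN:VEVENT\r\nDESCRIPTION:First line\r\nSecond line\r\nEND:VEVENT\r\n";
$stray = ics_find_stray_lines($broken);
```

An empty result does not prove the file is valid; it only rules out this particular failure.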

Proposed resolution

I believe it is reasonable to add some handling of descriptions to this plugin (it seems there is already some basic HTML sanitization happening) so that multi-line descriptions are handled properly when they are added to a generated ICS file.

I'm willing to help with this, but I am a new contributor, so I may need some guidance on how to set up an environment to test this (either with unit tests or inside an actual Drupal instance).

Remaining tasks

User interface changes

None needed

API changes

None needed

Data model changes

None needed

🐛 Bug report
Status

Active

Version

4.0

Component

iCal Export

Comments & Activities

  • Issue created by @moralcode
  • 🇫🇷 France lazzyvn paris

    I think you can test this and report it to iCalcreator.
    It uses $event->setDescription($longText), so that method is the one that must handle RFC 5545.
    Line 88 (// render description) just removes all unsupported HTML tags.
    I tried a multi-line description on an iPhone and it works fine.

  • 🇬🇧 United Kingdom stevewilson

    I'm not sure that my issue is quite the same as the one reported here, but I too am finding that the DESCRIPTION property is not being formatted as expected in the .ics feed.

    I've attached an example .ics file (saved as .txt for uploading: calendar-test-d10.txt) created in Drupal 10 (10.3.10) with Date iCal 4.0.10. If you open this you will see that the DESCRIPTION property contains HTML paragraph tags where I was expecting to see \n as the paragraph separator. Also, I would have expected the HTML <strong> tag to have been removed.

    When entered into the calendar on my Android phone/tablet, all is well, but when loaded into the Microsoft Outlook calendar on my desktop PC, the HTML tags are not recognised as such and appear uninterpreted (see the attached screenshot of the Outlook entry: calendar-test-d10_in_Outlook.png).

    My setup in Drupal 10, as far as I can see, mimics the one I have been using successfully in Drupal 7 for years without issue. The same calendar entry generated there (using Date iCal 7.x-3.12) has all HTML tags removed and the expected \n to separate paragraphs (see calendar-test-d7.txt, attached). This renders correctly in both Outlook and my Android devices.

    In the above cases I have a single content field configured as the DESCRIPTION property. I have found, however, that if the iCal view mode display for my content type is set to display that same single field, and I include this as a "Rendered entity" field in my iCal feed view and set it as the DESCRIPTION property, then in addition to the DESCRIPTION property in the .ics feed I also see the X-ALT-DESC property (see calendar-test-render-d10.txt). When I load that feed it renders A-OK in both the Outlook calendar and my Android devices (Outlook screenshot: calendar-test-render-d10_Outlook.png).

    That's great, but I want to be able to create some calendar feeds with the DESCRIPTION property set to the rendered entity and others with an individual content field configured directly as the DESCRIPTION property. With the X-ALT-DESC property present only when using the rendered entity method, the directly configured individual content field method isn't working for me.

    Am I failing to configure something correctly, or is there really an issue when an individual content field is configured directly as the DESCRIPTION property, rather than a rendered entity being configured as the DESCRIPTION?

  • 🇫🇷 France lazzyvn paris

    In your description field settings you can select the option to strip HTML tags.
    Most of the new generation ical feeds accept HTML code (I have not tested Outlook, but you can test on the latest version or on the new Outlook on Windows 11). If only Outlook doesn't support HTML in the description, the module date ical won't support it.

  • 🇬🇧 United Kingdom stevewilson

    Interesting. Which standard are "Most of the new generation ical feeds" working to? Clearly not rfc5545.

    I had already tried stripping HTML tags from the description field; the DESCRIPTION property then displays as a single paragraph, which is really not acceptable, particularly as I generally "rewrite" 2 or 3 fields to generate the DESCRIPTION property.
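To illustrate the single-paragraph effect described above: if tags are stripped without first turning paragraph boundaries into newlines, the separation is lost. The following is a minimal sketch of one alternative, assuming the iCal library escapes real newlines in TEXT values when it writes the property. The function name html_to_plain_description is hypothetical and is not part of Date iCal.

```php
<?php
// Hypothetical helper for illustration only; not part of Date iCal.

/** Convert simple HTML to plain text, keeping one newline per block boundary. */
function html_to_plain_description(string $html): string {
  // Turn <br> and closing block-level tags into newlines before stripping.
  $text = preg_replace('~<br\s*/?>|</(p|li|h[1-6])\s*>~i', "\n", $html);
  // Strip tags first, then decode, so escaped angle brackets in the text survive.
  $text = html_entity_decode(strip_tags($text), ENT_QUOTES | ENT_HTML5, 'UTF-8');
  // Collapse runs of spaces and tabs, but leave newlines alone.
  $text = preg_replace('/[ \t]+/', ' ', $text);
  // Trim spaces around each newline, then squeeze repeated newlines into one.
  $text = preg_replace('/ *\n */', "\n", $text);
  $text = preg_replace('/\n{2,}/', "\n", $text);
  return trim($text);
}

$plain = html_to_plain_description(
  '<p>First <strong>para</strong>.</p> <p>Second &amp; last.</p>'
);
```

Note that a later whitespace-collapsing step that matches \s, such as the preg_replace on the description in DateICal.php, would also match newlines, so it would need to run before this conversion or be narrowed to spaces and tabs.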

    I'm surprised that you say "If only Outlook doesn't support HTML in the description, the module date ical won't support it" as you are already doing so - to a degree - note the mention of Microsoft in the following section of date_ical/src/DateICal.php:

    // Render description.
    if (!empty($field['description_field'])) {
      $description = trim(strip_tags(html_entity_decode($field['description_field']), '<a><b><u><strong><ul><ol><li><br><hr><h5><h4><h3><h2><h1><p>'));
      $event->setDescription(preg_replace('/(\s{2,})/', ' ', $description));
      // Check html add X-ALT_DESC support VCal Microsoft.
      if ($description != $field['description_field']) {
        $event->setXprop('X-ALT-DESC;FMTTYPE=text/html', preg_replace('/(\s{2,}|\r?\n)/', ' ', $field['description_field']));
      }
    }
    

    I'm finding that when the description field is a rendered entity, production of the X-ALT-DESC property is triggered, which Outlook uses in preference to the DESCRIPTION property, and all is well. If the X-ALT-DESC property were created in all cases, my issue would, I believe, be resolved - but maybe there's a reason why its use is limited.

    On a positive note, I now realise that I can achieve everything I need by using the rendered "iCal" view mode as the description field for one calendar export, and the "Teaser" view mode for another - so my issue is now resolved, thank you.

  • 🇫🇷 France lazzyvn paris

    I tested the description field in all the iCal readers I have, except Outlook, and found that most of them support HTML with basic tags. So I chose to leave the supported tags in place rather than remove all tags. Outlook has very few users. When you render an entity, it contains more than the basic tags, so X-ALT-DESC is added. The reason I have to limit X-ALT-DESC is that it creates duplicate content, and it sometimes causes crashes if the content is too long: you can see that X-ALT-DESC is rendered on one line, and another reader will crash (I don't remember exactly which, maybe Thunderbird).

  • 🇬🇧 United Kingdom stevewilson

    Apologies for returning to this, but I've noticed an occasional anomaly when viewing calendar entries in the Samsung calendar app on my phone: an HTML tag is sometimes visible, e.g. </a>, </p>. Investigation has shown that HTML tags are not always properly represented within the calendar feed. Anomalies are seen when line folding, according to rfc5545, occurs within a tag. Some examples (once again as .txt files):

    1. calendar-trailing-p.txt - Here, a requirement for line folding occurs between the "<" and "/" of a trailing </p> tag. The "<" and ">" have here been converted to &lt; and &gt; and, on my phone, the tag is visible as </p>.
    2. calendar-trailing-mid-strong.txt - In this case, line folding is required within a trailing </strong> tag, but not immediately following the leading "<". Here, the </strong> tag has been removed completely. The Samsung calendar app appears not to recognise the <strong> tag anyway, and is unaffected by the loss of the trailing tag. The modified trailing </p> tag seen in example 1 is also seen in this example; once again this tag is visible in the Samsung calendar entry.
    3. calendar-leading-strong.txt - Here, a line is folded immediately following the opening "<" of a leading <strong> tag. As with the </p> tag in example 1, the "<" and ">" have here been converted to &lt; and &gt; and, on my phone, the leading <strong> tag is made visible. The trailing </strong> tag is ignored.

    Whilst these examples were all captured with the description field configured as a rendered entity, the same behaviours are seen with a simple content field as the description field. The same behaviour can also be seen in the X-ALT-DESC property, and as a consequence HTML tags are occasionally visible in my Outlook calendar. I don't have ready access to other calendar apps so can't say how they might respond to these modified HTML tags, but they will all be receiving them wherever line folding occurs within an HTML tag.

    Line folding is, I presume, performed by iCalcreator. Whether the behaviour I'm seeing is caused by a bug in iCalcreator, or whether iCalcreator is simply not designed to accommodate HTML as an input I cannot say, but it seems clear that there is currently an incompatibility between Date iCal and iCalcreator, which should be investigated further. I am not a developer, so doubt that I can assist with that, but am happy to be involved in the testing of any changes that may be proposed.
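One way to narrow down where the tags are being altered is to unfold the feed as RFC 5545 section 3.1 describes (delete each CRLF plus the single whitespace character after it) and inspect the resulting value. The sketch below assumes a simple feed in which parameter values contain no colons; the function names ics_unfold and ics_property are hypothetical.

```php
<?php
// Hypothetical diagnostic helpers; not part of Date iCal or iCalcreator.

/** Undo RFC 5545 line folding. */
function ics_unfold(string $ics): string {
  // Remove each line break that is immediately followed by one space or tab.
  // A bare LF is tolerated as well, since saved files often lose the CR.
  return preg_replace('/\r?\n[ \t]/', '', $ics);
}

/** Return the unfolded value of the first property with the given name, or null. */
function ics_property(string $ics, string $name): ?string {
  foreach (preg_split('/\r?\n/', ics_unfold($ics)) as $line) {
    // Match "NAME:" or "NAME;param=...:" at the start of an unfolded line.
    if (preg_match('/^' . preg_quote($name, '/') . '(;[^:]*)?:(.*)$/i', $line, $m)) {
      return $m[2];
    }
  }
  return null;
}

// A fold between "<" and "/" is harmless once the line is unfolded.
$ics = "BEGIN:VEVENT\r\nDESCRIPTION:<p>Hello <\r\n /p>\r\nEND:VEVENT\r\n";
$value = ics_property($ics, 'DESCRIPTION');
```

If the unfolded value shows the tag intact, the fold itself is not the problem and the consuming app is the more likely culprit. If the unfolded value already contains &lt; and &gt;, as in the attached examples, the alteration happened on the generating side before or during folding.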
