- πΊπΈUnited States smustgrave
This issue is being reviewed by the kind folks in Slack, #needs-review-queue-initiative. We are working to keep the size of Needs Review queue [2700+ issues] to around 400 (1 month or less), following Review a patch or merge request β as a guide.
For the tests requested in #4
- Status changed to Active
3 months ago 4:11pm 29 March 2025 - πΊπΈUnited States dcam
I verified the problem on my Windows laptop. This is still an issue with the latest 11.x-dev.
The issue summary has been updated with information about the actual problem. This isn't the result of PHP "counting new line characters twice." The problem is that in Windows new lines are literally two characters, a carriage return and line feed (
\r\n
). JavaScript and browsers ignore the\r
when calculating string length. PHP counts them both.Thus, the former proposed fix and patch in #3 are unacceptable solutions. They would cause strings to pass form validation, but would eventually result Drupal trying to save strings that are too long into database fields whose sizes are set to the given maxlengths. That would result in even worse errors, probably WSoDs.
I don't know what the best solution is to the problem. A framework manager will have to decide.
- πΊπΈUnited States dcam
I wrote a very basic FunctionalJavascript test that demonstrates the issue. The test fails because it finds the server-side validation error message after submitting the form. It will have to be modified appropriately after a solution for the bug has been decided, but it demonstrates that there's a problem for now. I wanted to make it verify the contents of the textarea field before submitting the form, but I don't know how. This is literally the first FunctionalJavascript test that I've written.
Anyway, I'm leaving the Needs Tests tag since it will have to be improved or modified. But I'm changing the status to Needs Review to get the attention of a maintainer for an opinion on a fix.
- πΊπΈUnited States dcam
Form validation issues are described as having Major status.
- πΊπΈUnited States smustgrave
Not 100% sure putting into review will grab the attention.
Though it's good to have the hard part done in a test.
Will post in slack though to try and kick start it.
Looking at the HTML5 Standard, the maxlength attribute for a form control is validated by its "API Value".
In the case of textarea elements, the API value and value differ. In particular, newline normalization is applied before the maximum allowed value length is checked (whereas the textarea wrapping transformation is not applied).
And the newline normalization converts "\r\n" and "\r" to "\n". So in PHP, the maxlength should be validated against the submitted value with the same newline normalization.
This likely affects submissions regardless of the user's brower/OS. Also per the HTML5 standard, the submitted form values should have "\n" and "\r" normalized to "\r\n". So the form values in PHP should have "\r\n" exclusively. Testing on Chrome in MacOS, I can validate that this is what I've observed. Putting a breakpoint on these lines in
Drupal\Core\Form\FormValidator::performRequiredValidation()
:// Verify that the value is not longer than #maxlength. if (isset($elements['#maxlength']) && mb_strlen($elements['#value']) > $elements['#maxlength']) { $form_state->setError($elements, $this->t('@name cannot be longer than %max characters but is currently %length characters long.', [ '@name' => empty($elements['#title']) ? $elements['#parents'][0] : $elements['#title'], '%max' => $elements['#maxlength'], '%length' => mb_strlen($elements['#value']), ])); }
If I only hit "return" 20 times in a textarea with
#maxlength
20 and submit the form, then the value of$elements['#value']
is "\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n", andmb_strlen($elements['#value'])
is 40.I think best way forward is to do the same newline normalization in
Drupal\Core\Render\Element\Textarea::valueCallback()
, so that the maxlength validation is equivalent on the frontend and backend. I think it also makes sense to store this normalized value, so that there isn't an error if the same character limit is applied on the database column, but it needs to be validated that when the value is retrieved and presented back to browsers, that this doesn't create any issues across browsers/OSes.- πΊπΈUnited States dcam
Updated the issue summary with the appropriate solution.
- πΊπΈUnited States dcam
I implemented the prescribed fix per the HTML5 standard. The FunctionalJavascript test was reverted in favor of Unit test cases in the existing TextareaTest.
Because of the browser behavior described in #19, I think it makes sense to include a Browser test after all. I think it'd also make sense to test that the saved value(s) do not exceed DB column sizes, so I think the test could look like this:
- Create an content entity type with a
text
field (Drupal\text\Plugin\Field\FieldType\TextItem) and set the maxlength storage setting to something low - Visit entity create form and submit the form where the text field value length should equal the maxlength. This actually should be done three times, with "\r\n", "\n", and "\r" separately included in the text value
- Save should occur without a database exception
- Load the entity from database storage after save and confirm that the text contains "\n" and not "\r\n" or "\r"
- Create an content entity type with a
- πΊπΈUnited States dcam
@godotislate this issue is about textarea elements. It looks like core's fields with textarea elements don't have settings for maxlength attributes. Am I missing something?
Ah right, I'd forgotten that the textarea widget isn't used on
text
field items. Maybe the unit test is enough, so putting to RTBC.Applied the MR suggestion and rebased. Since the change is minor and there were no other suggestions, going to push this back to RTBC.
- Status changed to RTBC
13 days ago 3:15pm 29 June 2025 - π¬π§United Kingdom alexpott πͺπΊπ
Some thoughts while reviewing this issue.
- Is it really okay to change user input. Normally Drupal saves what the user enters and makes changes on output. This is pretty much how the filter system works. We do have precedence for changing the value on input TextField does
return str_replace(["\r", "\n"], '', $input);
. - Should we make the change to \Drupal\Core\Form\FormValidator::performRequiredValidation so any contrib or custom implementations benefit given that that is where max length validation occurs.
- What about \Drupal\Core\Validation\Plugin\Validation\Constraint\LengthConstraint and the constraints - looking at our text field types we are not using constraints - but we probably should at some point.
- This will make values submitted via forms behave different from JSON:API - which will be surprising.
I'm not sure what all these thoughts mean at this point but I just wanted to note them down.
- Is it really okay to change user input. Normally Drupal saves what the user enters and makes changes on output. This is pretty much how the filter system works. We do have precedence for changing the value on input TextField does
Is it really okay to change user input. Normally Drupal saves what the user enters and makes changes on output.
The reason to do that here is that it's possible that database storage has the same character limit, so without converting "\r\n" to "\n", it's possible there could be a database error if it passes JS and PHP validation. I don't think core has any instance of character-limited storage for a field that has textareas used for entry, but it seems possible for something in contrib or custom code.
Also, the browser is already converting "\n" or "\r" to "\r\n" on submit, so technically for users on non-Windows clients, the input they entered has already been changed.
I think I did try awhile ago to see if other web projects might be handling "\r\n" for similar concerns, but didn't find anything conclusive.
- π¬π§United Kingdom alexpott πͺπΊπ
Also, the browser is already converting "\n" or "\r" to "\r\n" on submit, so technically for users on non-Windows clients, the input they entered has already been changed.
Really? Yes it does.... I've confirmed that with a debugger. This is quite messed up right? What a PITA. FWIW though how real is it to enforce max length on the database level for a field using a text area? And is doing that even correct. I don't think it is. Sure you can enforce max length on the field level but backing a text area field with anything other than big text field feels like you are opening a can of worms.
Also I think that point 4 from #27 is important.
Really? Yes it does.... I've confirmed that with a debugger. This is quite messed up right? What a PITA.
Yes, this was surprising to me as well, but apparently it is HTML spec. See #19 π Maxlength validation counts newlines twice Needs work .
Sure you can enforce max length on the field level but backing a text area field with anything other than big text field feels like you are opening a can of worms.
I don't think I necessarily disagree, esp for sites with content translation, but is it obvious? #5 π Maxlength validation counts newlines twice Needs work and #13 π Maxlength validation counts newlines twice Needs work brought up the storage concerns.
Also, it's not uncommon for stakeholders to request to have title (replacement) fields that support formatted text and new lines. This could be implemented as a TextItem field, with an alter to allow a textarea widget to be used. But maybe it's too edge-casey?Also I think that point 4 from #27 is important.
Agreed, not sure about what to do about that and open to ideas. Is maxlength checked at all for JSON:API requests, since the validation is done in FormValidator?
- π¦πΊAustralia larowlan π¦πΊπ.au GMT+10
In
\Drupal\Core\Render\Element\Textfield::valueCallback
we havereturn str_replace(["\r", "\n"], '', $input);
which unless I'm mistaken is also problematic here because in the case of windows,\r\n
becomes\n\n
and would result in a different max-length validation to that of the browser - should we be handling that (And any other text-like inputs like telephone, machine name etc) here as well - the issue title doesn't specifically mention text areas. In
\Drupal\Core\Render\Element\Textfield::valueCallback
we havereturn str_replace(["\r", "\n"], '', $input);
That
str_replace()
strips out every instance of\r
or\n
from the input, so\r\n
would be stripped out as well. Though technically, textfields and other<input>
elements don't allow new lines in input: https://stackoverflow.com/a/42744064. That's a stackoverflow answer, but from a quick Google, official resources seem to say the same without being as declarative.Doing some git archaeology, I found that that
str_replace
comes from way back in #53480: required checkbox not stylized like a required field β , and it was for form input hardening (see #6 β ):We remove linebreaks from mere textfields, there is no way this breaks user data, only hackers are interested. This puts many header injection attacks at rest.
If we need to add similar hardening for non-form submissions to telephone, machine_name, etc., I think that'd be a separate issue?
Back to textareas, a couple thoughts:
I don't think there are any field widgets in core that set the
#maxlength
property on aTextarea
form element object. So if we gate replacing"/\r\n?/"
with"\n"
based on whether#maxlength
is set, something like this:
return isset($element['#maxlength']) ? preg_replace("/\r\n?/", "\n", (string) $input) : (string) $input;
then values submitted via JSON:API requests will be the same as content entity form submissions, except for contrib/custom field widgets that set#maxlength
on the Textarea. Arguably it would be on the devs implementing the field widgets, and possibly their respective field types, to add the appropriate Constraints and normalize the JSON:API data.An alternative idea is to add a kernel request event subscriber that goes through all properties in
$request->query
and$request->request
, and request content as well for requests with JSON bodies, and replace all\r\n
with\n
. But doing this for request content is tricky, and I think this approach overall might be overly aggressive.- πΊπΈUnited States dcam
I don't think there are any field widgets in core that set the #maxlength property on a Textarea form element object.
There aren't, unless something changed recently. I checked before writing #23.
- π¦πΊAustralia larowlan π¦πΊπ.au GMT+10
Gating it behind fields with a maxlength is a great idea - nice one
- First commit to issue fork.
- π¬π§United Kingdom oily Greater London
Some work needed on the test case. Add newlines to the random text and test how that affects the length?
Gating it behind fields with a maxlength is a great idea - nice one
Added this to the MR and additional tests. Ready for review again.
- π¬π§United Kingdom oily Greater London
@godotislate I have looked at your new kernel test coverage. Can you please advise me if you think I should refactor my test coverage 'borrowing' from your code to take account of new lines? I am not sure my test coverage is currently fit for purpose?
Can you please advise me if you think I should refactor my test coverage 'borrowing' from your code to take account of new lines?
I don't think it's easily accomplishable in a unit test. As for the rest, I would just leave it as is.
- π¬π§United Kingdom oily Greater London
#29 +1. I realised I had come across this before with Symfony textarea field. I wanted to allow the user to enter a list, actually Black Cab runs which students of the Knowledge study and memorise, for example,
Leave on Left Green Lanes
Right Highbury New Park
etc
etc
etc
Forward into Gibson SquareAll the runs follow a similar pattern, but they vary a lot in length (how many lines in total).
You have a choice to either present one textbox per line or a textarea for all the lines. Thinking it through I realised that the newlines are easily handled by explode() on the TEXT (string) value. Then custom validation can be applied to the array. The flexibility of the textarea field type, including the length of the field value, is what seems to make it so useful.
I think there might be edge cases where a maxlength value is desirable. But IMO it is best to leave the core textarea field as it is. Developers can create a custom field or use custom validation. But such cases will be relatively infrequent.