- Issue created by @mlncn
- 🇺🇸 United States mlncn Minneapolis, MN, USA
Problem identified: The text here:
the “10% in Vermont” local investment program.
has multibyte characters for the fancy quotation marks, and that throws off
substr_replace
which is not multibyte safe.
- 🇺🇸 United States mlncn Minneapolis, MN, USA
Surprisingly little discussion about dealing with this. There's https://github.com/wpsharks/comet-cache/issues/703 and,
from https://sourceforge.net/p/wikindx/svn/HEAD/tree/trunk/core/libs/UTF8.php comes one replacement function:
/**
 * Simulate substr_replace() for multibytes strings
 *
 * @param string $string
 * @param string $replacement
 * @param int $start
 * @param int $length Default is NULL.
 * @param string $encoding Default is NULL.
 *
 * @return string
 */
function mb_substr_replace($string, $replacement, $start, $length = NULL, $encoding = NULL)
{
    $string_length = (is_null($encoding) === TRUE) ? mb_strlen($string) : mb_strlen($string, $encoding);

    if ($start < 0) {
        $start = max(0, $string_length + $start);
    } elseif ($start > $string_length) {
        $start = $string_length;
    }

    if ($length < 0) {
        $length = max(0, $string_length - $start + $length);
    } elseif ((is_null($length) === TRUE) || ($length > $string_length)) {
        $length = $string_length;
    }

    if (($start + $length) > $string_length) {
        $length = $string_length - $start;
    }

    if (is_null($encoding) === TRUE) {
        return mb_substr($string, 0, $start) . $replacement . mb_substr($string, $start + $length, $string_length - $start - $length);
    }

    return mb_substr($string, 0, $start, $encoding) . $replacement . mb_substr($string, $start + $length, $string_length - $start - $length, $encoding);
}
But it turns out we can use something even simpler: a pared-down variant of the short version suggested in a comment on the PHP manual page for substr_replace, https://www.php.net/manual/en/function.substr-replace.php#48066
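For illustration, here is a minimal sketch of what such a simplified replacement might look like, run against the problem text from above. It is not the code from the manual comment or from any patch on this issue: the function name mb_substr_replace_simple is hypothetical, it only handles non-negative $start and $length, and it assumes the mbstring extension and UTF-8 input.

```php
<?php

/**
 * Hypothetical helper (name not from this issue): replace $length characters
 * of $string, starting at character offset $start, with $replacement.
 * Assumes non-negative $start and $length and the mbstring extension.
 */
function mb_substr_replace_simple($string, $replacement, $start, $length, $encoding = 'UTF-8') {
  // Everything before the replaced span, counted in characters, not bytes.
  $before = mb_substr($string, 0, $start, $encoding);
  // Everything after the replaced span; a NULL length means "to the end".
  $after = mb_substr($string, $start + $length, NULL, $encoding);
  return $before . $replacement . $after;
}

$text = 'the “10% in Vermont” local investment program.';

// Character offset of "local", found with a multibyte-aware search.
$start = mb_strpos($text, 'local', 0, 'UTF-8');

// substr_replace() reads that character offset as a byte offset. Each curly
// quote is three bytes in UTF-8, so the replacement lands four bytes early.
$broken = substr_replace($text, 'LOCAL', $start, 5);
// 'the “10% in VermontLOCALocal investment program.'

$fixed = mb_substr_replace_simple($text, 'LOCAL', $start, 5);
// 'the “10% in Vermont” LOCAL investment program.'
```

In this sample the byte-based call happens to swallow the whole closing quote, so the result is wrong but still valid UTF-8; other offsets can split a character in half and produce invalid output.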
- Status changed to Needs review
over 1 year ago 8:59pm 14 September 2023