Multibyte-safe replacement for substr_replace()
Posted by jamie on 10 Feb 2011 in Activity log
While writing the function to grab an 'excerpt' of text from an utterance (i.e. a chunk of text surrounding a search term), I discovered that substr_replace() is not multibyte-safe. It counts offsets and lengths in bytes rather than characters, so with a multibyte string (UTF-8, for example) it can cut a character in half and leave you with garbled output.
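To make the byte-versus-character problem concrete, here is a small sketch. It assumes the mbstring extension is loaded and that strings are UTF-8; the sample string and variable names are only for illustration.

```php
<?php
// "héllo" written with an explicit UTF-8 byte sequence for 'é' (0xC3 0xA9),
// so the example does not depend on how this file is saved.
$text = "h\xC3\xA9llo";

// strlen() counts bytes, mb_strlen() counts characters.
$bytes = strlen($text);              // 6
$chars = mb_strlen($text, 'UTF-8');  // 5

// Try to replace the first two characters with '..'.
// substr_replace() removes two *bytes*: 'h' and the first half of 'é'.
$broken = substr_replace($text, '..', 0, 2);

// The mb_substr() version removes two *characters*: 'h' and 'é'.
$safe = '..' . mb_substr($text, 2, null, 'UTF-8');

var_dump(mb_check_encoding($broken, 'UTF-8')); // bool(false): a stray byte is left behind
var_dump($safe);                               // string(5) "..llo"
```

The stray continuation byte left in $broken is what typically shows up as a replacement character or question mark in the browser.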
The excerpt() function, which is otherwise very solid and flexible, is adapted from CakePHP's text helper. I had to replace these two lines:
$excerpt = substr_replace($excerpt, $ending, 0, $phraseLen);
$excerpt = substr_replace($excerpt, $ending, -$phraseLen);
With these multibyte-safe lines:
$excerpt = mb_substr($excerpt, 0, 0) . $ending . mb_substr($excerpt, $phraseLen);
$excerpt = mb_substr($excerpt, 0, -$phraseLen) . $ending;
So, in a nutshell, here's how to convert your substr_replace() call to a multibyte version:
// Original call
$string = substr_replace($text, $replacement, $start, $length);
// MB-safe call (assumes $start and $length are both non-negative)
$string = mb_substr($text, 0, $start) . $replacement . mb_substr($text, $start + $length);
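If you also need the negative-offset and omitted-length forms of substr_replace(), the conversion can be wrapped in a helper. The following is only a sketch: mb_substr_replace_sketch() is a hypothetical name, not a built-in PHP or CakePHP function, and it assumes PHP 7.1+ with mbstring, UTF-8 input by default, and string (not array) arguments.

```php
<?php
/**
 * Hypothetical character-based counterpart to substr_replace().
 * Handles string arguments only; $start and $length follow the same
 * conventions as substr_replace(), but are counted in characters.
 */
function mb_substr_replace_sketch(
    string $text,
    string $replacement,
    int $start,
    ?int $length = null,
    string $encoding = 'UTF-8'
): string {
    $textLen = mb_strlen($text, $encoding);

    // Turn $start into an absolute offset in the range 0..$textLen.
    if ($start < 0) {
        $start = max(0, $textLen + $start);
    } else {
        $start = min($start, $textLen);
    }

    // Turn $length into the absolute offset where the replaced span ends.
    if ($length === null) {
        $end = $textLen;                         // replace to the end
    } elseif ($length < 0) {
        $end = max($start, $textLen + $length);  // stop that many chars before the end
    } else {
        $end = min($textLen, $start + $length);  // replace $length chars
    }

    // Head + replacement + tail, all sliced by character.
    return mb_substr($text, 0, $start, $encoding)
        . $replacement
        . mb_substr($text, $end, null, $encoding);
}

// Usage: the same two kinds of call the excerpt() function makes.
echo mb_substr_replace_sketch("h\xC3\xA9llo", '..', 0, 2), "\n"; // ..llo
echo mb_substr_replace_sketch("h\xC3\xA9llo", '..', -2), "\n";   // hél..
```

Working with an absolute end offset rather than a length keeps the negative-argument cases in one place, so the final concatenation is the same three-part pattern as the one-liner above.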