@JosJuice, here is what I think what happens:
He modifies the string containing the utf-8 characters using string.sub( (or something else). Since this is intented for ascii, it modifies the wrong bytes which results in the weird characters. He needs to use string.usub to edit it as utf-8.
Ah, now I understand. So string.sub( attempts to modify a single byte of a multi-byte character?