David Wheeler <david@xxxxxxxxxxxxxx> writes:
> Oh, I remember now: you can use Encode to convert from CP1252 to
> UTF-8. At least I found that, in my tests, it worked properly:
>
>
> use Encode;
> $utf8_text = decode('cp1252', $cp1252_text, 1);
>
> I was originally going to add support for converting from the CP1252
> gremlins to UTF-8, but when I found that Encode already did it
> properly, I eliminated it.

Encode::decode() doesn't allow you to pass in a Unicode string with
the gremlins in it, which is what we had here. What we want is to
convert from one string to another, whereas what Encode provides is
conversion between bytes and strings.

The cp1252_fixup($text) function happens to give the same result as
decode('cp1252', $text) when $text is Latin1, but not when $text is
Unicode.
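
To make the difference concrete, here is a minimal sketch. The helpers
cp1252_fixup_sketch() and codepoints() are hypothetical names written
for illustration only, not the actual cp1252_fixup() implementation.
The sketch assumes the gremlins are the code points U+0080..U+009F.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical stand-in for cp1252_fixup(): string in, string out.
# Only code points U+0080..U+009F are remapped through the CP1252 table;
# everything else, including characters above U+00FF, is left untouched.
sub cp1252_fixup_sketch {
    my ($text) = @_;
    $text =~ s/([\x{80}-\x{9F}])/decode('cp1252', pack('C', ord($1)))/ge;
    return $text;
}

# Render a string as code points, so no wide characters are printed.
sub codepoints {
    my ($s) = @_;
    return join ' ', map { sprintf('U+%04X', ord($_)) } split //, $s;
}

# Case 1: Latin1 text. Both approaches give the same result.
my $latin1 = "\x93hi\x94";
print codepoints(cp1252_fixup_sketch($latin1)), "\n";
print codepoints(decode('cp1252', $latin1)), "\n";

# Case 2: Unicode text (it contains U+263A) that still has gremlins in it.
my $unicode = "\x{263A} \x93hi\x94";
print codepoints(cp1252_fixup_sketch($unicode)), "\n";

# decode() wants bytes, so it is wrapped in eval in case it dies here.
my $decoded = eval { decode('cp1252', $unicode) };
print defined $decoded ? "decode() returned a value\n" : "decode() failed: $@";
```

In the first case both calls should turn \x93 and \x94 into U+201C and
U+201D. In the second case the sketch fixes the gremlins and leaves
U+263A alone, while decode() is expected to refuse the string because
it contains a character that cannot be a byte.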
--Gisle