Re: Hugs vs GHC (again) was: Re: Some random newbiequestions: msg#00086

lang.haskell.cafe

"Simon Marlow" <simonmar@xxxxxxxxxxxxx> writes:

> Here's a summary of the state of Unicode support in GHC and other
> compilers. There are several aspects:
>
> - Can the Char type hold the full range of Unicode characters?
> This has been true in GHC for some time, and is now true in Hugs.
> I don't think it's true in nhc98 (please correct me if I'm wrong).

You're wrong :-). nhc98 has always had 32-bit characters internally.
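
For anyone wanting to check this point on their own implementation, here is a minimal sketch using only Prelude functions. The names charRangeIsFull and beyondBMP are made up for illustration; on an implementation with a narrower Char, the first would be False and the second would fail when evaluated.

```haskell
-- Hypothetical helper names; only Prelude functions are used.

-- True when Char covers every Unicode code point (U+0000..U+10FFFF).
charRangeIsFull :: Bool
charRangeIsFull = fromEnum (maxBound :: Char) == 0x10FFFF

-- A character outside the 16-bit range: U+1D11E MUSICAL SYMBOL G CLEF.
-- toEnum raises an error if the code point does not fit in Char.
beyondBMP :: Char
beyondBMP = toEnum 0x1D11E
```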

> - Do the character class functions (isUpper, isAlpha etc.) work
> correctly on the full range of Unicode characters? This is true in
> Hugs. It's true with GHC on some systems (basically we were lazy
> and used the underlying C library's support here, which is patchy).

In nhc98, currently the character class functions work only on the
8-bit Latin-1 range.
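
A rough way to see where a given implementation stands is to classify a few characters from inside and outside Latin-1. This is only a sketch: the sample list and the names samples and misclassified are my own, and the expected values are what the Unicode general categories imply. An implementation limited to Latin-1 would presumably report the Greek and Cyrillic entries as misclassified.

```haskell
import Data.Char (isAlpha, isUpper)

-- Hypothetical sample: (character, expected isUpper, expected isAlpha).
samples :: [(Char, Bool, Bool)]
samples =
  [ ('A',     True,  True)  -- ASCII
  , ('\xC9',  True,  True)  -- U+00C9 LATIN CAPITAL LETTER E WITH ACUTE (Latin-1)
  , ('\x3A9', True,  True)  -- U+03A9 GREEK CAPITAL LETTER OMEGA
  , ('\x43F', False, True)  -- U+043F CYRILLIC SMALL LETTER PE
  ]

-- Characters whose classification disagrees with the expected values.
-- An empty list means the class functions handled every sample.
misclassified :: [Char]
misclassified =
  [ c | (c, up, al) <- samples, isUpper c /= up || isAlpha c /= al ]
```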

> - Can you use (some encoding of) Unicode for your Haskell source files?
> I don't think this is true in any Haskell compiler right now.

Many years ago, hbc claimed to be the only compiler with support for this.

> - Can you do String I/O in some encoding of Unicode? No Haskell
> compiler has support for this yet, and there are design decisions
> to be made. Some progress has been made on an experimental prototype
> (see recent discussion on this list).

Apparently some Haskell/XML toolkits already do I/O conversions in a
selection of the encodings permitted by the XML standard, namely ASCII,
Latin-1, UTF-8, and UTF-16 (either byte ordering), but not yet UCS-4
(four possible byte orderings), or EBCDIC. See for example:

http://www.ninebynine.org/Software/HaskellUtils/HaXml-1.12/src/Text/XML/HaXml/Unicode.hs
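
To give a flavour of what such a conversion involves, here is a minimal sketch of the UTF-8 direction only, written from the encoding's definition rather than taken from HaXml. The names encodeCharUtf8 and encodeUtf8 are hypothetical, and the sketch assumes a Char that holds full code points; it does not reject surrogate code points.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Encode one Char as 1 to 4 UTF-8 bytes.
-- Hypothetical helper, not HaXml's API; surrogates are not rejected.
encodeCharUtf8 :: Char -> [Word8]
encodeCharUtf8 c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [lead 0xC0 6,  cont 0]
  | n < 0x10000 = [lead 0xE0 12, cont 6,  cont 0]
  | otherwise   = [lead 0xF0 18, cont 12, cont 6, cont 0]
  where
    n = ord c
    -- Leading byte: length tag plus the topmost payload bits.
    lead tag s = fromIntegral (tag .|. (n `shiftR` s))
    -- Continuation byte: 10xxxxxx carrying six payload bits.
    cont s = fromIntegral (0x80 .|. ((n `shiftR` s) .&. 0x3F))

-- Encode a whole String by concatenating the per-character encodings.
encodeUtf8 :: String -> [Word8]
encodeUtf8 = concatMap encodeCharUtf8
```

Decoding, and the UTF-16 and UCS-4 byte-order variants, would each need similar but separate treatment.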

> - What about Unicode FilePaths? This was discussed a few months ago
> on the haskell(-cafe) list, no support yet in any compiler.

Indeed, AFAIK.

Regards,
Malcolm

