Re: Data.ByteString candidate 3: msg#00157 (lang.haskell.libraries)
On 25.04 13:46, John Meacham wrote:
> I think all we really need are
>
> Data.ByteString
> Data.PackedString
>
> (Though, I suppose Latin1 could be useful)

Using the Word8 API is not very pleasant, because character constants etc. are not of type Word8.

As for Latin1 - what semantics do we use for toUpper/toLower and Ord? Using the Unicode ones or the locale seems the sensible thing if the data really is Latin1.

Thus a simple wrapper around the Word8 API is desirable. Make it follow a few simple rules:
* c2w . w2c = id (conversion is a bijection)
* ASCII characters translated correctly
* toLower/toUpper for ASCII
* Ord by byte values.

This is very useful for many purposes and does not mean that there should not be a fancy UTF8 module. Rather than arguing about killing this, wouldn't it be more productive to create the UTF8 module?

> but note, do the people that want latin1 just need ASCII? because it should be
> noted that if we have a UTF8 PackedString, then we can make
> ASCII-specific access routines that are just as fast as the ones in the
> Latin1 variety without giving up the ability to store full unicode
> values in the string.

Case conversions and ordering need to be different. Thus we need to newtype things to avoid having two conflicting Ord instances.

The UTF8 layer should provide:
* Unicode toUpper/toLower
* Unicode collation (UCA) for Ord
* Graphemes (see Perl6 for good ways to do this)
* Normalisation

- Einar Karttunen
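
The wrapper rules listed above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the proposed library code: `c2w` and `w2c` are the names used in the message, while `toLowerAscii`, `toUpperAscii` and the `Char8` newtype are hypothetical names, and a plain list of `Word8` stands in for the packed representation so that the sketch needs nothing beyond `base`.

```haskell
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Word8 -> Char: every byte maps to the code point with the same number.
w2c :: Word8 -> Char
w2c = chr . fromIntegral

-- Char -> Word8: code points above 255 are truncated (mod 256), so this is
-- only an inverse of w2c on the byte range, i.e. c2w . w2c = id.
c2w :: Char -> Word8
c2w = fromIntegral . ord

-- ASCII-only case conversion: bytes outside 'A'..'Z' are left untouched
-- (hypothetical name).
toLowerAscii :: Word8 -> Word8
toLowerAscii w
  | w >= 65 && w <= 90 = w + 32
  | otherwise          = w

-- ASCII-only case conversion: bytes outside 'a'..'z' are left untouched
-- (hypothetical name).
toUpperAscii :: Word8 -> Word8
toUpperAscii w
  | w >= 97 && w <= 122 = w - 32
  | otherwise           = w

-- Hypothetical newtype for the byte-oriented view. The derived Ord compares
-- lexicographically by byte value; a UTF8 type would be a separate newtype
-- with its own Ord instance, so the two orderings never conflict.
newtype Char8 = Char8 [Word8]
  deriving (Eq, Ord, Show)

-- Build a Char8 value from a String of byte-range characters.
pack8 :: String -> Char8
pack8 = Char8 . map c2w
```

With these definitions, `pack8 "Z" < pack8 "a"` holds because the comparison is by byte value (90 versus 97), whereas a collation-based Ord on a UTF8 newtype would be free to order them differently.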