Re: FPS/Data.ByteString candidate: msg#00152 (lang.haskell.libraries)
On Tue, 2006-04-25 at 22:34 +1000, Donald Bruce Stewart wrote:
> ross:
> > On Tue, Apr 25, 2006 at 12:08:45PM +0300, Einar Karttunen wrote:
> > > The name Latin1 is particularly bad since there are many other
> > > single byte encodings around.
> >
> > The name is quite appropriate, since that is the particular encoding of
> > Char that is exposed by the interface. What's bad is that there's no
> > choice. Calling it Latin1 is just being honest about that, and leaving
> > room for modules with other encodings or an interface parameterized
> > by encoding.
>
> Ok. Duncan, Ketil, Ross and Simon make good points here.
> I'll move Data.ByteString.Char -> Data.ByteString.Latin1

If you want to justify that and provide some concrete spec you can add
something like the following to the Data.ByteString.Latin1 docs:

Manipulate ByteStrings using Char operations. All Chars will be
truncated to 8 bits.

More specifically these byte strings are taken to be in the subset of
Unicode covered by code points 0-255. This covers Unicode Basic Latin,
Latin-1 Supplement and C0+C1 Controls.

See:
http://www.unicode.org/charts/
http://www.unicode.org/charts/PDF/U0000.pdf
http://www.unicode.org/charts/PDF/U0080.pdf

One reason to be so specific is that other definitions of character sets
commonly called "Latin-1" omit the control characters and so do not
cover all bytes 0-255.

I think this allows us to justify reinterpreting Word8s as Chars and
getting valid Unicode code points.

Duncan
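
To make the proposed wording concrete, here is a minimal sketch of the two conversions it describes. The names byteToChar and charToByte are hypothetical, chosen for illustration only, and are not claimed to be what the library exports. The sketch assumes nothing beyond Data.Char and Data.Word from base.

```haskell
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Hypothetical helper names, for illustration only.

-- | Reinterpret a byte as the Unicode code point with the same number.
-- Every Word8 value (0-255) lands in U+0000..U+00FF, so the result is
-- always a valid Char.
byteToChar :: Word8 -> Char
byteToChar = chr . fromIntegral

-- | Truncate a Char to its low 8 bits. This is lossless only for Chars
-- in U+0000..U+00FF; anything above that wraps modulo 256.
charToByte :: Char -> Word8
charToByte = fromIntegral . ord
```

Under these definitions, charToByte . byteToChar is the identity on all 256 byte values, while a Char above U+00FF such as '\955' (Greek small lambda) loses its high bits when truncated.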