Python 3.2 has some deadly infection

On Thursday, June 5, 2014 9:42:28 PM UTC+5:30, Chris Angelico wrote:
> On Fri, Jun 6, 2014 at 1:33 AM, Steven D'Aprano wrote:
> > In the Unix world, text formats and text
> > processing is much more common in user-space apps than binary processing.
> > Perhaps the definitive explanation and celebration of the Unix way is
> > Eric Raymond's "The Art Of Unix Programming":
> > http://www.catb.org/esr/writings/taoup/html/ch05s01.html

> Specifically, this from the opening paragraph:
> """
> Text streams are a valuable universal format because they're easy for
> human beings to read, write, and edit without specialized tools. These
> formats are (or can be designed to be) transparent.
> """

A fact that stops being true when you tie up text with encodings.
For two reasons:

1. The encode/decode function pair mapping between byte strings and text
   cannot be a bijection, because the set of byte strings is larger than
   the set of texts: some byte strings are not a valid encoding of any
   text.  This is the error that Armin was hit by.

2. Since there is not one but a zillion possible encodings, we are not
   talking of one (possibly universal) data structure but of a zillion
   of them: "Text streams are a universal format" - in which encoded
   form of text??
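
A small Python 3 sketch of both points. UTF-8 and Latin-1 are my choice of
encodings for illustration (nothing above names them), and the variable
names are made up:

```python
# Point 1: decode is not defined for every byte string.
# 0xFF can never start a UTF-8 sequence, so these bytes are not text in UTF-8.
raw = b"\xff\xfe"
try:
    raw.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

# Going the other way is fine: text -> bytes -> text round-trips in UTF-8.
text = "na\u00efve"
round_trip = text.encode("utf-8").decode("utf-8")

# Point 2: the same bytes are different text under different encodings.
data = "\u00e9".encode("utf-8")     # two bytes: 0xC3 0xA9
as_utf8 = data.decode("utf-8")      # one character, U+00E9
as_latin1 = data.decode("latin-1")  # two characters, U+00C3 U+00A9

print(decoded_ok, round_trip == text, as_utf8 != as_latin1)
```

Latin-1 happens to accept every byte string, so whether point 1 bites at all
depends on which of the zillion encodings you picked, which is point 2 again.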