
Python 3.2 has some deadly infection


On Fri, 06 Jun 2014 02:21:54 +0300, Marko Rauhamaa wrote:

> Steven D'Aprano <steve+comp.lang.python at pearwood.info>:
> 
>> In any case, I reject your premise. ALL data types are constructed on
>> top of bytes,
> 
> Only in a very dull sense.

I agree with you that this is a very dull, unimportant sense. And I think 
its dullness applies equally to the situation you somehow think is 
meaningfully exciting: Text is made of bytes! If you squint, you can see 
those bytes! Therefore text is not a first class data type!!!

To which my answer is: yes, text is made of bytes; yes, you can expose 
those bytes; and no, your conclusion doesn't follow.

 
>> and so long as you allow applications *any way* to coerce data types to
>> different data types, you allow them to see "inside the black box".
> 
> I can't see the bytes inside Python objects, including strings, and
> that's how it is supposed to be.

That's because Python the language doesn't allow you to reinterpret an 
object of one type as another type, except possibly through ctypes, its 
foreign function interface to C. But Python allows you to write 
extensions in C, and that gives you the full power to take any data 
structure and turn it into any other data structure. Even bytes.
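
As a rough illustration of the ctypes escape hatch, here is a minimal 
sketch that copies the memory behind a string object. It assumes CPython, 
where id() happens to be the object's address; the helper name 
raw_object_bytes is made up for this example, and nothing about the 
layout it reveals is guaranteed by the language.

```python
import ctypes


def raw_object_bytes(obj):
    """Copy the memory occupied by a CPython object (CPython only)."""
    # Assumption: id(obj) is the object's address, which holds in CPython
    # but is an implementation detail. __sizeof__() excludes any garbage
    # collector header, so the read stays inside the object itself.
    return ctypes.string_at(id(obj), obj.__sizeof__())


text = "ABC"
raw = raw_object_bytes(text)
print(len(raw), "bytes of object memory behind a 3-character string")
# Whether the characters appear in there verbatim depends on the
# interpreter version and its internal string representation.
print(b"ABC" in raw)
```

The point is only that the black box can be opened if you go outside the 
language proper, not that anyone should write code this way.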


> Similarly, I can't (easily) see how files are laid out on hard disks.
> That's a true abstraction. Nothing in linux presents data, though,
> except through bytes.

Incorrect. Linux presents data as text all the time. Look at the prompt: 
it's treated as text, not numbers. You type commands using a text 
interface. The commands are made of words like ls, dd and ps, not numbers 
like 0x6C73, 0x6464 and 0x7073. Applications like grep are based on 
line-based files, and "line" is a text concept, not a byte concept.
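
For anyone who wants to check those numbers, a small Python sketch 
reproduces them by encoding each command name as ASCII. The helper name 
as_hex is invented for this example.

```python
def as_hex(word):
    """Return the ASCII bytes of a command name as one hex literal."""
    # encode() exposes the bytes behind the text; hex() renders them.
    return "0x" + word.encode("ascii").hex().upper()


for command in ("ls", "dd", "ps"):
    print(command, as_hex(command))
# prints:
# ls 0x6C73
# dd 0x6464
# ps 0x7073
```

Nobody types 0x6C73 at a shell prompt, which is rather the point.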

Consider:

[steve@ando ~]$ echo -e '\x41\x42\x43'
ABC


The assumption of *text* is so strong in the echo application that by 
default you cannot enter numeric escapes at all. Without the -e switch, 
echo assumes that numeric escapes represent themselves as character 
literals:

[steve@ando ~]$ echo '\x41\x42\x43'
\x41\x42\x43



-- 
Steven D'Aprano
http://import-that.dreamwidth.org/