Why exception from os.path.exists()?

On Tue, Jun 5, 2018 at 11:11 PM, Steven D'Aprano
<steve+comp.lang.python at pearwood.info> wrote:
> On Tue, 05 Jun 2018 20:15:01 +1000, Chris Angelico wrote:
>> On Tue, Jun 5, 2018 at 5:37 PM, Steven D'Aprano
>> <steve+comp.lang.python at pearwood.info> wrote:
>>> On Mon, 04 Jun 2018 22:13:47 +0200, Peter J. Holzer wrote:
>>>> On 2018-06-04 13:23:59 +0000, Steven D'Aprano wrote:
>>> [...]
>>>>> I don't know whether or not the Linux OS is capable of accessing
>>>>> files with embedded NULs in the file name. But Mac OS is capable of
>>>>> doing so, so it should be possible. Wikipedia says:
>>>>> "HFS Plus mandates support for an escape sequence to allow arbitrary
>>>>> Unicode. Users of older software might see the escape sequences
>>>>> instead of the desired characters."
>>>> I don't know about MacOS. In Linux there is no way to pass a filename
>>>> with an embedded '\0' (or a '/' which is not a path separator) between
>>>> the kernel and user space. So if a filesystem contained such a
>>>> filename, the kernel would have to map it (via an escape sequence or
>>>> some other mechanism) to a different file name. Which of course means
>>>> that - from the perspective of any user space process - the filename
>>>> doesn't contain a '\0' or '/'.
>>> That's an invalid analogy. According to that analogy, Python strings
>>> don't contain ASCII NULs, because you have to use an escape mechanism
>>> to insert them:
>>>     string = "Is this \0 not a NULL?"
>>> But we know that Python strings are not NUL-terminated and can contain
>>> NUL. It's just another character.
>> No; by that analogy, a Python string cannot contain a non-Unicode
>> character. Here's a challenge: create a Python string that contains a
>> character that isn't part of the Universal Character Set.
> Huh? In what way is that the analogy being made? Your challenge is
> impossible from pure Python, equivalent to "create a Python bytes object
> that contains a byte greater than 255". The challenge is rigged to be
> doomed to fail.

And an ASCIIZ string cannot contain a byte value of zero. The parallel is exact.
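
To make the parallel concrete, here is a rough sketch rather than anything definitive. It assumes CPython 3 with ctypes available, and the path "foo\0bar" is a made-up name used only for illustration. Each container has a value it cannot represent by construction: a str has no slot for a code point past U+10FFFF, a bytes object has none for 256, and a NUL-terminated C string has none for a zero byte.

```python
import ctypes
import os

# A Python str is length-counted, so NUL is just another character in it.
s = "Is this \0 not a NULL?"
nul_index = s.index("\0")

# A C string (ASCIIZ) is NUL-terminated, so reading it back through a
# char pointer stops at the first zero byte and the rest is invisible.
c_view = ctypes.c_char_p(s.encode("ascii")).value


def rejects(make):
    """Return True if calling make() raises ValueError."""
    try:
        make()
    except ValueError:
        return True
    return False


# The "rigged" challenges: values the container cannot hold at all.
byte_over_255 = rejects(lambda: bytes([256]))
char_outside_ucs = rejects(lambda: chr(0x110000))

# Hypothetical path with an embedded NUL. os.stat() refuses it before
# the name ever reaches the OS, since it could not survive the trip.
path_with_nul = rejects(lambda: os.stat("foo\0bar"))

print(nul_index, c_view, byte_over_255, char_outside_ucs, path_with_nul)
```

The ValueError from os.stat() is the same kind of refusal as the ones from bytes() and chr(): the value is rejected at the boundary because the target representation has no way to carry it.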