
"glob.glob('weirdness')" Any thoughts?


On Sun, Sep 9, 2018 at 2:20 PM, Gilmeh Serda
<gilmeh.serdah at nothing.here.invalid> wrote:
>
> # Python 3.6.1/Linux
> (acts the same in Python 2.7.3 also, by the way)
>
>>>> from glob import glob
>
>>>> glob('./Testfile *')
> ['./Testfile [comment] some text.txt']
>
>>>> glob('./Testfile [comment]*')
> []
>
>>>> glob('./Testfile [comment? some text.*')
> ['./Testfile [comment] some text.txt']
>

The behaviour is stated rather clearly in the documentation:

For glob:
"No tilde expansion is done, but *, ?, and character ranges expressed
with [] will be correctly matched. This is done by using the
os.scandir() and fnmatch.fnmatch() functions in concert, and not by
actually invoking a subshell." [1]

And then for fnmatch, since that is used by glob:
"For a literal match, wrap the meta-characters in brackets. For
example, '[?]' matches the character '?'." [2]

Therefore glob('./Testfile [[]comment[]]*') is what you are looking
for. To match any meta-character literally, wrap it in square
brackets.
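As a rough sketch of the above (the temporary directory and variable names are made up for illustration, and glob.escape() assumes Python 3.4 or later), the following creates a file with the name from the question and compares the unescaped pattern, the manually wrapped pattern, and a pattern built with glob.escape():

```python
import glob
import os
import tempfile

FILENAME = "Testfile [comment] some text.txt"

with tempfile.TemporaryDirectory() as tmp:
    # Create an empty file named like the one in the question.
    open(os.path.join(tmp, FILENAME), "w").close()

    # Escape the directory as well, in case the temporary path
    # itself happens to contain meta-characters.
    base = glob.escape(tmp)

    # Unescaped: "[comment]" is a character class that matches exactly
    # one character out of c, o, m, e, n, t, so "[" never matches.
    unescaped = glob.glob(os.path.join(base, "Testfile [comment]*"))

    # Brackets wrapped by hand, as suggested above.
    manual = glob.glob(os.path.join(base, "Testfile [[]comment[]]*"))

    # glob.escape() wraps '*', '?' and '[' for us; a ']' that does not
    # close a character class is already matched literally.
    escaped_prefix = glob.escape("Testfile [comment]")
    automatic = glob.glob(os.path.join(base, escaped_prefix + "*"))

print(unescaped)
print([os.path.basename(p) for p in manual])
print([os.path.basename(p) for p in automatic])
print(escaped_prefix)
```

Only the literal part of the name goes through glob.escape(); the trailing '*' is appended afterwards so that it keeps its wildcard meaning.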

The behaviour you observed matches the pattern rules given in the
glob documentation [1], so no guessing is required. Your conclusion
about escaping special characters is wrong though.

While backslashes are often used as escape characters, they are not
used that way everywhere, and glob does not treat them as escape
characters. That makes a lot of sense: the directory separator on
Windows is a backslash, so also using it as an escape character would
cause quite some confusion.
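To illustrate the point about backslashes, here is a small sketch using fnmatch.fnmatchcase(), chosen because it skips the platform-specific case and separator normalisation that fnmatch.fnmatch() applies. The sample names "a*b", "a\xb" and "axb" are invented for the example:

```python
from fnmatch import fnmatchcase

# A backslash in the pattern is an ordinary character, not an escape:
# r"a\*b" means "a", a literal backslash, anything, then "b".
backslash_as_escape = fnmatchcase("a*b", r"a\*b")
backslash_as_literal = fnmatchcase(r"a\xb", r"a\*b")

# Wrapping the meta-character in brackets is what makes it literal.
bracket_matches_star = fnmatchcase("a*b", "a[*]b")
bracket_matches_other = fnmatchcase("axb", "a[*]b")

print(backslash_as_escape)    # False
print(backslash_as_literal)   # True
print(bracket_matches_star)   # True
print(bracket_matches_other)  # False
```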

[1] https://docs.python.org/3/library/glob.html
[2] https://docs.python.org/3/library/fnmatch.html