
Re: [Python-Dev] standard library mimetypes module pathologically broken?: msg#00667




On Fri, Jul 31, 2009 at 15:38, Jacob Rus <jacobolus@xxxxxxxxx> wrote:
Brett Cannon wrote:
> Jacob Rus wrote:
>> * It defines __all__: I didn't even realize __all__ could be used
>>   for single-file modules (w/o submodules), but it definitely
>>   shouldn't be here.
>
> __all__ is used to control what a module exports when used in an import *,
> nothing more. Thus its use in a module compared to a package is completely
> legitimate.
>
>> This specific __all__ oddly does not include
>>   all of the documented variables and functions in the mimetypes
>>   class. It's not clear why someone calling import * here wouldn't
>>   want the bits not included.
>
> If something is documented but not listed in __all__, that is a bug.

In this case, everything in the module is documented, including parts
that should be private, but only a small number are in __all__. My
recommendation would be to make those private parts be _ variables and
remove them from the docs (using them has no legitimate use cases I
can see), and rip out __all__.
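
For reference, here is a minimal sketch of the two export behaviours being discussed: with __all__, `import *` binds only the listed names; without it, `import *` binds every name that does not start with an underscore. The module names demo_with_all and demo_without_all and the helpers make_module and star_import are made up for this illustration, and the stub functions only borrow names from mimetypes.

```python
import sys
import types


def make_module(name, source):
    """Build an in-memory module from source text (illustration only)."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module
    return module


def star_import(name):
    """Return the names that `from <name> import *` binds, sorted."""
    namespace = {}
    exec(f"from {name} import *", namespace)
    return sorted(k for k in namespace if k != "__builtins__")


# Same three functions in both modules; only the first defines __all__.
make_module("demo_with_all", """
__all__ = ['guess_type']
def guess_type(url): return None
def read_mime_types(path): return None
def _helper(): return None
""")

make_module("demo_without_all", """
def guess_type(url): return None
def read_mime_types(path): return None
def _helper(): return None
""")

with_all = star_import("demo_with_all")
without_all = star_import("demo_without_all")
print(with_all)     # only the name listed in __all__
print(without_all)  # every name without a leading underscore
```

So dropping __all__ and underscore-prefixing the private parts would still keep them out of `import *`, which is the effect the proposal relies on.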

Well, if the module had stuff that did not lead with an underscore then you can't remove it. You can deprecate it under the old name and rename it with an underscore, but backwards-compatibility says someone out there is using those functions so you can't just batch rename them w/o the proper warning.
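
A rough sketch of what that deprecate-then-rename step could look like, using the standard warnings module. The names read_table and _read_table are hypothetical and are not taken from mimetypes; the point is only that the old public name keeps working while it warns.

```python
import warnings


def _read_table(path):
    """New private implementation (hypothetical name)."""
    return {"path": path}


def read_table(path):
    """Old public name, kept as a thin wrapper that warns callers."""
    warnings.warn(
        "read_table() is deprecated and will become private",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    return _read_table(path)


# Existing callers keep working, but now see a DeprecationWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = read_table("mime.types")

print(result)
print(caught[0].category.__name__)
```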

>> * It creates a _default_mime_types() function which declares a
>>   bunch of global variables, and then immediately calls
>>   _default_mime_types() below the definition. There is literally
>>   no difference in result between this and just putting those
>>   variables at the top level of the file, so I have no idea why
>>   this function exists, except to make the code more confusing.
>
> It could potentially be used for testing, but that's a guess.

Here's an abridged version of this function. I don't see any
reason for it.

  def _default_mime_types():
    global suffix_map
    global encodings_map
    global types_map
    global common_types

    suffix_map = {
      '.tgz': '.tar.gz', #...
      }

    encodings_map = {
      '.gz': 'gzip', #...
      }

    types_map = {
      '.a'    : 'application/octet-stream', #...
      }

    common_types = {
      '.jpg' : 'image/jpg', #...
      }

  _default_mime_types()

As R. David pointed out, it is being used by regrtest to clean up after running the test suite.
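
To make that concrete, here is a small sketch of why wrapping the assignments in a function matters for cleanup: calling it again rebuilds the tables from scratch, which plain top-level assignments cannot do without reloading the module. The function name _default_tables and the two-entry tables are stand-ins for illustration, not the real module contents, and this does not show how regrtest itself is written.

```python
def _default_tables():
    """(Re)build the module-level tables from their defaults."""
    global suffix_map, types_map
    suffix_map = {".tgz": ".tar.gz"}
    types_map = {".a": "application/octet-stream"}


_default_tables()  # initial setup, as at import time

# A test mutates the shared tables...
types_map[".xyz"] = "application/x-test"
del suffix_map[".tgz"]

# ...and the harness restores the defaults afterwards.
_default_tables()

print(types_map)
print(suffix_map)
```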

> Probably came from someone who is very OO happy. Not everyone comes to
> Python ready to embrace its procedural or slightly functional facets.

Yes, it seems so to me too.

> So the problem of changing fundamentally how the code works, even for a
> cleanup, is that it will break someone's code out there because they
> depended on the module's crazy way of doing things. Now if they are cheating
> and looking at things that are meant to be hidden you might be able to clean
> things up, but if the semantics are exposed to the user, then there is not
> much we can do w/o breaking someone's code.

The problem is that the semantics as documented are really ambiguous,
and what I would consider the reasonable interpretation is different
from what the code actually does. So anyone using this code naively is
going to run into trouble, and anyone relying on how the code actually
works is going behind the back of the docs, but they sort of have to
in order to use much of the functionality of the module. I agree this
puts us in a tricky spot.

Well, perhaps the docs can be updated to match the code where cleanup would change the semantics.

> Honestly, if the code is as bad as it seems -- including its API --, the
> best bet would be to come up with a new module for handling MIME types from
> scratch, put it up on the Cheeseshop/PyPI, and get the community behind it.
> If the community picks it up as the de-facto replacement for mimetypes and
> the code has settled we can then talk about adding it to the standard
> library and begin deprecating mimetypes.
> And thanks for being willing to volunteer to fix this.

Okay. Well, I'd still like to hear a bit about what people really need
before trying to make a new API. I'm not such an experienced API
designer, and I haven't really plumbed the depths of mimetypes use
cases (though it seems to me like quite a simple module of not more
than 100 lines of code or so would suffice).

I'm sure you can get help from the community with any of this.
At the very least, I
think some changes can be made to this code without altering its basic
function, which would clean up the actual mime types it returns,
comment the exceptions to Apache and explain why they're there, and
make the code flow understandable to someone reading the code.

That all sounds reasonable.

-Brett