
[Python-Dev] Inclusion of lz4 bindings in stdlib?

On Wed, Nov 28, 2018 at 10:43 AM Gregory P. Smith <greg at krypto.org> wrote:

> On Wed, Nov 28, 2018 at 9:52 AM Brett Cannon <brett at python.org> wrote:
>> Are we getting to the point that we want a compresslib like hashlib if we
>> are going to be adding more compression algorithms?
> Let's avoid the lib suffix when unnecessary.  I used the name hashlib
> because the name hash was already taken by a builtin that people normally
> shouldn't be using.  zlib gets a lib suffix because a one letter name is
> evil and it matches the project name. ;)  "compress" sounds nicer.
> ... looking on PyPI to see if that name is taken:
> https://pypi.org/project/compress/ exists and is already effectively what
> you are describing.  (never used it or seen it used, no idea about quality)
> I don't think adding lz4 to the stdlib is worthwhile.  It isn't required
> for core functionality as zlib is (lowest common denominator zip support).
> I'd argue that bz2 doesn't even belong in the stdlib, but we shouldn't go
> removing things.  PyPI makes getting more algorithms easy.
> If anything, it'd be nice to standardize on some stdlib namespaces that
> others could plug their modules into.  Create a compress in the stdlib with
> zlib and bz2 in it, and a way for extension modules to add themselves in a
> managed manner instead of requiring a top level name?  Opening up a
> designated namespace to third party modules is not something we've done as
> a project in the past though.  It requires care.  I haven't thought that
> through.
> -gps
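
To make the quoted namespace idea a little more concrete, here is a rough sketch of what a managed registry could look like. Everything in it is hypothetical: no such "compress" module exists in the stdlib, and the names register(), compress(), decompress() and algorithms_available() are made up for illustration, loosely echoing how hashlib exposes its algorithms.

```python
import bz2
import zlib

# Hypothetical registry mapping a codec name to (compress, decompress).
# This is an illustration only, not an existing stdlib API.
_REGISTRY = {}


def register(name, compress_func, decompress_func):
    """Add a codec under `name`, refusing to silently replace an existing one."""
    if name in _REGISTRY:
        raise ValueError(f"codec {name!r} is already registered")
    _REGISTRY[name] = (compress_func, decompress_func)


def compress(name, data):
    """Compress `data` with the codec registered under `name`."""
    return _REGISTRY[name][0](data)


def decompress(name, data):
    """Decompress `data` with the codec registered under `name`."""
    return _REGISTRY[name][1](data)


def algorithms_available():
    """Return the registered codec names in sorted order."""
    return sorted(_REGISTRY)


# The stdlib would pre-register what it already ships.
register("zlib", zlib.compress, zlib.decompress)
register("bz2", bz2.compress, bz2.decompress)
```

A third-party extension would call register() itself at import time. Rejecting duplicate names is one possible answer to the "managed manner" part, though the harder questions (who may register, and when) are exactly what still needs care.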

While my gut reaction was to say "no" to adding lz4 to the stdlib above...

I'm finding myself reconsidering, and I'm not against adding lz4 to the stdlib.

I just want us to have a good reason if we do. This type of extension
module tends to be very easy to maintain (and you are volunteering). A good
reason in the past has been that the algorithm is widely used, which is
obviously the case with zlib (gzip and zipfile), bz2, and lzma (.xz).  Those
are all slower and compress tighter, though.  lz4 is extremely fast,
especially for decompression.  It could make a nice addition, as that is an
area where our standard library offers nothing.
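
For anyone who wants to see that trade-off on their own data, here is a rough sketch using only the three stdlib modules named above. The compare() helper and the sample payload are made up for illustration, and this is a quick look rather than a proper benchmark; sizes and timings will vary with the data and the machine.

```python
import bz2
import lzma
import time
import zlib


def compare(data):
    """Return {name: (compressed_size, compress_seconds, decompress_seconds)}."""
    codecs = {
        "zlib": (zlib.compress, zlib.decompress),
        "bz2": (bz2.compress, bz2.decompress),
        "lzma": (lzma.compress, lzma.decompress),
    }
    results = {}
    for name, (compress, decompress) in codecs.items():
        t0 = time.perf_counter()
        packed = compress(data)
        t1 = time.perf_counter()
        unpacked = decompress(packed)
        t2 = time.perf_counter()
        # Sanity check: every codec must round-trip the input exactly.
        assert unpacked == data
        results[name] = (len(packed), t1 - t0, t2 - t1)
    return results


if __name__ == "__main__":
    # Made-up, highly repetitive sample; real data will behave differently.
    sample = b"Python-Dev mailing list archive line\n" * 20000
    for name, (size, c_time, d_time) in compare(sample).items():
        print(f"{name:5s} {size:8d} bytes  "
              f"compress {c_time:.4f}s  decompress {d_time:.4f}s")
```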

So change my -1 to a +0.5.

Q: Are there other popular alternatives to fill that niche that we should
strongly consider instead or as well?

5 years ago the answer would've been Snappy.  15 years ago the answer
would've been LZO.

I suggest not rabbit-holing this on whether we should adopt a top level
namespace for these such as "compress".  A good question to ask, but we can
resolve that larger topic on its own without blocking anything.

lz4 already holds the top-level lz4 module name on PyPI today, so moving it to
the stdlib under that name is normal and a pretty transparent transition.  If
we do that, the PyPI version of lz4 should remain for use on older CPython
versions, but effectively be frozen, never to gain new features once lz4
has landed in its first actual CPython release.
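
During such a transition, calling code could look roughly like the sketch below. It assumes the PyPI package's lz4.frame.compress() and lz4.frame.decompress() functions, and it further assumes (nothing in this thread settles it) that a stdlib version would keep the same module layout. The fast_compress() and fast_decompress() helpers are hypothetical names, with zlib at level 1 as the fallback when lz4 is not installed.

```python
import zlib

try:
    # Third-party python-lz4 from PyPI today; assumed to keep this layout
    # if it ever moved into the stdlib.
    import lz4.frame as _lz4_frame
except ImportError:
    _lz4_frame = None


def fast_compress(data):
    """Return (codec_name, payload), preferring LZ4 when it is importable."""
    if _lz4_frame is not None:
        return "lz4", _lz4_frame.compress(data)
    # Fallback: zlib at its fastest compression level.
    return "zlib", zlib.compress(data, 1)


def fast_decompress(codec_name, payload):
    """Reverse fast_compress() using the codec name it returned."""
    if codec_name == "lz4":
        if _lz4_frame is None:
            raise RuntimeError("payload needs lz4, which is not installed")
        return _lz4_frame.decompress(payload)
    return zlib.decompress(payload)
```

Returning the codec name alongside the payload keeps the reader from having to guess which format it was handed.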


>> On Wed, 28 Nov 2018 at 08:44, Antoine Pitrou <solipsis at pitrou.net> wrote:
>>> On Wed, 28 Nov 2018 10:28:19 +0000
>>> Jonathan Underwood <jonathan.underwood at gmail.com> wrote:
>>> > Hi,
>>> >
>>> > I have for some time maintained the Python bindings to the LZ4
>>> > compression library[0, 1]:
>>> >
>>> > I am wondering if there is interest in having these bindings move to
>>> > the standard library to sit alongside the gzip, lzma etc bindings?
>>> > Obviously the code would need to be modified to fit the coding
>>> > guidelines etc.
>>> Personally I would find it useful indeed.  LZ4 is very attractive
>>> when (de)compression speed is a primary factor, for example when
>>> sending data over a fast network link or a fast local SSD.
>>> Another compressor worth including is Zstandard (by the same author as
>>> LZ4). Actually, Zstandard and LZ4 cover most of the (speed /
>>> compression ratio) range quite well. Informative graphs below:
>>> https://gregoryszorc.com/blog/2017/03/07/better-compression-with-zstandard/
>>> Regards
>>> Antoine.
>>> _______________________________________________
>>> Python-Dev mailing list
>>> Python-Dev at python.org
>>> https://mail.python.org/mailman/listinfo/python-dev
>>> Unsubscribe:
>>> https://mail.python.org/mailman/options/python-dev/brett%40python.org