Re: zstd compression for packages


Am Dienstag, den 13.03.2018, 12:07 +1100 schrieb Daniel Axtens:
> 
> 
> On Tue, Mar 13, 2018 at 1:43 AM, Balint Reczey
> <balint.reczey@canonical.com> wrote:
> > Hi Daniel,
> > 
> > On Mon, Mar 12, 2018 at 2:11 PM, Daniel Axtens
> > <daniel.axtens@xxxxxxxxxxxxx> wrote:
> > > Hi,
> > >
> > > I looked into compression algorithms a bit in a previous role,
> > and to be
> > > honest I'm quite surprised to see zstd proposed for package
> > storage. zstd,
> > > according to its own github repo, is "targeting real-time
> > compression
> > > scenarios". It's not really designed to be run at its maximum
> > compression
> > > level, it's designed to really quickly compress data coming off
> > the wire -
> > > things like compressing log files being streamed to a central
> > server, or I
> > > guess writing random data to btrfs where speed is absolutely an
> > issue.
> > >
> > > Is speed of decompression a big user concern relative to file
> > size? I admit
> > > that I am biased - as an Australian and with the crummy internet
> > that my
> > > location entails, I'd save much more time if the file was 6%
> > smaller and
> > > took 10% longer to decompress than the other way around.
> > 
> > Yes, decompression speed is a big issue in some cases. Please
> > consider the case of provisioning cloud/container instances, where
> > after booting the image plenty of packages need to be installed
> > and saving seconds matters a lot.
> > 
> > Zstd format also allows parallel decompression which can make
> > package
> > installation even quicker in wall-clock time.
> > 
> > Internet connection speed increases by ~50% per year on average
> > (according to this [3] study, which matches my experience), which
> > works out to more than 6% every two months.
> > 
> > 
> The future is pretty unevenly distributed, and lots of the planet is
> stuck on really bad internet still.
> 
> AFAICT, [3] is anecdotal, rather than a 'study' - it's based on data
> from 1 person living in California. This is not really
> representative. If we look at the connection speed visualisation from
> the Akamai State of the Internet report [4], it shows that lots and
> lots of countries - most of the world! - have significantly slower
> internet than that person. 
> 
> (FWIW, anecdotally, I've never had a residential connection get
> faster (except when I moved), which is mostly because the speed of
> ADSL is pretty much fixed. Anecdotal reports from users in developing
> countries and rural areas of developed countries are not encouraging
> either: [5].)
> 
> Having said that, I'm not unsympathetic to the usecase you outline. I
> just am saddened to see the trade-offs fall against the interests of
> people with worse access to the internet. If I can find you ways of
> saving at least as much time without making the files bigger, would
> you be open to that?
> 
> Regards,
> Daniel
> 
> [4] https://www.akamai.com/uk/en/about/our-thinking/state-of-the-internet-report/state-of-the-internet-connectivity-visualization.jsp
> [5] https://danluu.com/web-bloat/

I want to mention that you can enable the "ultra" compression levels 20
to 22 in zstd, which usually achieve results comparable to the highest
compression levels of xz. There should be a level that matches the
compression ratio of xz -6 while still decompressing faster.
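For anyone who wants to try this locally, here is a rough sketch
(assuming zstd and xz-utils are installed; the zero-filled sample file
is only a stand-in for a real package payload, so absolute numbers will
differ from real .deb data):

```shell
# Create an illustrative sample payload (substitute a real data.tar in practice).
f=$(mktemp)
head -c 1000000 /dev/zero > "$f"

# Baseline: xz at level 6, the reference point discussed above.
xz -6 -k "$f"                      # writes "$f.xz", keeps the original

# zstd at level 22; --ultra is required to unlock levels 20-22.
zstd --ultra -22 -k -q "$f"        # writes "$f.zst", keeps the original

# Compare compressed sizes, then decompression wall time.
ls -l "$f.xz" "$f.zst"
time xz -dc "$f.xz"   > /dev/null
time zstd -dc "$f.zst" > /dev/null

rm -f "$f" "$f.xz" "$f.zst"
```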

Best regards,
Benjamin