
Re: draft-heath-ppp-v44-02.txt: msg#00027

ietf.pppext

Subject: Re: draft-heath-ppp-v44-02.txt

I don't think this is getting us anywhere. I have only seen the 200%-plus
improvement on a couple of very highly compressible files, so it is very
atypical; forget I mentioned it. But I am not going to back down on the
claim of 20% to 100% improvement LZJH gets over LZW on typical internet
data, regardless of TCP, IP, PPP, etc. headers. I have tested with and
without such headers in the data. PPP or any other packet protocol isn't
going to change the basic fact that LZJH is a superior algorithm and if
you knew the differences between LZJH and LZW you would readily understand
why.

I am going to update the draft per James Carlson's comments and we can go
from there.

Jeff





Vernon Schryver <vjs@xxxxxxxxxxxxxxxxxxxx>
01/15/2003 04:25 PM


To: jheath@xxxxxxx
cc: ietf-ppp@xxxxxxxxx, james.d.carlson@xxxxxxxxxxxx,
karlfox@xxxxxxxxxxxxxxx,
narten@xxxxxxxxxx, owner-ietf-ppp@xxxxxxxxx
Subject: Re: draft-heath-ppp-v44-02.txt


> From: jheath@xxxxxxx

> ...
> I am talking about apples to apples comparisons over a packet network
> environment where a stream of IP sized 1500 byte packets is compressed
> between two peers where:
> --- typical internet data is used, such as mail, HTML, word, etc. files.
> From actual mail folders and web HTML's, etc.
> --- comparable memory usage for each algorithm. Thus a LZS 2048 history
> is compared to LZW (V.42bis) 2048 dictionary (11 bits) and LZJH (V.44)
> 2048 dictionary (11 bits) such that with tables, etc. the memory usage is
> roughly equal +/- 20%.
>
> In that environment LZS and LZJH both get better compression than LZW (in
> all my testing over the past 3 years) and LZJH gets better than LZS. LZS
> supports only a 2048 history so I guess if you compare it to LZW with a 15
> bit word size (32K dictionary) then it would do better than LZS. I have
> also tested LZW and LZJH against a LZS clone that supports various history
> sizes and the results were similar using comparable history and
> dictionary sizes.

I still wonder about the PPP implementations you tested, because
I'm not sure you are comparing apples (PPP compression) to apples
(PPP compression) instead of grapes (bulk compression such as LZS) to
watermelons (modem v.42bis). I do not recall hearing of a PPP
implementation that used the v.42bis CCP proposal. For that matter, I
don't recall that the v.42bis CCP proposal ever made it to an RFC.

Yes, in theory it is possible to make comparisons of PPP compression
outside complete implementations, but historically most such comparisons
have been at best based on obviously false and grossly misleading
assumptions. A typical bogus assumption is to assume that there are
no TCP/IP headers among the data.


> Using the above I have compared LZJH with LZW for all dictionary sizes up
> through 13 bits and LZJH is clearly a superior algorithm. While a 200%
> improvement with LZJH certainly is not normal and is only with some highly
> compressible files, almost all HTML will provide 60% - 100% improvement.


Please be more specific about the 200% difference between LZJH and
LZW. Frankly, I do not believe that is possible in an honest,
apples-to-apples comparison. I can believe no-history LZJH might do
200% better than 9 or 10-bit no-history LZW (clear on every packet),
but no one would implement such a silly thing.

> In an environment where the dictionary is cleared after every IP packet
> LZW is very inferior to LZJH and LZS.

Yes, but who cares about contrived and irrelevant tests? An
apples-to-apples comparison must be among RFC 1977, RFC 1967, and
something like your proposal.
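
To make the distinction between "dictionary cleared after every packet" and
"history kept across packets" concrete, here is a minimal sketch. It is not
LZW, LZS, or LZJH: it uses Python's zlib (Deflate) purely as a stand-in,
and the function names and sample data are hypothetical. It also models no
TCP/IP or PPP headers, which is exactly the kind of simplification
criticized above, so any numbers it prints say nothing about real PPP
compression.

```python
import zlib

MTU = 1500  # packet size mentioned in the thread


def sample_data() -> bytes:
    # Hypothetical HTML-like payload, invented for illustration only.
    return b"".join(
        b"<tr><td>row %d</td><td>value %d</td></tr>\n" % (i, i * 7)
        for i in range(3000)
    )


def packetize(data: bytes, size: int = MTU) -> list:
    # Split the byte stream into packets of at most `size` bytes.
    return [data[i:i + size] for i in range(0, len(data), size)]


def compressed_size_with_history(packets) -> int:
    # One compressor for the whole stream; a sync flush after each packet
    # makes the packet decodable by the peer while keeping the history.
    comp = zlib.compressobj()
    total = 0
    for p in packets:
        total += len(comp.compress(p))
        total += len(comp.flush(zlib.Z_SYNC_FLUSH))
    return total


def compressed_size_reset_per_packet(packets) -> int:
    # A fresh compressor for every packet: no history is carried over.
    return sum(len(zlib.compress(p)) for p in packets)


if __name__ == "__main__":
    data = sample_data()
    packets = packetize(data)
    kept = compressed_size_with_history(packets)
    reset = compressed_size_reset_per_packet(packets)
    print("ratio with history:     %.2f" % (len(data) / kept))
    print("ratio reset per packet: %.2f" % (len(data) / reset))
```

Swapping in real LZW, LZS, and LZJH coders behind the same two functions,
and feeding them captured packets with their headers, would be closer to
the comparison being asked for here.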


> I agree that there are plenty of claims of compression performance all
> over the place. That is why you have to bound MIPS and memory to get an
> apples to apples comparison. You can trash the ITU if you please but
> they were at least forward looking enough to:
> --- require competing compression algorithms to get at least 20% better
> compression than V.42bis over a set of typical internet files while using
> no more than 20% more MIPS and no more than 20% more memory than V.42bis
> (i.e. the new algorithm had to run on existing hardware). Tests were run
> using dictionary sizes from 512 through 8192.
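
The three bounds in the quoted requirement can be written down as a simple
check. This is only a sketch: the thread does not say how "20% better
compression" was measured, so the code assumes it means a compression ratio
(original bytes / compressed bytes) at least 20% higher than the baseline.
The `Measurement` type and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    compression_ratio: float  # assumed metric: original bytes / compressed bytes
    mips: float               # processing cost
    memory_bytes: int         # dictionary plus tables


def meets_selection_criteria(baseline: Measurement,
                             candidate: Measurement,
                             margin: float = 0.20) -> bool:
    # Candidate must compress at least `margin` better than the baseline
    # while using at most `margin` more MIPS and memory.
    return (
        candidate.compression_ratio >= baseline.compression_ratio * (1 + margin)
        and candidate.mips <= baseline.mips * (1 + margin)
        and candidate.memory_bytes <= baseline.memory_bytes * (1 + margin)
    )
```

A candidate that compresses better but needs, say, 30% more memory fails
the check, which is the point of bounding all three quantities together.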

I hoped to make clear that I don't think the ITU is significantly more
vulnerable to smoke, mirrors, and marketing baloney than any other
standards organization, including the IETF.

It is good to do tests, but they must be as you say, apples-to-apples.
Comparing v.42bis to your CCP-LZJH implies nothing about how your
CCP-LZJH compares to any IETF PPP compression protocol that I can see.


> --- recognize that LZJH was a better compression solution and adopt it as
> V.44 despite the fact that virtually 100% of the world's modems at that
> time had V.42bis. They felt that as new modems were built V.44 would be
> phased in. The fact that V.42bis was the current de facto choice did not
> stop the ITU from adopting something better.

That is a point in favor of the ITU only if you assume your conclusion,
that LZJH is better than v.42bis. I'm inclined to assume that LZJH
is better than v.42bis, but that's not relevant to anything I see.

> --- recognize that the Calgary and Canterbury corpora were not
> representative of data being downloaded in today's internet and create a
> set of test files that are representative.

If you look at the archives for this mailing list, you'll find that
those were not the only sets of test data considered or even the most
influential.

> --- run comparison tests against HTML files from web sites worldwide with
> many different languages.

That strikes me as a distinctly minor and probably bogus consideration
for the test sets. Picking the test data is not easy, but it is quite
wrong to involve political correctness.


> I would assume that the IETF is forward looking enough to want to allow PPP
> implementors the choice if something better comes along regardless of the
> number of current CCP options and regardless of a current de facto choice.
> Most will run their own comparisons and decide for themselves which is
> best based on MIPS, memory, or other constraints which they may have.

That is mistaken. In the network world outside the ISO or ITU dream
world, considerations of MIPs, memory, and so forth are minor compared
to other issues starting with interoperation. (Yes, *this* is a slap
at the standard old ISO OSI/ITU delusions of grandeur that what a
standards committee considers important and desirable must matter
outside hotel meeting rooms.)


> I would be happy to provide you with the source of LZJH if you want to run
> your own comparisons.

Thanks, but I trust that if you did an apples-to-apples comparison, you
would report the results accurately.


> Yes, it is going to be an Informational RFC.

That short-circuits most of my concerns.


Vernon Schryver vjs@xxxxxxxxxxxx





