Re: draft-heath-ppp-v44-02.txt (msg#00027, ietf.pppext)
I don't think this is getting us anywhere. I have only seen the 200% plus improvement on a couple of very highly compressible files, so it is very atypical; forget I mentioned it. But I am not going to back down on the claim of 20% to 100% improvement LZJH gets over LZW on typical internet data, regardless of TCP, IP, PPP, etc. headers. I have tested with and without such headers in the data. PPP or any other packet protocol isn't going to change the basic fact that LZJH is a superior algorithm, and if you knew the differences between LZJH and LZW you would readily understand why.

I am going to update the draft per James Carlson's comments and we can go from there.

Jeff


Vernon Schryver <vjs@xxxxxxxxxxxxxxxxxxxx>
01/15/2003 04:25 PM

To: jheath@xxxxxxx
cc: ietf-ppp@xxxxxxxxx, james.d.carlson@xxxxxxxxxxxx, karlfox@xxxxxxxxxxxxxxx, narten@xxxxxxxxxx, owner-ietf-ppp@xxxxxxxxx
Subject: Re: draft-heath-ppp-v44-02.txt

> From: jheath@xxxxxxx
> ...
> I am talking about apples to apples comparisons over a packet network
> environment where a stream of IP sized 1500 byte packets is compressed
> between two peers where:
> --- typical internet data is used, such as mail, HTML, word, etc. files.
> From actual mail folders and web HTML's, etc.
> --- comparable memory usage for each algorithm. Thus a LZS 2048 history
> is compared to LZW (V.42bis) 2048 dictionary (11 bits) and LZJH (V.44)
> 2048 dictionary (11 bits) such that with tables, etc. the memory usage is
> roughly equal +/- 20%.
>
> In that environment LZS and LZJH both get better compression than LZW (in
> all my testing over the past 3 years) and LZJH gets better than LZS. LZS
> supports only a 2048 history so I guess if you compare it to LZW with a 15
> bit word size (32K dictionary) then it would do better than LZS. I have
> also tested LZW and LZJH against a LZS clone that supports various history
> sizes and the results were similar using comparable history and
> dictionary sizes.

I still wonder about the PPP implementations you tested, because I'm not sure you are comparing apples (PPP compression) to apples (PPP compression) instead of grapes (bulk compression such as LZS) to watermelons (modem v.42bis). I do not recall hearing of a PPP implementation that used the v.42bis CCP proposal. For that matter, I don't recall that the v.42bis CCP proposal ever made it to an RFC.

Yes, in theory it is possible to make comparisons of PPP compression outside complete implementations, but historically most such comparisons have been at best based on obviously false and grossly misleading assumptions. A typical bogus assumption is to assume that there are no TCP/IP headers among the data.

> Using the above I have compared LZJH with LZW for all dictionary sizes up
> through 13 bits and LZJH is clearly a superior algorithm. While a 200%
> improvement with LZJH certainly is not normal and is only with some highly
> compressible files, almost all HTML will provide 60% - 100% improvement.

Please be more specific about the 200% difference between LZJH and LZW. Frankly, I do not believe that is possible in an honest, apples-to-apples comparison. I can believe no-history LZJH might do 200% better than 9 or 10-bit no-history LZW (clear on every packet), but no one would implement such a silly thing.

> In an environment where the dictionary is cleared after every IP packet
> LZW is very inferior to LZJH and LZS.

Yes, but who cares about contrived and irrelevant tests? An apples-to-apples comparison must be among RFC 1977, RFC 1967, and something like your proposal.

> I agree that there are plenty of claims of compression performance all
> over the place. That is why you have to bound MIP's and memory to get an
> apples to apples comparison.
> You can trash the ITU if you please but
> they were at least forward looking enough to:
> --- require competing compression algorithms to get at least 20% better
> compression than V.42bis over a set of typical internet files while using
> no more than 20% more MIP's and no more than 20% more memory than V.42bis
> (i.e. the new algorithm had to run on existing hardware). Tests were run
> using dictionary sizes from 512 through 8192.

I hoped to make clear that I don't think the ITU is significantly more vulnerable to smoke, mirrors, and marketing baloney than any other standards organization including the IETF. It is good to do tests, but they must be, as you say, apples-to-apples. Comparing v.42bis to your CCP-LZJH implies nothing about how your CCP-LZJH compares to any IETF PPP compression protocol that I can see.

> --- recognize that LZJH was a better compression solution and adopt it as
> V.44 despite the fact that virtually 100% of the world's modems at that
> time had V.42bis. They felt that as new modems were built V.44 would be
> phased in. The fact that V.42bis was the current de facto choice did not
> stop the ITU from adopting something better.

That is a point in favor of the ITU only if you assume your conclusion, that LZJH is better than v.42bis. I'm inclined to assume that LZJH is better than v.42bis, but that's not relevant to anything I see.

> --- recognize that the Calgary and Canterbury corpus were not
> representative of data being downloaded in today's internet and create a
> set of test files that are representative.

If you look at the archives for this mailing list, you'll find that those were not the only sets of test data considered or even the most influential.

> --- run comparison tests against HTML files from web sites worldwide with
> many different languages.

That strikes me as a distinctly minor and probably bogus consideration for the test sets. Picking the test data is not easy, but it is quite wrong to involve political correctness.

> I would assume that the IETF is forward looking enough to want to allow PPP
> implementors the choice if something better comes along regardless of the
> number of current CCP options and regardless of a current de facto choice.
> Most will run their own comparisons and decide for themselves which is
> best based on MIP's, memory, or other constraints which they may have.

That is mistaken. In the network world outside the ISO or ITU dream world, considerations of MIPs, memory, and so forth are minor compared to other issues starting with interoperation. (Yes, *this* is a slap at the standard old ISO OSI/ITU delusions of grandeur that what a standards committee considers important and desirable must matter outside hotel meeting rooms.)

> I would be happy to provide you with the source of LZJH if you want to run
> your own comparisons.

Thanks, but I trust that if you did an apples-to-apples comparison, you would report the results accurately.

> Yes, it is going to be an Informational RFC.

That short-circuits most of my concerns.


Vernon Schryver    vjs@xxxxxxxxxxxx
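
The disagreement above about clearing the dictionary after every packet versus keeping history across a stream of 1500-byte packets can be illustrated with a rough sketch. It does not use LZJH, LZW, or LZS: as a stand-in it uses raw DEFLATE from Python's `zlib` with a 2048-byte window, so the numbers say nothing about the algorithms argued over in this thread. The synthetic HTML-like data and the helper names (`make_sample`, `make_packets`, `size_with_reset_per_packet`, `size_with_shared_history`) are assumptions made for illustration only.

```python
import zlib

PACKET_SIZE = 1500  # IP-sized packets, as described in the thread
WINDOW_BITS = 11    # 2**11 = 2048-byte history, the size discussed above


def make_sample(rows=400):
    """Build synthetic, HTML-like data (an assumption, not real traffic)."""
    lines = [
        '<tr><td class="item">Item %d</td><td>%d</td></tr>\n' % (i, (i * 37) % 1000)
        for i in range(rows)
    ]
    return "".join(lines).encode("ascii")


def make_packets(data, size=PACKET_SIZE):
    """Split the data into packets of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def size_with_reset_per_packet(packets):
    """Total compressed bytes when every packet starts with an empty history."""
    total = 0
    for packet in packets:
        # A fresh compressor per packet models "clear on every packet".
        comp = zlib.compressobj(9, zlib.DEFLATED, -WINDOW_BITS)
        total += len(comp.compress(packet)) + len(comp.flush())
    return total


def size_with_shared_history(packets):
    """Total compressed bytes when history carries over between packets."""
    comp = zlib.compressobj(9, zlib.DEFLATED, -WINDOW_BITS)
    total = 0
    for packet in packets:
        # Z_SYNC_FLUSH emits everything for this packet but keeps the window.
        total += len(comp.compress(packet)) + len(comp.flush(zlib.Z_SYNC_FLUSH))
    return total


if __name__ == "__main__":
    packets = make_packets(make_sample())
    raw = sum(len(p) for p in packets)
    print("raw bytes:           ", raw)
    print("reset per packet:    ", size_with_reset_per_packet(packets))
    print("history across stream:", size_with_shared_history(packets))
```

A comparison in the spirit of the thread would feed the same packets, protocol headers included, through each candidate algorithm in the same mode with comparable memory, and compare only like with like.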