logo       

Re: Csum and csum copyroutines benchmark: msg#00518

Subject: Re: Csum and csum copyroutines benchmark
On Fri, 2002-10-25 at 14:59, Denis Vlasenko wrote:
> Well, that makes it run entirely in L0 cache. This is unrealistic
> for actual use. movntq is x3 faster when you hit RAM instead of L0.
> 
> You need to be more clever than that - generate pseudo-random
> offsets in large buffer and run on ~1K pieces of that buffer.

In a lot of cases its extremely realistic to assume the network buffers
are in cache. The copy/csum path is often touching just generated data,
or data we just accessed via read(). The csum RX path from a card with
DMA is probably somewhat different.





<Prev in Thread] Current Thread [Next in Thread>