|
|
Re: Disk cache: msg#00017
network.bit-torrent.libtorrent
On Apr 14, 2006, at 06:27, Radu Hociung wrote:
Arvid Norberg wrote:
On Apr 14, 2006, at 01:53, Radu Hociung wrote:
Greetings all,
I have done some experiments on libtorrent and Transmission,
comparing
the effectiveness of their client-side caching and download
scheduling
algorithms.
I put the results (with graphs) on Wikipedia at:
http://en.wikipedia.org/wiki/BitTorrent_performance
I hope that you'll find it a worthwhile read.
Very nice graphs :)
Caching blocks and only write whole pieces at once would definitely
improve that measurement. But I'm not sure it would improve in
reality
(it probably would, a bit at least), since the OS's cache very
likely is
done in the kernel, which means that you don't know how many of those
write-calls actually resulting in a write to the disk.
I doubt that writing whole pieces would make much difference.
Looking at
libtorrent's current download pattern, and at the strace, it looks
like
it writes the blocks of a piece in the proper order. So they probably
get assembled into the respective pieces in the OS's cache. If they
were
being downloaded out of order, I would see a case for caching whole
pieces.
I thought you downloaded a "wild" torrent, with more than one peer.
If you had done that, you would have noticed that the write
operations would have been scattered more. Because libtorrent is
writing each block to disk, and even though blocks within a piece are
downloaded sequentially, it would still be more seeking.
In high speeds libtorrent prefers to download whole pieces from
single peers, which means block writes to different pieces would be
interleaved.
However, you're right, about not knowing when the writes to disk
actually occur:
Say that the kernel starts off with an empty cache and 100MB free RAM.
When downloading a 5GB file, after the 1st 100MB are received, the OS
will have to start writing. It's reasonable that it will start
flushing
the oldest cache slots first, so it is likely that after the fist
100MB
received, it will start writing the data it received first.
In the case of some advanced filesystems (such as XFS, which buffers
data as long as possible and allocates storage at the last possible
moment, when flushing buffers becomes innevitable), it is possible
that
the cache won't always flush the first-received data, but it may
try to
group buffers based on their relative closeness.
Maybe it's possible to get timestamps in the trace log. Then it would
be possible to estimate when actual writes were done.
But in any case, when the file size is much larger than the available
cache memory, it is inevitable that the disk heads will jump around.
In my test setup, I only had 1 seed and 1 client, so to the client,
all
the pieces it didn't already have were equally rare in the 'swarm',
and
the random download pattern really did not have any purpose.
I was thinking of another way of scheduling downloads. With 100MB of
cache, the client could "focus" on downloading 100MB chunks of the
torrent, and select the rarest first within that 100MB chunk before
moving on to another chunk. the chunk could be a sliding window or
such.
Before moving the sliding window, the client could do a 'sync' to
force
data to be flushed. I would assume that on sync, the data would be
written in the most contiguous way possible.
I plan on experimenting with the scheduling routines of either
libtorrent or transmission to test out the various hypotheses.
The more I play around, the more I convince myself that it's the
scheduling algorithm that can bring the most benefits, not the caching
I'm a bit reluctant to change the random distribution + rarest first
algorithm, because it is much more efficient in keeping a torrent
alive than if chunks would be downloaded. (with alive I mean that the
distributed copies >= 1).
algorithm. After all, in a case where most pieces are written to the
disk once, and never read back, the cache acts as little more than a
simple delayed-write buffer, whether it is an application cache or
an OS
cache.
That is why I haven't made any real effort in implementing a cache in
libtorrent so far.
BTW: Why are there 2 duplicate seeks before every write and 3 before
reads? Here's an example:
21561 write(6, /* head at 4161536 effectiveness 61.077 */0x8083e31,
16384) = 16384
21561 _llseek(6, 4177920, [4177920], /* BT efficiency 61.077% */
SEEK_SET) = 0
21561 _llseek(6, 0, [4177920], /* BT efficiency 61.077% */
SEEK_CUR) = 0
21561 write(6, /* head at 4177920 effectiveness 61.111 */0x8083e31,
16384) = 16384
21561 _llseek(6, 3145728, [3145728], /* BT efficiency 61.111% */
SEEK_SET) = 0
21561 _llseek(6, 0, [3145728], /* BT efficiency 61.111% */
SEEK_CUR) = 0
21561 _llseek(6, 0, [3145728], /* BT efficiency 61.111% */
SEEK_CUR) = 0
21561 read(6, /* head at 4194304 effectiveness 60.000 */0x404f4008,
1048576) = 1048576
21561 _llseek(6, 4194304, [4194304], /* BT efficiency 60.000% */
SEEK_SET) = 0
21561 _llseek(6, 0, [4194304], /* BT efficiency 60.000% */
SEEK_CUR) = 0
21561 write(6, /* head at 4194304 effectiveness 61.905 */0x404f4008,
1048576) = 1048576
Does this happen only with libtorrent, or with transmission too? If
it's in both, I would guess that it's an implementation detail of the
POSIX layer, that the POSIX calls don't translate directly to system
calls. Since libtorrent has a pool of open files, each read or write
operation has to seek before reading/writing, because there may have
been other such operations since last time, which moved the file
position. This means that even when reading sequentially, libtorrent
will seek to the current position because it doesn't know that the
last read/write operation left the file position at the place where
it wants it for the next operation.
I can't imagine that there's any cost involved in doing that extra
seek, the alternative would be to call tell() and the compare, and
then seek. But the common case is that that seek is necessary anyway,
it's just that in your case you only had one peer.
--
Arvid Norberg
-------------------------------------------------------
This SF.Net email is sponsored by xPML, a groundbreaking scripting language
that extends applications into web and mobile media. Attend the live webcast
and join the prime developer group breaking into this new coding territory!
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642
|
|