logo       

Re: Disk cache: msg#00017

network.bit-torrent.libtorrent

Subject: Re: Disk cache


On Apr 14, 2006, at 06:27, Radu Hociung wrote:

Arvid Norberg wrote:

On Apr 14, 2006, at 01:53, Radu Hociung wrote:

Greetings all,

I have done some experiments on libtorrent and Transmission, comparing
the effectiveness of their client-side caching and download scheduling
algorithms.

I put the results (with graphs) on Wikipedia at:

http://en.wikipedia.org/wiki/BitTorrent_performance

I hope that you'll find it a worthwhile read.

Very nice graphs :)

Caching blocks and only write whole pieces at once would definitely
improve that measurement. But I'm not sure it would improve in reality
(it probably would, a bit at least), since the OS's cache very likely is
done in the kernel, which means that you don't know how many of those
write-calls actually resulting in a write to the disk.

I doubt that writing whole pieces would make much difference. Looking at
libtorrent's current download pattern, and at the strace, it looks like
it writes the blocks of a piece in the proper order. So they probably
get assembled into the respective pieces in the OS's cache. If they were
being downloaded out of order, I would see a case for caching whole pieces.

I thought you downloaded a "wild" torrent, with more than one peer. If you had done that, you would have noticed that the write operations would have been scattered more. Because libtorrent is writing each block to disk, and even though blocks within a piece are downloaded sequentially, it would still be more seeking.

In high speeds libtorrent prefers to download whole pieces from single peers, which means block writes to different pieces would be interleaved.

However, you're right, about not knowing when the writes to disk
actually occur:

Say that the kernel starts off with an empty cache and 100MB free RAM.
When downloading a 5GB file, after the 1st 100MB are received, the OS
will have to start writing. It's reasonable that it will start flushing
the oldest cache slots first, so it is likely that after the fist 100MB
received, it will start writing the data it received first.

In the case of some advanced filesystems (such as XFS, which buffers
data as long as possible and allocates storage at the last possible
moment, when flushing buffers becomes innevitable), it is possible that
the cache won't always flush the first-received data, but it may try to
group buffers based on their relative closeness.

Maybe it's possible to get timestamps in the trace log. Then it would be possible to estimate when actual writes were done.

But in any case, when the file size is much larger than the available
cache memory, it is inevitable that the disk heads will jump around.

In my test setup, I only had 1 seed and 1 client, so to the client, all
the pieces it didn't already have were equally rare in the 'swarm', and
the random download pattern really did not have any purpose.


I was thinking of another way of scheduling downloads. With 100MB of
cache, the client could "focus" on downloading 100MB chunks of the
torrent, and select the rarest first within that 100MB chunk before
moving on to another chunk. the chunk could be a sliding window or such.
Before moving the sliding window, the client could do a 'sync' to force
data to be flushed. I would assume that on sync, the data would be
written in the most contiguous way possible.

I plan on experimenting with the scheduling routines of either
libtorrent or transmission to test out the various hypotheses.

The more I play around, the more I convince myself that it's the
scheduling algorithm that can bring the most benefits, not the caching

I'm a bit reluctant to change the random distribution + rarest first algorithm, because it is much more efficient in keeping a torrent alive than if chunks would be downloaded. (with alive I mean that the distributed copies >= 1).

algorithm. After all, in a case where most pieces are written to the
disk once, and never read back, the cache acts as little more than a
simple delayed-write buffer, whether it is an application cache or an OS
cache.

That is why I haven't made any real effort in implementing a cache in libtorrent so far.

BTW: Why are there 2 duplicate seeks before every write and 3 before
reads? Here's an example:

21561 write(6, /* head at 4161536 effectiveness 61.077 */0x8083e31,
16384) = 16384
21561 _llseek(6, 4177920, [4177920], /* BT efficiency 61.077% */
SEEK_SET) = 0
21561 _llseek(6, 0, [4177920], /* BT efficiency 61.077% */ SEEK_CUR) = 0
21561 write(6, /* head at 4177920 effectiveness 61.111 */0x8083e31,
16384) = 16384
21561 _llseek(6, 3145728, [3145728], /* BT efficiency 61.111% */
SEEK_SET) = 0
21561 _llseek(6, 0, [3145728], /* BT efficiency 61.111% */ SEEK_CUR) = 0
21561 _llseek(6, 0, [3145728], /* BT efficiency 61.111% */ SEEK_CUR) = 0
21561 read(6, /* head at 4194304 effectiveness 60.000 */0x404f4008,
1048576) = 1048576
21561 _llseek(6, 4194304, [4194304], /* BT efficiency 60.000% */
SEEK_SET) = 0
21561 _llseek(6, 0, [4194304], /* BT efficiency 60.000% */ SEEK_CUR) = 0
21561 write(6, /* head at 4194304 effectiveness 61.905 */0x404f4008,
1048576) = 1048576

Does this happen only with libtorrent, or with transmission too? If it's in both, I would guess that it's an implementation detail of the POSIX layer, that the POSIX calls don't translate directly to system calls. Since libtorrent has a pool of open files, each read or write operation has to seek before reading/writing, because there may have been other such operations since last time, which moved the file position. This means that even when reading sequentially, libtorrent will seek to the current position because it doesn't know that the last read/write operation left the file position at the place where it wants it for the next operation.

I can't imagine that there's any cost involved in doing that extra seek, the alternative would be to call tell() and the compare, and then seek. But the common case is that that seek is necessary anyway, it's just that in your case you only had one peer.


--
Arvid Norberg




-------------------------------------------------------
This SF.Net email is sponsored by xPML, a groundbreaking scripting language
that extends applications into web and mobile media. Attend the live webcast
and join the prime developer group breaking into this new coding territory!
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642


<Prev in Thread] Current Thread [Next in Thread>
Google Custom Search

News | FAQ | advertise