logo       

Re: Disk cache: msg#00018

network.bit-torrent.libtorrent

Subject: Re: Disk cache

Arvid Norberg wrote:
>
> On Apr 14, 2006, at 06:27, Radu Hociung wrote:
>
>> Arvid Norberg wrote:
>>>
>>> On Apr 14, 2006, at 01:53, Radu Hociung wrote:
>>>
>>>> Greetings all,
>>>>
>>>> I have done some experiments on libtorrent and Transmission, comparing
>>>> the effectiveness of their client-side caching and download scheduling
>>>> algorithms.
>>>>
>>>> I put the results (with graphs) on Wikipedia at:
>>>>
>>>> http://en.wikipedia.org/wiki/BitTorrent_performance
>>>>
>>>> I hope that you'll find it a worthwhile read.
>>>
>>> Very nice graphs :)
>>>
>>> Caching blocks and only write whole pieces at once would definitely
>>> improve that measurement. But I'm not sure it would improve in reality
>>> (it probably would, a bit at least), since the OS's cache very likely is
>>> done in the kernel, which means that you don't know how many of those
>>> write-calls actually resulting in a write to the disk.
>>
>> I doubt that writing whole pieces would make much difference. Looking at
>> libtorrent's current download pattern, and at the strace, it looks like
>> it writes the blocks of a piece in the proper order. So they probably
>> get assembled into the respective pieces in the OS's cache. If they were
>> being downloaded out of order, I would see a case for caching whole
>> pieces.
>
> I thought you downloaded a "wild" torrent, with more than one peer. If
> you had done that, you would have noticed that the write operations
> would have been scattered more. Because libtorrent is writing each block
> to disk, and even though blocks within a piece are downloaded
> sequentially, it would still be more seeking.
>
> In high speeds libtorrent prefers to download whole pieces from single
> peers, which means block writes to different pieces would be interleaved.

Ah, of course. I should have thought of this.

In order to answer questions regarding the behaviour of OS caches, there
is a library available to study physical disk access, written at the
Renaissance Computing Institute. It requires a patch to the kernel, as
well as some calls in the user-level application to start/stop tracing.

If the user-level calls were put in strace, then physical analysis of an
application would be easier to apply to any client, without modifying
the client itself.

Here's the URL to the (open source) library:

http://www.renci.org/research/cadre/CADRETraceLibraries/CADREPhysicalIOLibrary.htm

The testbench sw I am using is essentially strace, with modifications I
made. I plan to release this testbench so that you can run it yourselves
also, which I think will allow you to easily experiment and draw
conclusions, without requiring guessing.


>> However, you're right, about not knowing when the writes to disk
>> actually occur:
>>
>> Say that the kernel starts off with an empty cache and 100MB free RAM.
>> When downloading a 5GB file, after the 1st 100MB are received, the OS
>> will have to start writing. It's reasonable that it will start flushing
>> the oldest cache slots first, so it is likely that after the fist 100MB
>> received, it will start writing the data it received first.
>>
>> In the case of some advanced filesystems (such as XFS, which buffers
>> data as long as possible and allocates storage at the last possible
>> moment, when flushing buffers becomes innevitable), it is possible that
>> the cache won't always flush the first-received data, but it may try to
>> group buffers based on their relative closeness.
>
> Maybe it's possible to get timestamps in the trace log. Then it would be
> possible to estimate when actual writes were done.

It would be very easy to add timestamps, you can use the same command
line as I show in wikipedia, but add "-tt" (timestamp) and/or -T (call
duration) to the strace options.

You can this way profile any torrent.


>> But in any case, when the file size is much larger than the available
>> cache memory, it is inevitable that the disk heads will jump around.
>>
>> In my test setup, I only had 1 seed and 1 client, so to the client, all
>> the pieces it didn't already have were equally rare in the 'swarm', and
>> the random download pattern really did not have any purpose.
>>
>>
>> I was thinking of another way of scheduling downloads. With 100MB of
>> cache, the client could "focus" on downloading 100MB chunks of the
>> torrent, and select the rarest first within that 100MB chunk before
>> moving on to another chunk. the chunk could be a sliding window or such.
>> Before moving the sliding window, the client could do a 'sync' to force
>> data to be flushed. I would assume that on sync, the data would be
>> written in the most contiguous way possible.
>>
>> I plan on experimenting with the scheduling routines of either
>> libtorrent or transmission to test out the various hypotheses.
>>
>> The more I play around, the more I convince myself that it's the
>> scheduling algorithm that can bring the most benefits, not the caching
>
> I'm a bit reluctant to change the random distribution + rarest first
> algorithm, because it is much more efficient in keeping a torrent alive
> than if chunks would be downloaded. (with alive I mean that the
> distributed copies >= 1).

I understand, but in some cases the random distribution does not do
anything. In the case I show of 1:1, it's no safer to download data out
of order than it is to download sequentially I think.

Also, in the case of a perfect torrent (more seeds than downloaders, say
20:5), say that two pieces have availability of 20 and 24 respectively.
Does it really make a difference if you download the more available
piece first? Is it likely that the 20 seeds will leave the torrent,
rendering it dead? The client should constantly evaluate risk, and
switch to "survival" mode only when necessary, but it's overkill to
always run in "survival" mode.


>> algorithm. After all, in a case where most pieces are written to the
>> disk once, and never read back, the cache acts as little more than a
>> simple delayed-write buffer, whether it is an application cache or an OS
>> cache.
>
> That is why I haven't made any real effort in implementing a cache in
> libtorrent so far.
>
>> BTW: Why are there 2 duplicate seeks before every write and 3 before
>> reads? Here's an example:
>>
>> 21561 write(6, /* head at 4161536 effectiveness 61.077 */0x8083e31,
>> 16384) = 16384
>> 21561 _llseek(6, 4177920, [4177920], /* BT efficiency 61.077% */
>> SEEK_SET) = 0
>> 21561 _llseek(6, 0, [4177920], /* BT efficiency 61.077% */ SEEK_CUR) = 0
>> 21561 write(6, /* head at 4177920 effectiveness 61.111 */0x8083e31,
>> 16384) = 16384
>> 21561 _llseek(6, 3145728, [3145728], /* BT efficiency 61.111% */
>> SEEK_SET) = 0
>> 21561 _llseek(6, 0, [3145728], /* BT efficiency 61.111% */ SEEK_CUR) = 0
>> 21561 _llseek(6, 0, [3145728], /* BT efficiency 61.111% */ SEEK_CUR) = 0
>> 21561 read(6, /* head at 4194304 effectiveness 60.000 */0x404f4008,
>> 1048576) = 1048576
>> 21561 _llseek(6, 4194304, [4194304], /* BT efficiency 60.000% */
>> SEEK_SET) = 0
>> 21561 _llseek(6, 0, [4194304], /* BT efficiency 60.000% */ SEEK_CUR) = 0
>> 21561 write(6, /* head at 4194304 effectiveness 61.905 */0x404f4008,
>> 1048576) = 1048576
>
> Does this happen only with libtorrent, or with transmission too? If it's
> in both, I would guess that it's an implementation detail of the POSIX
> layer, that the POSIX calls don't translate directly to system calls.
> Since libtorrent has a pool of open files, each read or write operation
> has to seek before reading/writing, because there may have been other
> such operations since last time, which moved the file position. This
> means that even when reading sequentially, libtorrent will seek to the
> current position because it doesn't know that the last read/write
> operation left the file position at the place where it wants it for the
> next operation.
>
> I can't imagine that there's any cost involved in doing that extra seek,
> the alternative would be to call tell() and the compare, and then seek.
> But the common case is that that seek is necessary anyway, it's just
> that in your case you only had one peer.


This only happens in libtorrent; transmission seeks exactly once for
both reads and writes.

So if you do the SEEK_CUR in order to ensure the cursor is where you
expect it, it means you don't have an atomic lock between the SEEK_SET
and the read/write? If you don't have such a lock and your app is
multi-threaded, what prevents another thread from seeking between your
SEEK_CUR and the read/write?

Radu.



-------------------------------------------------------
This SF.Net email is sponsored by xPML, a groundbreaking scripting language
that extends applications into web and mobile media. Attend the live webcast
and join the prime developer group breaking into this new coding territory!
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642


<Prev in Thread] Current Thread [Next in Thread>
Google Custom Search

News | FAQ | advertise