logo       

Re: Disk cache: msg#00010

network.bit-torrent.libtorrent

Subject: Re: Disk cache

On Apr 10, 2006, at 20:53, Radu Hociung wrote:

Hello Arvid, and all,

I think caching at the application level can be far more efficient than
at the O/S level, as the application typically can manage
characteristics of the dataflow, whereas the O/S cannot. Please see below.

Yes, this is true in general, my concern is that the characteristics of the dataflow in the bittorrent case is very close to completely random. Talking about the request pattern, the only exception to randomness is the rarest first-algorithm. But this algorithm will also make sure that the distribution among the peers is as uniform as possible at all time, keeping the request pattern as random as possible.

It is true though, that in the case where the swarm has tendencies of being "serialized", it is very common to request pieces which were just made available from a peer. But I'm not sure that is a very common case. Obviously some data is needed here.

I will use "cheap data" and "expensive data" to refer to in-buffer and
on-disk data respectively.

The theory of Nash equilibrium applies very much to torrents, I think.
Downloading torrents is a non-cooperative game, in that each client is
concerned with ensuring only its own success, and the strategy it must
employ is of maximizing the chance for success while minimizing its cost
of achieving success.

true.

Please see below for my comments on caching.

Arvid Norberg wrote:
There has been some discussion on the IRC channel about benefits and
possible cache algorithms for a disk cache in libtorrent. I'm posting
this hoping that somebody could comment on some of the techniques or
assumptions.

The way I see it, there are two problems that a cache aims to solve:

1. seeding on a high speed connection will make the hard drive seek a
lot back and forth, just to do short reads.

2. downloading on a high speed connection will make the hard drive seek
a lot back and forth.


For 1, one assumption that can be made is that if block 0 is requested
from a piece, it's very likely that many more blocks from that piece
will be requested soon. So caching the entire piece at once would
probably be a good idea.

This is exactly what a normal OS' disk cache does. It reads ahead into
the cache. So I don't really see any benefit if caching this at
application level. One possible optimization would be if the disk cache
could be bypassed so that only one piece was cached (since the disk
cache in the OS may cache more than the piece we're reading from). This
would be hard to do, and especially hard to do in a platform
independent manner.

Is this a reasonable assumption?

If the application is able to fill the available outgoing bandwidth with
cheap data, there is no benefit to reading more/other data from the
disk, is there?

I think it would be a good idea for the library to prefer to respond to
requests for cheap data, and postpone responses to requests for
expensive data until the utilization of the available outgoing bandwidth
drops. Clients that request expensive data should be choked until cheap
data can no longer fill the outgoing pipe, and it becomes necessary to
read expensive data to fill the pipe.

The client isn't completely free of choosing who to upload to/whose requests to respond do. In order to maximize download speed, it has to upload to those that it can download from. I also think that the most common case is that the output pipe very seldom is saturated. So, I'm not sure weighting peers depending on the pieces they request would work very well, it would affect the behavior of the client in other ways.

Your distinction of cheap and expensive data doesn't take seek distance into account. I think that may be the most important thing to concentrate on, at least when it comes to writing.

Number 2 is a little bit more interesting though. Since every block
that is downloaded has to be written to disk, and since every block is
only downloaded once, you cannot really optimize away disk writes. What
you can do though, is to optimize the disk seeking.

Absolutely. Also, since while downloading, the client also uploads, it
would be more disk efficient to only seek for disk writes, but never for
disk reads. Ie, the client should prefer to respond to requests for data
is has freshly downloaded, instead of reading other data from disk.
Assuming that the client downloads least-common data first, it's a
reasonable assumtion that the data it holds in memory buffers is still
rare to other clients.

Yes, at least to a certain degree. One thing to keep in mind though, is that the number of pieces in a torrent is generally much more than the number of peers that a client can see. How rare a piece is is quantized to the number of peers that has it. This means that the number of equally rare pieces is usually very high, far higher than a write cache would be.

In this case, the library should write to disk when downloaded pieces
are complete, but do not free the buffers until other clients have
downloaded them enough to cause the uplink utilization to drop.

I have thought about optimizing the dataflow a lot. I believe there are
at least two good reasons for improving caching of torrents:

1. Fewer seeks == less noise == drives lasting longer.
2. Fewer seeks == drive is more responsive to other apps that run
concurently.

Definitely, I think the disk perfomance is very important.

I might add that there's a huge difference for me (OSX 10.4) when downloading in full allocation mode and compact allocation mode. The compact allocation mode will have the effect that most reads and writes are kept relatively close to each other. They will spread out over time of course though. But just downloading with client_test will saturate my connection at slightly above 1 MB/s if I'm in compact mode. In full allocation mode though, I will not get above about 600 kB/s, and my drive is really making a noise and slowing down the computer to be virtually useless.

I believe the on-drive cache is better than nothing, O/S cache is better
than the on-drive cache, and application cache is better than O/S cache.
The closer the cache is to the data generation/consumption algorithm,
the more information is available for optimizing it.

I think this approach to caching gives an answer to multiple-torrent
caching as well.

What do you mean? Do you think there would be a point of having a shared cache for all torrents?

One benefit would be that if you have 100 torrents but only 2 of them are active, they will get all the cache for themselves.

One more thing:

I have considered working on a test-bench that is able to run a swarm.
The testbench would be able to report disk-access efficiency, by
measuring some metrics, like how far apart the written pieces are in
memory (ie, seek throw), total number of accesses, etc. The idea is that
it can be calculated empirically what the minimum number of accesses,
and long seeks is to fully download a file, while saturating the output
link. The metrics given by the testbench could be compared to the ideal,
and a measure of client efficiency would thus be obtained. (Naturally,
min-long-seeks = 0, and min-accesses = file-size/buffer-size)

The test-bench would hook O/S calls in a manner similar to strace, and
would be able to run multiple makes/brands/versions of clients, such
that swarm behaviour can be observed. This would be very useful to study
potential protocol changes and their effect on older-version clients.

That sounds like it could be very useful!

I am very interested in your thoughts and comments on caching,

Ditto :)

So, I think my first attempt will be to cache pieces, and only write whole pieces instead of blocks to disk. To see if the performance is improved. I don't know how to measure it though, maybe I could try saturating my download link in full allocation mode and see if it increases.

Do you have any other suggestions that are reasonably easy to try out?

--
Arvid Norberg




-------------------------------------------------------
This SF.Net email is sponsored by xPML, a groundbreaking scripting language
that extends applications into web and mobile media. Attend the live webcast
and join the prime developer group breaking into this new coding territory!
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642


<Prev in Thread] Current Thread [Next in Thread>
Google Custom Search

News | FAQ | advertise