|
Re: Disk cache: msg#00030network.bit-torrent.libtorrent
Arvid Norberg wrote: > > On Apr 19, 2006, at 23:36, Radu Hociung wrote: > >> Arvid Norberg wrote: >> >>> >>> On Apr 14, 2006, at 17:05, Radu Hociung wrote: >>> >>>> Arvid Norberg wrote: >>>> >>>>> >>>>> On Apr 14, 2006, at 06:27, Radu Hociung wrote: >>>>> >>>>>> I doubt that writing whole pieces would make much difference. >>>>>> Looking at >>>>>> libtorrent's current download pattern, and at the strace, it looks >>>>>> like >>>>>> it writes the blocks of a piece in the proper order. So they >>>>>> probably >>>>>> get assembled into the respective pieces in the OS's cache. If they >>>>>> were >>>>>> being downloaded out of order, I would see a case for caching whole >>>>>> pieces. >>>>> >>>>> >>>>> >>>>> I thought you downloaded a "wild" torrent, with more than one >>>>> peer. If >>>>> you had done that, you would have noticed that the write operations >>>>> would have been scattered more. Because libtorrent is writing each >>>>> block >>>>> to disk, and even though blocks within a piece are downloaded >>>>> sequentially, it would still be more seeking. >>>>> >>>>> In high speeds libtorrent prefers to download whole pieces from >>>>> single >>>>> peers, which means block writes to different pieces would be >>>>> interleaved. >>>> >>>> >>>> >>>> Ah, of course. I should have thought of this. >>>> >>>> In order to answer questions regarding the behaviour of OS caches, >>>> there >>>> is a library available to study physical disk access, written at the >>>> Renaissance Computing Institute. It requires a patch to the kernel, as >>>> well as some calls in the user-level application to start/stop >>>> tracing. >>>> >>>> If the user-level calls were put in strace, then physical analysis >>>> of an >>>> application would be easier to apply to any client, without modifying >>>> the client itself. >>>> >>>> Here's the URL to the (open source) library: >>>> >>>> http://www.renci.org/research/cadre/CADRETraceLibraries/ >>>> CADREPhysicalIOLibrary.htm >>>> >>>> The testbench sw I am using is essentially strace, with >>>> modifications I >>>> made. I plan to release this testbench so that you can run it >>>> yourselves >>>> also, which I think will allow you to easily experiment and draw >>>> conclusions, without requiring guessing. >>> >>> >>> >>> Do you think it would be possible to use it on a clean strace log >>> (without the modifications?). I'm asking because in that case maybe it >>> would be possible to use it on a ktrace log (which is used on darwin). >> >> >> The CADRE library is not used on a log. Instead, it must be >> activated/deactivated somehow. The information it captures when >> activated is stored in a separate location, not in the strace log. >> >> Further, in order to use it, a patch must be applied to a linux kernel. >> This patch adds instrumentation to the device drivers. It's that >> instrumentation that is then started/stopped by making a call from the >> user level application. >> >> So, I don't think this library will run on darwin; I am not familiar >> with ktrace, but it looks to be an strace-like utility. Does the "-ti" >> option not log interractions of the kernel with the hardware? If it >> does, then you don't need CADRE in order to monitor the behaviour of the >> disk cache. > > > I was primarily thinking about your testbench. Perhaps I don't understand what you're asking. What would you like to use on a clean strace log? The only thing that my testbench does currently is to add a comment on every _lseek, read and write line showing cumulative efficiency on that file descriptor, along with the position of the drive heads at the time the read/write call was made. These two pieces of information are only used to plot the fancy graphs I've shown. You can certainly derive the same plots from a clean strace log, which contains all the necessary information. >>> Not in the case where you're guaranteed that the seed will stay >>> online, >>> but it's very uncommon to have such a guarantee. Usually, the seed >>> could go offline and a half-finished peer will come online. The >>> optimal >>> case would be that both peers would have 1 distributed copy and be >>> able >>> to complete the file. And closest one can get to that (without having >>> any global knowledge) is to download pieces in random. >> >> >> Ah, I see... I had not thought of such a scenario. Very interesting. >> >> So, in a torrent like you described (1 seed, 1 hidden/offline peer, 1 >> blind peer), assuming downloaders get 1/2 of the file each, they would >> be able to finish only if they have complementary halves of the file. >> >> With a randomization at the piece level, with say, 1,000 pieces in >> total, each of the downloaders need the right 500 pieces to make the >> torrent healthy. >> >> The probability for the blind client downloading the right 500 pieces, >> not knowing which 500 the hidden peer has, are about 1/(2^1000), since >> each piece has 1/2 chance of being the right piece, and that there are >> 1000 guesses to be made. >> >> I still think sequential downloading gives the torrent a better chance >> for healing: >> >> Say that each client downloads quarters of the file in random order. The >> pieces in each quarter are downloaded sequentially. (ie, all the pieces >> in quarter 1 are downloaded in sequential order, but the file may be >> downloaded as Q2,Q4,Q3,Q1) >> >> In this case, if the torrent behaves like you described above, the >> chances for the torrent remaining healthy with the blind and hidden >> peers present, but seed absent, are about 1/(2^4), since there are 4 >> guesses to be made. If each of the clients has 1/2 of the file when the >> seed goes away, they have a 1/16 chance of finishing. >> >> So comparing a download strategy of piece-by-piece-randomly (PBPR) or >> quarter-by-quarter-randomly (QBQR), the QBQR strategy has a 6.25% chance >> of finishing, while PBPR has no (1/(2^1000)) chance of finishing. >> >> I think even the 6.25% probability can be improved: >> >> If the file is downloaded in halves instead, there is a 25% chance of >> finishing this download. Perhaps each client can decide whether to >> download halves, quarters, eights or whatever, depending on how many >> peers it encounters: >> >> 1 seed, 0 peers = download halves >> 1 seed, 1 peer = download quarters >> 2 seeds, 2 peers = download eights >> >> What do you think? > > > I would most respectfully encourage you to think again :) > > To be honest, I'm not sure I follow all of your reasoning. But what you > do in practice is that you increase the piece sizes. This has the > effect of increasing overlap in the downloads. the chances of finishing > is not the only measurement, you have to consider the overlap in the > downloads. The goal is that any two peers should have as high > probability as possible to exchange data, two-way. The higher > granularity (number of pieces) the closer one would get to the optimal > case, when downloading random pieces. > > Also keep in mind that the peer has no global knowledge, so the number > of peers and seeds would have to be counted only among neighbours. > > You can probably get a very good explanation why this is the case in > either #bittorrent on freenode or on the bittorrent mailing list: > > http://lists.ibiblio.org/mailman/listinfo/bittorrent I will do more reading, but what I meant was for the "super-pieces" to not overlap. with a 1000-piece content, the first quarter would always be pieces 0-249, the second would be 250-499, and so on. Thanks for the pointers, I'll read the recommended info. Cheers, Radu. ------------------------------------------------------- Using Tomcat but need to do more? Need to support web services, security? Get stuff done quickly with pre-integrated technology to make your job easier Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 |
|
| <Prev in Thread] | Current Thread | [Next in Thread> |
|---|---|---|
| Previous by Date: | Re: Disk cache: 00030, Arvid Norberg |
|---|---|
| Next by Date: | Re: Disk cache: 00030, Radu Hociung |
| Previous by Thread: | Re: Disk cachei: 00030, MooPolice |
| Next by Thread: | Re: Disk cache: 00030, Arvid Norberg |
| Indexes: | [Date] [Thread] [Top] [All Lists] |
| News | FAQ | advertise |