RE: Freetds\sql server lag: msg#00272db.tds.freetds
On Thu, 2003-11-27 at 03:09, Craig Jackson wrote:
> >> From: ZIGLIO Frediano [mailto:Frediano.Ziglio@xxxxxxxxxxxx]
> >> Sent: November 25, 2003 8:52 AM
> >>
> >> > It's a very interesting option! However, I don't know all the
> >> > disadvantages it may have. Does anyone know a way to "flush" a socket
> >> > correctly? By the way... attached is a patch to enable TCP_NODELAY.
> >> >
> >> http://www.unixguide.net/network/socketfaq/2.11.shtml
> >> http://freebooks.by.ru/view/SambaIn24h/ch23.htm
> >>
> >> Perhaps it's really a good idea... however I still don't
> >> understand why there isn't a flush call for sockets :(
>
> >AIUI there's no flush call for a socket because the receiver isn't passive.
> >A disk is always prepared to receive data, but a socket peer may not be.
> >The best you can do is post your data, and let the network do its job.
>
> >I'd like to understand better what's going on. Craig Jackson, can you help
> >us out here?
>
> I'll give it a try. I can't claim I'm an expert.
>
> >What puzzles me: TCP_NODELAY involves flushing small packets, instead of
> >bundling them together. Examples given are mouse data or vi sessions. But
> >TDS doesn't involve small bits of data. Even a small query has a header and
> >its TDS packet.
>
> >It is possible that a query may not quite fit in a packet. Say, with all
> >overhead included, we had a 513-byte query, and we write our 512-byte packet
> >(with a "more data" flag). Then we write our last byte, including its
> >8-byte TDS header, of course. Will those 9 bytes stay parked in our local
> >network buffers for some non-trivial time? Can that really account for your
> >statistics?
>
> Nagle's algorithm, per RFC896, doesn't care about how "small" the packet is.
> All it cares about is that the first packet hasn't been acknowledged yet. The
> only "smallness" involved is the fact that with 512-byte TDS packets, the
> query in question will be sent in multiple writes to the socket. If the TDS
> packet size is 4096 bytes, it will all be sent in a single write.
>
> Without disabling Nagle's algorithm, those 9 bytes will stay parked in the
> buffer until the first 512 have been acked or a retransmit timer goes off,
> whichever comes first.
>
> This could also be avoided by sending all of the TDS packets in a single
> socket write (i.e. buffering them up). But that's probably more trouble than
> it's worth.
>
> >If I understand correctly, the remnant packet will wait in the client's
> >buffer until its predecessor has been acknowledged, i.e., until the window is
> >wide open. I guess on an Ethernet the delay isn't noticeable, and
> >database-style client/server interactions make somewhat atypical use of the
> >network. It's hard for me to believe that's normally how things work, that
> >there's no way to say, "OK, I'm done. It's his turn to talk now."
>
> >If I've got the above all correct, there are only two partial solutions:
>
> >1. Ideally (I think), we would be able to set the TCP PUSH flag to indicate
> >we're done. That would cause the TCP stack to transmit the not-full packet
> >immediately, provided the window is open, without waiting for
> >acknowledgement of the prior packet. Many (most?) implementations provide
> >no interface to set the PUSH flag, however; according to the RFC, it's
> >optional.
>
> The socket API, to my knowledge, does not provide a mechanism for setting the
> PUSH flag. It essentially treats all writes as including the PUSH flag, and
> then Nagle's algorithm overrides that.
>
> >2. Setting TCP_NODELAY, to force every packet out ASAP after write(2)
> >completes. Again, though, not every setsockopt(2) supports this option.
>
> I'm not familiar with a socket implementation that doesn't provide
> TCP_NODELAY. In any case, you should be able to test for it.
>
> >Which makes your patch look pretty good, for those implementations that
> >support it.
>
> I think you should go for it. Nagle's algorithm was designed to avoid
> excessive packets containing only a single byte of data. 512-byte TDS packets
> don't really fit that definition. They're a bit network-abusive by today's
> standards, but not really by the standards of 1984 when RFC896 was written.
>
> If you're paranoid, you could mimic the Samba solution. Make it a
> freetds.conf option, and then recommend that everyone set it.
>

As anybody here should know, TDS is a "half-duplex" protocol. The client
sends a request, processes the server's reply, and so on. Requests are
usually small (just a select, insert or update with small data) and should
reach the server ASAP to get processed. In my tests TCP_NODELAY makes a big
difference, so I enabled it by default. We already use buffers, so I don't
think it's a big problem. The only exception is a huge request (like
inserting an image into a db) with small packets (512 bytes). This will lead
to a bit of fragmentation... Under Linux we could use the TCP_CORK flag. This
can maximize throughput, but it is not a portable solution... Another
solution would be to use a bigger buffer (like 8/16K) even with small packets
and enable TCP_NODELAY... Under FreeBSD there is a TCP_NOPUSH flag, but its
behavior is different from TCP_CORK...

freddy77
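
For readers following the thread, here is a minimal sketch of what enabling TCP_NODELAY on a socket can look like in C, including the compile-time test for the option that Craig suggests. This is not the actual FreeTDS patch: the helper names `example_set_nodelay` and `example_get_nodelay` are hypothetical and exist only for illustration. It assumes a POSIX socket API.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Hypothetical helper (not a FreeTDS function): disable Nagle's algorithm
 * on a TCP socket so that small writes are sent without waiting for the
 * previous segment to be acknowledged.
 * Returns 0 on success, -1 on failure or if TCP_NODELAY is not defined. */
int example_set_nodelay(int sock)
{
#ifdef TCP_NODELAY
    int on = 1;
    return setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
#else
    (void) sock;   /* option not available on this platform */
    return -1;
#endif
}

/* Hypothetical helper: report whether TCP_NODELAY is currently set.
 * Returns 1 if set, 0 if not set, -1 on error or if unsupported. */
int example_get_nodelay(int sock)
{
#ifdef TCP_NODELAY
    int value = 0;
    socklen_t len = sizeof(value);
    if (getsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &value, &len) != 0)
        return -1;
    return value != 0;
#else
    (void) sock;
    return -1;
#endif
}
```

A caller would typically invoke `example_set_nodelay(fd)` once, right after creating or connecting the socket, and treat a failure as non-fatal, since the connection still works with Nagle's algorithm left on.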