osdir.com
mailing list archive

Subject: Matthew Quinney is out of the office. - msg#00164

List: linux.rpm.yum





I will be out of the office starting 13/10/2003 and will not return until
20/10/2003.

I will respond to your message when I return.


Thread at a glance:

Previous Message by Date:

Re: The Future of urlgrabber

On Mon, 2003-10-13 at 09:58, Michael Stenner wrote:
> > Also, it's the same as arguing that you should just suck copies of
> > every library into every application or at the minimum that you
> > should always do static linking.
>
> Slow down there. Every library into every application? Saying you
> should write _a_ program in python is not the same as saying you
> should write EVERY program in python. By no means would a decision to
> write urlgrabber for "slurping" imply that all libraries should be
> written that way.

Yes, I'm somewhat sensationalist -- but being reasonable never seems to
get results ;) Also, everything here is a generalization; there are cases
where what I describe doesn't happen, but they're the exception rather
than the rule. And counting on being the exception tends to be a losing
strategy in my experience.

Unfortunately, it's pretty clear from my watching things that when you
start sucking libraries (which is what urlgrabber is, really), things
are ugly in almost all cases. E.g., problems with libegg getting out of
date in various GNOME apps using it and thus causing problems, having to
do upgrades of six programs because of an exploit in one library
(*cough*zlib*cough*), etc. It also tends to encourage a lack of API
stability which is just as painful for users of your library when they
want to upgrade for bugfixes :/

Plus, sucking in copies tends to lead to forks because "well, I've got
the code here, why not make this little change that makes my life
easier". It's far harder to do that when the library is external.

> We now have three alternatives on the table:
> 1) simple external
>    + very tidy
>    - major constraints on the code (preserve backward compat, etc)
>    + apps get automatic bugfixes with urlgrabber upgrade

Well, you obviously start with this in any case.

> 2) parallel external
>    - not so tidy
>    + fewer constraints (change major num when BC breaks)
>    + apps get automatic bugfixes with urlgrabber upgrade

You only do this when you have to. It's not the sort of thing that
should be happening often. It should be planned well in advance and
gone into knowing that you're breaking compatibility against all wants,
hopes and desires :)

> 3) slurped internal
>    + very tidy
>    + no constraints (apps slurp whenever they want/can)
>    - no automatic bugfixes - must slurp new version

See above :)

Cheers,

Jeremy

Next Message by Date:

Re: The Future of urlgrabber

On Mon, Oct 13, 2003 at 03:25:19PM -0400, Jeremy Katz wrote:
> Unfortunately, it's pretty clear from my watching things that when you
> start sucking libraries (which is what urlgrabber is, really), things
> are ugly in almost all cases. eg, problems with libegg getting out of
> date in various GNOME apps using it and thus causing problems, having to
> do upgrades of six programs because of an exploit in one library
> (*cough*zlib*cough*), etc. It also tends to encourage a lack of API
> stability which is just as painful for users of your library when they
> want to upgrade for bugfixes :/
>
> Plus, sucking in copies tends to lead to forks because "well, I've got
> the code here, why not make this little change that makes my life
> easier". It's far harder to do that when the library is external.

OK, those are all excellent points. I'm 99% convinced.

> > 2) parallel external
> >    - not so tidy
> >    + fewer constraints (change major num when BC breaks)
> >    + apps get automatic bugfixes with urlgrabber upgrade
>
> You only do this when you have to. It's not the sort of thing that
> should be happening often. It should be planned well in advance and
> gone into knowing that you're breaking compatibility against all wants,
> hopes and desires :)

Just a curiosity. How would this work for a python module? I'm
thinking that urlgrabber will take on a structure like this:

  urlgrabber/__init__.py
  urlgrabber/<main-file-formerly-"urlgrabber.py">
  urlgrabber/keepalive.py
  urlgrabber/progress_meter.py

And then if the parallel external route gets taken, it would be done as:

  urlgrabber2/__init__.py
  ...

Is that what you would have in mind?

-Michael
--
Michael Stenner                    Office Phone: 919-660-2513
Duke University, Dept. of Physics  mstenner@xxxxxxxxxxxx
Box 90305, Durham N.C. 27708-0305
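As a rough illustration of the "parallel external" layout asked about above: if two package directories such as urlgrabber/ and urlgrabber2/ were installed side by side, an application that can work with either could pick whichever is importable. This is only a sketch under that assumption. The helper name import_first is hypothetical and is not part of urlgrabber, and the demo uses standard-library module names so that it runs anywhere.

```python
import importlib


def import_first(candidates):
    """Return the first importable module among the candidate names."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            # This name is not installed; try the next one.
            continue
    raise ImportError("none of %s could be imported" % ", ".join(candidates))


# Demo with stand-in names: the first does not exist, so "json" is used.
# An application might instead pass ["urlgrabber2", "urlgrabber"].
chosen = import_first(["no_such_module_for_demo", "json"])
```

An application written strictly against the newer interface would simply import the newer package by name; the fallback only makes sense for code that copes with both interfaces.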

Previous Message by Thread:

gpgcheck

When doing 'yum update' yum downloads all the required packages - and then does the GPG check. If it fails - it gives an error (with the package name for the failed check) and aborts. Can this behavior be changed - so that it does the GPG check for all packages - and gives the complete list of packages that the check failed on - before aborting? thanks, Satish

Next Message by Thread:

new urlgrabber design

Again, if you don't know what urlgrabber is, you don't need to read
this. I am actively requesting input from Jeremy, Seth, and Icon. I
would love input from others as well (Ryan?), but these are the ones
that will get the beatings.

Here is the basic design that I have in mind. This (intentionally)
has no mention of internal workings. It only discusses things that
matter to someone that would USE the module. Internal design is
certainly open for discussion, but I only want to talk about it now to
the extent that it affects interface.

-Michael

=======================================================================

MAIN FUNCTIONS:

  urlgrab   -- Fetch a url and make a local copy. Return the filename.
  urlopen   -- Return a file object for the specified url.
  urlread   -- Read the specified file into a string and return it.
  retrygrab -- Wrapper for urlgrab that retries given certain errors.
  retryopen -- Wrapper for urlopen that retries given certain errors.
  retryread -- Wrapper for urlread that retries given certain errors.

  NOTE: retryopen can't protect you from errors that occur AFTER the
  connection is made. It can only retry setting up the connection.

FEATURES:

  * identical behavior for http, ftp, and file

    Options that change the behavior for one protocol (like
    copy_local) are OK as long as they don't affect the other
    protocols. However, something like byte-ranges MUST work for all
    protocols. These are different because byte-ranges CHANGE the
    return value for a given input. copy_local only modifies the
    internal behavior. All options must be syntactically legal for
    ALL urls. The whole point is to have the library not care what
    sort of url is passed in.

  * smart url interpretation
    - handle "normal local filenames" also
    - handle url-encoded username/password for ftp and http
      (and file? smb?)
  * byte ranges
  * reget support
    - internally supported via byte ranges
    - several reget modes
      + never: always start from the beginning
      + force: always pick up from the end of the local file
      + smart: check timestamps, length, etc.
  * throttling
  * progress meter
  * i18n support (if the calling application provides translations)
  * settable User-Agent
  * http keepalive (via the keepalive module)
  * timestamp preservation

INTERFACE:

  I'm considering changing the function interface a little. There are
  just getting to be an insane number of options, and I'm not sure how
  to deal with it. There is also the issue of passing options through
  retry*.

  Option 1 (the way it is now, everything is a kwarg)

    def urlgrab(url, filename=None, copy_local=0, close_connection=0,
                progress_obj=None, throttle=None, bandwidth=None):

    def retrygrab(url, filename=None, copy_local=0, close_connection=0,
                  progress_obj=None, throttle=None, bandwidth=None,
                  numtries=3, retrycodes=[-1,2,4,5,6,7], checkfunc=None):

    This is REALLY ugly and it makes it very hard to cleanly add
    options. Specifically, what if someone does:

      retrygrab(url, fn, 1, 0, None, None, None, 5)  # the last is numtries

    and then we later add more options to urlgrab? Sure, it's not
    likely, and sure, I put a warning to only use these as kwargs in
    the doc, but still. It's very icky. However, it is very clear
    and very normal.

  Option 2

    def urlgrab(url, filename=None, **kwargs):
    def retrygrab(url, filename=None, **kwargs):

    retrygrab could then strip out the options it cares about and pass
    on the rest. This makes the function definition very clean, but
    completely useless to look at. The legal args would have to go in
    the docs. One of the up-sides is that things could ONLY be called
    as keyword args so the ordering is irrelevant.

  Option 3

    def urlgrab(url, filename=None, options=None):
    def retrygrab(url, filename=None, options=None):

    Same as 2, but instead of calling as:
      urlgrab(url, copy_local=1)
    it must be
      urlgrab(url, options={'copy_local':1})

    I don't really like this option. It's just a step on the way to
    the next one :)

  Option 4

    def urlgrab(url, filename=None, options=None):
    def retrygrab(url, filename=None, options=None, retry_options=None):

    Here, the options arg to retrygrab would get passed through
    untouched, and retry_options would be ONLY for options related to
    the retry process.

  I'm open to other ideas... If I had to pick now, I'd probably go
  with (2), but I'm still quite open.

STRUCTURE:

  Because urlgrabber already consists of at least two files
  (urlgrabber.py and keepalive.py), I'm thinking of making it a
  "package" (directory with sub-modules inside). One might argue that
  this is the only sane way to go if it's going to be a tidy library.
  This will also make life much easier if we need to do "parallel
  installs" farther down the road.

  Then again, maybe keepalive.py and progress_meter.py should be
  separate!

--
Michael Stenner                    Office Phone: 919-660-2513
Duke University, Dept. of Physics  mstenner@xxxxxxxxxxxx
Box 90305, Durham N.C. 27708-0305
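To make Option 2 from the message above a little more concrete, here is a minimal sketch of how a retry wrapper might strip out the options it cares about and pass on the rest. It is not urlgrabber's actual code. The urlgrab below is a placeholder that performs no I/O, and GrabError, RETRY_DEFAULTS, split_retry_options, and the way checkfunc is called are all assumptions made for illustration. Only the option names and default values come from the Option 1 signatures.

```python
# Hypothetical sketch of Option 2; none of this is urlgrabber's real code.

# Options owned by the retry wrapper, with the defaults shown in the
# Option 1 signature (a tuple is used here instead of a list).
RETRY_DEFAULTS = {
    "numtries": 3,
    "retrycodes": (-1, 2, 4, 5, 6, 7),
    "checkfunc": None,
}


class GrabError(Exception):
    """Stand-in error type carrying a numeric code (an assumption)."""

    def __init__(self, code, message=""):
        super().__init__(message)
        self.code = code


def urlgrab(url, filename=None, **kwargs):
    """Placeholder fetcher: performs no I/O, only reports its inputs."""
    return {"url": url, "filename": filename, "options": kwargs}


def split_retry_options(kwargs):
    """Separate retry-only options from those meant for urlgrab."""
    retry_opts = dict(RETRY_DEFAULTS)
    grab_opts = {}
    for name, value in kwargs.items():
        if name in RETRY_DEFAULTS:
            retry_opts[name] = value
        else:
            grab_opts[name] = value
    return retry_opts, grab_opts


def retrygrab(url, filename=None, **kwargs):
    """Retry urlgrab, keeping the retry options and forwarding the rest."""
    retry_opts, grab_opts = split_retry_options(kwargs)
    tries = 0
    while True:
        tries += 1
        try:
            result = urlgrab(url, filename, **grab_opts)
            if retry_opts["checkfunc"] is not None:
                # Assumed convention: checkfunc raises GrabError on failure.
                retry_opts["checkfunc"](result)
            return result
        except GrabError as err:
            out_of_tries = tries >= retry_opts["numtries"]
            # Give up on the last attempt or on a non-retryable code.
            if out_of_tries or err.code not in retry_opts["retrycodes"]:
                raise
```

A call such as retrygrab(url, copy_local=1, numtries=5) would then forward only copy_local to urlgrab. As the message notes, the cost of this style is that the legal option names appear only in the documentation, not in the function signature.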