|
Re: [FIX] SortedCollectionFix-sr: msg#00735lang.smalltalk.squeak.general
Andreas, I have made some tests (>>keysSortedSafelyAR is according your suggestion): | t1 t2 dicts | Smalltalk garbageCollect. dicts _ Dictionary allInstances. Smalltalk garbageCollect. t1 _ [10 timesRepeat: [dicts do: [:dict | dict keysSortedSafelyAR]]] timeToRun. Smalltalk garbageCollect. t2 _ [10 timesRepeat: [dicts do: [:dict | dict keysSortedSafely]]] timeToRun. {t1.t2} #(25972 22073) #(25942 22996) #(25575 22139) | t1 t2 bigDict | Smalltalk garbageCollect. bigDict _ (Dictionary allInstances asSortedCollection: [:d1 :d2 | d1 size > d2 size]) first. Smalltalk garbageCollect. t1 _ [10 timesRepeat: [bigDict keysSortedSafelyAR]] timeToRun. Smalltalk garbageCollect. t2 _ [10 timesRepeat: [bigDict keysSortedSafely]] timeToRun. {t1.t2. bigDict size} #(1716 1297 952) #(1718 1297 952) #(1742 1298 952) One change was necessary: Andreas Raab wrote: > Hi, > > >>I see, I've opened up a can of worms... > > > Not really. There are two independent issues here which make the > discussion much more complicated than it has to be. One is whether > #addFirst:/#addLast: should be allowed or not. I won't comment on it > since there are good arguments on either side but I think we all agree > that _if_ implemented they must not break the class invariant. > > The other issue is that of optimization. Here I would argue that first > of all, "you" (which means not really you personally but anyone who > thinks that there needs to be an optimization for the cases you are > quoting) have to show me that reverting those "optimizations" really has > an impact. I wouldn't even bother thinking about optimizing these cases > before there is hard evidence for the problem. > > So, for example, it would be trivial to rewrite > > Dictionary>>keysSortedSafely > ^SortedCollection > withAll: self keys > sortBlock:[...] > > SortedCollection class>>withAll: aCollection sortBlock: aBlock > "Answer an instance of me such that its elements > are sorted according to the criterion specified in aBlock." > ^(super new: aCollection size) > sortBlock: aBlock; > addAll: aCollection ; yourself Greetings, Stephan > > which is much more readable anyways and I would doubt that there's > really any speed impact. Same thing for Celeste - it is possible to > break the invariant in the creation method and whoever wrote that method > might as well not have used a SortedCollection to begin with. After all, > what's the point in using a class with a specific invariant if the first > thing you do is to break it?! Show me that it's slow and I'll show you > how to make it fast ;-) > > Cheers, > - Andreas > > >>-----Original Message----- >>From: squeak-dev-admin@xxxxxxxxxxxxxxxxxxxxxxxxxx >>[mailto:squeak-dev-admin@xxxxxxxxxxxxxxxxxxxxxxxxxx] On >>Behalf Of Stephan Rudlof >>Sent: Friday, September 27, 2002 2:15 PM >>To: squeak-dev@xxxxxxxxxxxxxxxxxxxxxxxxxx >>Subject: Re: [FIX] SortedCollectionFix-sr >> >> >>Dear Andreas and others, >> >>Andreas Raab wrote: >> >>>Hi Stephan, >>> >>>While I agree mostly with your reasoning (not entirely >> >>though because I >> >>>think Richard has a number of equally good arguments wrt. >> >>to consistency >> >>>of collection operations) there's one issue I strongly >> >>disagree with: >> >>> >>>>So there remains the performance thing: using SortedCollection>>add: >>>>multiple times means, that each object has to be sorted in >>>>immediately. It is probably more performant to have a private >>>>add method, which just adds an element, while leaving the >>>>SortedCollection in an inconsistent state, and reSort it afterwards, >>>>when all elements have been added. That has been my >>>>idea of introducing the 'bypass' method for clients which >>>>really know about the internals. >>> >>What I have written here is not entirely correct: it has been an >>interpretation of the *current* usage and intent of using >>addLast: for >>SortedCollections. My *idea* was to let it exist further. >>Note: I do *not* use >>addLast: (or >>addLastWithoutSorting >>in my fix) at >>the client side for calling SortedCollections. >> >> >>> >>>You don't have to use #add; you can use #addAll: which will >> >>choose the >> >>>right way to add the elements dynamically. Your proposal is actually >>>_really_ bad for optimization since it assumes "outside >> >>knowledge" about >> >>>the way SortedCollection is implemented and renders any attempt to >>>really optimize _it_ (instead of lots of clients guessing what a >>>reasonably good optimization would be) pretty much useless. >> >>Spreading >> >>>out "meager client side optimizations" while breaking the >> >>polymorphic >> >>>protocol is never a good idea - if you want to optimize >> >>then choose a >> >>>central, high-level, intent revealing place (such as, in fact, >>>SortedCollection>>addAll:) and don't ever have the client >>>second-guessing the structure and behavior of the object in >> >>question. >><...> >> >>I agree. >> >>Now I know better, why I have had some stomache going this way. >> >>And why I wrote: >> >>>Another - cleaner - variant avoiding these internal methods: >>>collecting the >>>to be added elements in another - not sorted! - Collection >> >>and calling >> >>>SortedCollection>>addAll: for them. >> >>But the world is not perfect: there *are* clients trying to >>optimize from >>outside (not written by me). I just wanted to fix the problem without >>breaking them and don't wanted to think deeper about them. >>But now I think I have to... >> >> >>Which are they? >> >>The first one is IndexFile>>readFrom: aStream messageFile: >>messageFile, its >>comment says: >> "Initialize myself from the given text stream. It is >>assumed that the >> .index file was written in order of ascending message timestamps, >> although this method is only less efficient, not incorrect, >>if this is not >> the case." >> >>Its code uses the assumption, that if starting with an empty >>SortedCollection >>addLast: is called repeatedly for a >>sequence of already >>sorted elements, this would result into a correct >>SortedCollection without >>calling >>reSort! >> >>It uses knowledge of the internal representation of >>SortedCollection and >>omits the class invariant ensuring >>reSort in some cases: dangerous! >> >>This should be fixed *independently* of my fix suggestions. >>(Note: I don't >>use Celeste: this may be a reason for another person to fix it.) >> >>But since performance may be *very* important in this client case: >>What about introducing a very special >> SortedCollection>>addAllProbablyPreSorted: >>(or >>addAllCheckForPreSortedFirst:) to give it a hint? >> >>Otherwise I hear other people crying... >> >> >>The other one is >> >>Dictionary>>keysSortedSafely >> "Answer a SortedCollection containing the receiver's keys." >> | sortedKeys | >> sortedKeys _ SortedCollection new: self size. >> sortedKeys sortBlock: >> [:x :y | "Should really be use <obj, string, >>num> compareSafely..." >> ((x isString and: [y isString]) >> or: [x isNumber and: [y isNumber]]) >> ifTrue: [x < y] >> ifFalse: [x class == y class >> ifTrue: [x printString < y printString] >> ifFalse: [x class name < y >>class name]]]. >> self keysDo: [:each | sortedKeys addLast: each]. >> ^ sortedKeys reSort >> >>This seems to be safe, since >>reSort is called in every >>case, but using >>SortedCollection>>addAll: seems to be better here. >> >>But let's take a look at >>SortedCollection>> >>addAll: aCollection >> aCollection size > (self size // 3) >> ifTrue: >> [aCollection do: [:each | super addLast: each]. >> self reSort] >> ifFalse: [aCollection do: [:each | self add: each]]. >> ^ aCollection >> >>Now I see a problem: the implementor (sma) of >>Dictionary>>keysSortedSafely >>possibly has seen, that in his case each element would be >>added one by one, >>instead of adding all unsorted followed by a reSort. And then >>he has tried >>to optimize it... >> >>How to deal with this case? >> >> >>I see, I've opened up a can of worms... >>But my conviction stays, that my suggestion is - evolutionary >>- better than >>the current status. And an evolutionary change in the right >>direction - >>though admittedly not ideal - is better than no change. >> >> >>Greetings, >> >>Stephan >> >> > > > > -- Stephan Rudlof (sr@xxxxxxxxx) "Genius doesn't work on an assembly line basis. You can't simply say, 'Today I will be brilliant.'" -- Kirk, "The Ultimate Computer", stardate 4731.3
|
|
| <Prev in Thread] | Current Thread | [Next in Thread> |
|---|---|---|
| Previous by Date: | Re: Yahoo archive truncated, Martin McClure |
|---|---|
| Next by Date: | RE: [FIX] SortedCollectionFix-sr, Ian Piumarta |
| Previous by Thread: | RE: [FIX] SortedCollectionFix-sr, Andreas Raab |
| Next by Thread: | Re: [FIX] SortedCollectionFix-sr, Stephan Rudlof |
| Indexes: | [Date] [Thread] [Top] [All Lists] |
| News | FAQ | advertise |