|
Place for efficiency improvement [was: Re: [BUG]Collection>>removeAll:]: msg#01344lang.smalltalk.squeak.general
Richard, Wow, this is dawn interesting! Independent of the discussion regarding the semantics of the [remove|add]All: family you have found an important place for possible efficiency improvements. Richard A. O'Keefe wrote: > NEW RESULTS! > > There are people in this mailing list who are of the opinion > that the quirk in the implementation of #removeAll: is not a bug > but an obvious precondition that programmers should always have been > aware of, so it doesn't need fixing. > > I am deeply indebted to mwai@xxxxxxxxx for telling me that > > |coll| > coll := OrderedCollection with: 1 with: 2. > coll removeAll: coll > coll inspect --> OrderedCollection() > > _works_ in IBM Smalltalk, despite IBM Smalltalk having the same > implementation of #removeAll:. > > This raised the extremely interesting possibility that it might have > worked in Squeak once upon a time. I decided to look for evidence. > If IBM Smalltalk got it right despite having the same #removeAll:, > the difference must be in #remove:ifAbsent:. > > Here is how #remove:ifAbsent: works in Squeak. > > remove: oldObject ifAbsent: absentBlock > |index| > index := firstIndex. > [index <= lastIndex] whileTrue: [ > oldObject = (array at: index) > ifTrue: [self removeIndex: index. ^oldObject] > ifFalse: [index := index + 1]]. > ^absentBlock value > > removeIndex: removedIndex > array > replaceFrom: removedIndex > to: lastIndex - 1 > with: array > startingAt: removedIndex + 1. > array at: lastIndex put: nil. > lastIndex := lastIndex - 1. > > Now removeIndex: is NOT original Holy Writ, it was changed by 'ar' > in 2002.05.22. Presumably it was changed in order to exploit the > #replaceFrom:to:with:startingAt: primitive (105). I have no idea > what used to be there before. > > While the comment on Array>>replaceFrom:to:with:startingAt: does > not say so, I presume that the primitive does the same as > SequenceableCollection>>replaceFrom:to:with:startingAt:, > only faster. > > That's a pity, because that means that we can only move array elements > DOWN to fill a hole, not UP, so that the first element of an ordered > collection is the most expensive one to remove. > > A consequence of this is that > oc removeAll: (oc copy) > takes quadratic time, when it _could_ take linear time. > > Suppose we decide to make removing the _first_ element cheap by > changing removeIndex: to > > removeIndex: removedIndex > removedIndex = firstIndex > ifTrue: [ > array at: firstIndex put: nil. > firstIndex := firstIndex + 1. > ^self]. > array > replaceFrom: removedIndex > to: lastIndex - 1 > with: array > startingAt: removedIndex + 1. > array at: lastIndex put: nil. > lastIndex := lastIndex - 1. > > NOW anOrderedCollection removeAll: anOrderedCollection works. > Without any change to #removeAll: at all! > > Now, I don't _know_ whether removing the first element of an ordered > collection used to be cheap, but if ever it was, then nobody would have > thought to document the hole in the implementation of > OrderedCollection>>removeAll: because there wouldn't have _been_ any > hole, aliasing or no aliasing. > > In the absence of <primitive: 105>, how might removeIndex: have been > written? Obviously, to do the fewest moves: > > removeIndex: removedIndex > (lastIndex + firstIndex) // 2 < removedIndex > ifTrue: [ > removedIndex + 1 to: lastIndex do: [:index | > array at: index - 1 put: array at: index]. > array at: lastIndex put: nil. > lastIndex := lastIndex - 1] > ifFalse: [ > removedIndex - 1 to: 1 by: -1 do: [:index | > array at: index + 1 put: array at: index]. > array at: firstIndex put: nil. > firstIndex := firstIndex + 1]. What about the ifFalse: case realized by a new Array prim (the ifTrue: case corresponds to the current implementation); e.g.: Array>>removeIndex: removedIndex (lastIndex + firstIndex) // 2 < removedIndex ifTrue: [array replaceFrom: removedIndex to: lastIndex - 1 with: array startingAt: removedIndex+1. array at: lastIndex put: nil. lastIndex _ lastIndex - 1] ifFalse: [array replaceBackwardsFrom: removedIndex to: firstIndex + 1 with: array startingAt: removedIndex-1. array at: firstIndex put: nil. firstIndex _ firstIndex + 1] (note: code not tested); the new prim would be needed for the new method Array>>replaceBackwardsFrom:to:startingAt: . I think this would be a valuable improvement. Comments? Greetings, Stephan > > And there would not have been any hole in the implementation of > OrderedCollection>>removeAll:. > > So the "don't change it" brigade may be fighting to preserve a bug > which was introduced comparatively recently. > > I'm checking this stuff in Squeak 2.8. I notice that > OrderedCollection>>removeallSuchThat: used > #removeIndex: to remove elements, thus taking a method that _could_ > very easily have been linear time and making it take quadratic time. > That was added or changed by 'sma' in 2002.05.12. It looks as though > all the methods in OrderedCollection that call #removeIndex: went in > at about the same time. > > At any rate, it is now clear that Squeak cannot be precisely compatible > with *both* VW (where coll removeAll: coll is said not to work) > *and* IBM Smalltalk (where it is said to work). So we might as well be > compatible with the one that works. > > > -- Stephan Rudlof (sr@xxxxxxxxx) "Genius doesn't work on an assembly line basis. You can't simply say, 'Today I will be brilliant.'" -- Kirk, "The Ultimate Computer", stardate 4731.3
|
|
| <Prev in Thread] | Current Thread | [Next in Thread> |
|---|---|---|
| Previous by Date: | Re: [BUG]Collection>>removeAll:, goran . hultgren |
|---|---|
| Next by Date: | Re: [BUG]Collection>>removeAll:, goran . hultgren |
| Previous by Thread: | Re: [BUG]Collection>>removeAll:, Jon Hylands |
| Next by Thread: | Re: [BUG]Collection>>removeAll:, Martin Wirblat |
| Indexes: | [Date] [Thread] [Top] [All Lists] |
| News | FAQ | advertise |