logo       

[FIX] SortedCollectionFix-sr v2: msg#00755

lang.smalltalk.squeak.general

Subject: [FIX] SortedCollectionFix-sr v2

"
Change Set: SortedCollectionFix-sr v2
Date: 28 September 2002
Author: Stephan Rudlof

This is an evolutionary fix/extension of SortedCollection:
- >>copyFrom:to: should use the sortBlock of the receiver instead of the
default nil sortBlock;
- >>addLast: should be forbidden as >>addFirst is already;
- >>addLastWithoutSorting has been introduced as 'private' method for
constructing a copy by the newly introduced >>copyFrom:to:.
- clients fixed, which have send >>addLast to SortedCollections before:
- IndexFile>>readFrom:messageFile: has to be tested (see method comment),
ideally by a Celeste maintainer,
- Dictionary>>keysSortedSafely has been fixed (it could break the class
invariant of SortedCollection, if its internal implementation would change)
without a relevant performance loss;
- class>>withAll:sortBlock: is a new 'instance creation' method.

Rationale:
I'm expecting an OrderedCollection, if I'm sending >>addLast:. ANSI v1.9
supports this view: it says, that >>addLast: behaves only to the protocol
<OrderedCollection> and the protocol of <SortedCollection> does *not*
conform to this protocol.
>>addLast should always work for OrderedCollections, but it would only work
in *rare* cases for SortedCollections. So I'm against making it work for
these rare cases for SortedCollections. BTW: I'm not against introducing
something like >>addAllProbablyPresorted:, but I don't do it here.

Note: there are other changeset with SortedCollection fixes (dealing with
e.g. >>addLast:), but they haven't reached Squeak3.2 yet.

History:
v2
: - no client has to call >>addLastWithoutSorting anymore, it is just a
private method used by its own class now (in >>copyFrom:to:),
- proposal of Richard added, cs comment changed and extended
v1
: first version

There is an interesting proposal of Richard A. O'Keefe, which hasn't been
implemented yet (I have changed doublequotes to double doublequotes):

---------------------------------------------------------------
I think it's time to talk about lazy data structures.

We could add elements fast to SortedCollections like this.

1. The internal representation is an array partitioned thus:
+----+------------+--------+----+
|1 |a z| x| n|
+----+------------+--------+----+

elements 1..a-1 : all nil
elements a..z : sorted according to sortBlock
elements z+1..x : added elements not sorted yet
elements x+1..n : all nil

Where I have written ""a"" and ""x"" you should understand the existing
instance variables of OrderedCollection (inherited by SortedCollection)
being used instead; the ""z"" instance variable is new.

2. add: anItem
^super addLast: anItem
addFirst: anItem
self shouldNotImplement
addLast: anItem
self shouldNotImplement
addAll: aCollection
^super addAllLast: aCollection
addAllFirst: aCollection
self shouldNotImplement
addAllLast: aCollection
self shouldNotImplement

No, I'm not being inconsistent here. I'm saying ""let someone build
a sorted collection in linear time if they can provide the elements
in sorted order."" With this kind of implementation, #add: _is_
adequate for the job.

3. A strictly private method would exist:

pvtEnsureAllSorted ""rather like current addAll:""
x = z ifTrue: [^array].
self pvtSortAddedElements. ""sorts z+1..x""
self pvtMerge. ""merges a..z with z+1..x""
z := x.
^array

Ideally pvtSortAddedElements would use a method which is linear when
the elements are sorted already; Dijkstra's smooth-sort is
just one of many algorithms which are good at that, bottom-up
natural merge is easier to understand and works very well in practice.

Using bottom-up natural merge, if you add M elements to a Sorted
Collection having N elements and then look at the result,
the adds would cost O(M) time,
pvtEnsureAllSorted would cost O(M) for pvtSortAddedElements
and O(M+N) for pvtMerge, for a total O(N+M).
Building an entire collection from N=0 would thus be O(M).

4. Other methods would call pvtEnsureAllSorted, e.g.

at: index
self assert: [index between: 1 and: x-a+1].
^self pvtEnsureAllSorted at: index-a+1

do: aBlock
self pvtEnsureAllSorted.
^super do: aBlock

5. Setting the sortblock would simply enlarge the unsorted area:

sortBlock: aBlock
sortBlock := aBlock ifNil: [nil] ifNotNil: [aBlock fixTemps].
z := a-1.


This is the clean way to do it. The class invariant is weakened from
""the elements are always sorted"" to ""the elements are always sorted
whenever anyone else looks; when noone else is looking they are in a
convenient configuration for sorting."" From the outside, it looks as
though the elements are always sorted.
---------------------------------------------------------------

This may be an improvement to the current status.
"
--
Stephan Rudlof (sr@xxxxxxxxx)
"Genius doesn't work on an assembly line basis.
You can't simply say, 'Today I will be brilliant.'"
-- Kirk, "The Ultimate Computer", stardate 4731.3

Attachment: SortedCollectionFix-sr.2.cs.gz
Description: application/gunzip

<Prev in Thread] Current Thread [Next in Thread>
Google Custom Search

News | FAQ | advertise