logo       

filterMap, for comprehensions and collection-oriented functions: msg#00528

lang.scala

Subject: filterMap, for comprehensions and collection-oriented functions

I've had repeated occasion lately to filter lists so they match a
pattern, then act on the results.
Given:

trait Item {
type T
val value: T
}

class StringItem(value: String) extends Item {
type T = String
def lower = value.toLowerCase
}
class IntItem(value: Int) extends Item {
type T = Int
}

val items = List[Item](...)

With the latest scalac I can filter down to the StringItems very
cleanly:

for (val si @ StringItem(_) <- items) { println(si) }

What I often find is that I want to filter down to a subset, and then
map that through a function into another type. It's very convenient to
define the following:

implicit def seq2FilterMap[T](s: Seq[T]) = new FilterMap(s)
class FilterMap[T](s: Seq[T]) {
def filterMap[B](f: PartialFunction[T,B]) =
s filter (t => isDefinedAt t) map f
}

I can now write:

lst filterMap {
case StringItem(v) if v.charAt(1)=='o' => v
} foreach println

The equivalent for comprehension looks like:

(for (val si @ StringItem(v) <- lst; v.charAt(1)=='o') yield v) foreach
println

Without a comprehension and without filterMap, we'd write:

lst filter {
case StringItem(v) if v.charAt(1)=='o' => true
case _ => false
} map {
case StringItem(v) => v
} foreach println

This seems a bit awkward, and the addition of the filterMap method via
implicit coercion seems to clean it up nicely. Patterns used in for
comprehensions don't allow guards, but the boolean filtering capability
is fine.

FilterMap is sufficiently useful that it might merit inclusion in the
basic sequence classes. Adding the method via implicits carries a small
runtime cost (but that might be optimized out at some point in the
future).

A for comprehension currently supports patterns and filters, where the
expected type of a filter is boolean. What happens if the type of an
expression is something other than Boolean? What happens if I write:

for (val si @ StringItem(_) <- lst; sortSequence(si))
println(si)

It yields a syntax error, as si is not a sequence and cannot be sorted.
Consider this pseudocode annotated with types in '':

type G[_]=...generating type implementing .foreach, .map, etc.
type XGen=G[T]
for (val x'T' <- seq'XGen'; f'Function1[XGen, XGen]'(x)) println(x)

Each bound variable in a for comprehension pattern comes from a source
type of some kind that implements one or more of .foreach, .map,
.filter, and .flatMap.
If expression f does not evaluate to a Boolean, we might check to see if
it is a Function1, that accepts and returns the generating type for the
variable used as the argument. If we do this the sortSequence example I
gave above can work. This idea generalizes into checking the type of f
against PartialFunction[XGen,YGen], combining the notion of filtering,
mapping, and collection-oriented operations into a single abstraction.

We rewrite:

for (val si @ StringItem(v) <- lst;
v.charAt(1)=='o'; siSort(si))
println(si)

Into:

for (val si <- siSort(for (val si' @ StringItem(v) <- lst;
v.charAt(1)=='o') yield si'))
println(si)

It becomes harder to understand when our 'f' function maps from one type
into another:

def lookupIndex(si: StringItem): Index = ...
for (val si @ StringItem(v) <- lst;
v.charAt(1)=='o'; lookupIndex(si))
println(si)

This would rewrite to:

for (val si <- lookupIndex(for (val si' @ StringItem(v);
v.charAt(1)=='o') yield si'))
println(si)

Which has a counterintuitive change of the type of si. This could be
solved by introducing a val expression:

for (val si @ StringItem(v) <- lst;
v.charAt(1)=='o';
val xi <- lookupIndex(si))
println(xi)

But that is simply too close to existing syntax, and would be very hard
to distinguish for a reader because you'd have to know the expected type
of the lookupIndex function to understand the behavior. One possible
solution is the introduction of syntax for the generator of a variable:

for (val si @ StringItem(v) <- lst;
v.charAt(1)=='o';
val xi <- lookupIndex(->si))
println(xi)

Where ->var refers to the generator of var within the comprehension.

Sorry for the length of this, but I wanted to capture the thoughts.
Hopefully this problem (the introduction of collection-oriented
functions into for comprehensions generally) will interest Martin and
the Scala team, and the ideas outlined here might evolve into something
useful and part of the Scala platform.

RJ




<Prev in Thread] Current Thread [Next in Thread>
Google Custom Search

News | FAQ | advertise