
Do not promote `None` as the first argument to `filter` in documentation.

On Tue, 06 Mar 2018 11:52:22 +0300, Kirill Balunov wrote:

> I propose to delete all references in the `filter` documentation that
> the first argument can be `None`, with possible deprecation of `None`
> as the first argument - FutureWarning in Python 3.8+ and deleting
> this option in Python 4.

Even if we agreed that it is unfortunate that filter accepts None as an 
argument, since it does (and has done since Python 1.0) there is nothing 
to be gained by deprecating and removing it.

Deprecating and removing it will break code that currently works, for no 
good reason; removing the documentation is unacceptable, as that makes it 
too difficult for people to find out what `filter(None, values)` does.

> Instead, it is better to show an example with using
> `filter(bool, iterable)` which is absolutely
> equivalent, more readable, but a little bit slower.

So long as `filter(None, ...)` is still documented, I don't mind what 
example is given.

But the idiom `filter(None, ...)` is an old, common idiom, very familiar 
to many people who have a background in functional programming.

It is unfortunate that filter takes the arguments in the order it does. 
Perhaps it would have been better to write it like this:

def filter(iterable, predicate=None): ...

Then `filter(values, None)` would be a standard Python idiom, explicitly 
saying to use the default predicate function. That is exactly what 
`filter(None, values)` says today, except that the order is (sadly) reversed.
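
To make the hypothetical signature concrete, here is a minimal sketch. The 
name `filter_reordered` is invented for this example so that it does not 
shadow the builtin; it only swaps the arguments and delegates to the real 
`filter`.

```python
def filter_reordered(iterable, predicate=None):
    """Hypothetical argument order: iterable first, predicate optional."""
    # Delegate to the builtin; predicate=None keeps the truthy items.
    return filter(predicate, iterable)

values = [0, 1, "", "spam", None, [], [2]]

# Omitting the predicate and passing None explicitly mean the same thing.
print(list(filter_reordered(values)))        # [1, 'spam', [2]]
print(list(filter_reordered(values, None)))  # [1, 'spam', [2]]
print(list(filter(None, values)))            # [1, 'spam', [2]]
```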

> Currently documentation for `None` case uses `identity function is
> assumed`, what is this `identity` and how it is consistent with
> truthfulness?

The identity function is a mathematical term for a function that returns 
its argument unchanged:

def identity(x):
    return x

So `filter(func, values)` filters according to func(x); using None 
instead filters according to x alone, without the expense of calling a 
do-nothing function:

# slower because it has to call the lambda function each time
filter(lambda x: x, values)

# fast because filter takes an optimized path
filter(None, values)
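
As a quick check that the spellings agree, here is a small sketch with 
sample data of my own choosing. It only shows that the results are the 
same; it makes no claim about how large the speed difference is.

```python
values = [0, 1, 2, "", "a", [], [0], None, 0.0, 3.5]

by_none = list(filter(None, values))
by_bool = list(filter(bool, values))
by_lambda = list(filter(lambda x: x, values))

# The surviving items are the original objects, not True/False.
print(by_none)  # [1, 2, 'a', [0], 3.5]
assert by_none == by_bool == by_lambda
```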

Since filter filters according to the truthy or falsey value of x, it 
isn't actually necessary to call bool(x). In Python, all values are 
automatically considered either truthy or falsey. The reason to call 
bool() is to ensure you have a canonical True/False value, and there's no 
need for that here. So the identity function should be preferred to bool, 
for those who understand two things:

- the identity function (using None as the predicate function) 
  returns x unchanged;

- and that x, like all values, automatically has a truth value in a
  boolean context (which includes filter).
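
To illustrate the second point, here is a small sketch. The `Basket` class 
is hypothetical, made up for this example; it defines only `__len__`, and 
Python uses that to decide its truth value without anybody calling bool().

```python
class Basket:
    """Hypothetical container: truthy only when it holds something."""

    def __init__(self, *items):
        self.items = list(items)

    def __len__(self):
        # With no __bool__ defined, truth testing falls back to __len__.
        return len(self.items)

empty, full = Basket(), Basket("egg")

# filter(None, ...) keeps exactly the objects that are truthy.
kept = list(filter(None, [empty, full, 0, "x"]))
print(kept == [full, "x"])  # True
```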

> In addition, this change makes the perception of `map` and `filter` more
> consistent, with the rule that first argument must be `callable`.

I consider that a flaw in map. map should also accept None as the 
identity function, so that map(None, iterable) returns the values of 
iterable unchanged.

def map(function=None, *iterables):
    if len(iterables) == 0:
        raise TypeError("map() must have at least two arguments.")
    if function is None:
        # None means the identity function: pass the items through.
        if len(iterables) > 1:
            return zip(*iterables)
        else:
            assert len(iterables) == 1
            return iter(iterables[0])
    elif len(iterables) > 1:
        return (function(*args) for args in zip(*iterables))
    else:
        assert len(iterables) == 1
        return (function(arg) for arg in iterables[0])
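
For anyone who wants to try the idea without shadowing the builtin, here 
is a shorter sketch of the same behaviour. The name `map_or_identity` is 
hypothetical, and it delegates to the real `map` whenever a function is 
given. (As far as I know, the Python 3 builtin rejects None with a 
TypeError once you start iterating.)

```python
def map_or_identity(function, *iterables):
    """Hypothetical wrapper: treat function=None as the identity function."""
    if not iterables:
        raise TypeError("map_or_identity() needs at least one iterable")
    if function is None:
        # One iterable: pass items through. Several: pair them up like zip.
        return iter(iterables[0]) if len(iterables) == 1 else zip(*iterables)
    return map(function, *iterables)

print(list(map_or_identity(None, "abc")))           # ['a', 'b', 'c']
print(list(map_or_identity(None, [1, 2], [3, 4])))  # [(1, 3), (2, 4)]
print(list(map_or_identity(abs, [-1, 2, -3])))      # [1, 2, 3]
```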