Interesting performance question

On 9/29/19, Anthony Flury via Python-list <python-list at python.org> wrote:
> Using python 3.6 building a tuple like this :
> my_tuple = tuple([x*x for x in range(1,1000)])

The list comprehension is implemented internally as a function that
builds and returns the list. This function creates an empty list and
loops over the range iterator to evaluate the expression and append
the result. Appending to a list uses an over-allocation strategy to
efficiently grow the list. Finally, the list is passed to the tuple
constructor, which can efficiently and quickly create a tuple from the
list because it's simply copying a PyObject * array in C.
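
As a rough illustration of the steps above, the comprehension behaves
much like the following Python-level function. This is only a sketch:
the name _listcomp is made up for this example, and the real function
is generated by the compiler and runs as bytecode, with the tuple copy
done in C.

```python
def _listcomp(iterator):
    # Hypothetical stand-in for the hidden function that the compiler
    # generates for the list comprehension.
    result = []                 # start with an empty list
    for x in iterator:          # loop over the range iterator
        result.append(x * x)    # append relies on list over-allocation
    return result

# The finished list is handed to the tuple constructor, which can size
# the tuple exactly because the length of a list is known up front.
my_tuple = tuple(_listcomp(iter(range(1, 1000))))
```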

>      my_tuple = tuple(x*x for x in range(1,1000))

In this case a generator is created instead of a list. This is passed
to the tuple constructor, which iterates the generator. A generator has
no __length_hint__(), so the constructor starts with a tuple of length
10. The tuple grows with an over-allocation rule as items arrive, and
at the end it's resized to the actual length.
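
The difference in available size information can be seen from Python
with operator.length_hint(). The default of 10 passed here is chosen
only to mirror the starting size mentioned above; the variable names
are made up for this example.

```python
import operator

squares_list = [x * x for x in range(1, 1000)]
squares_gen = (x * x for x in range(1, 1000))

# A list knows its length, so the hint is exact.
list_hint = operator.length_hint(squares_list, 10)

# A generator has neither __len__ nor __length_hint__, so the supplied
# default comes back unchanged.
gen_hint = operator.length_hint(squares_gen, 10)

print(list_hint, gen_hint)
```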

I expect the generator-based expression to be a bit more expensive.
Iterating a generator requires resuming evaluation of the code object
up to a yield, which suspends evaluation. For the list comprehension,
the loop that builds the list executes continuously, without an
interruption to yield a value in each pass.
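
One way to check this expectation is a quick timeit comparison. This
is only a sketch: the function names and the repeat count are
arbitrary choices for this example, and the measured times will vary
with the machine and the Python version.

```python
import timeit

def via_list():
    # Build a list first, then copy it into a tuple.
    return tuple([x * x for x in range(1, 1000)])

def via_generator():
    # Let the tuple constructor pull items from a generator one by one.
    return tuple(x * x for x in range(1, 1000))

# Total time in seconds for 1000 calls of each variant.
list_time = timeit.timeit(via_list, number=1000)
gen_time = timeit.timeit(via_generator, number=1000)

print("list comprehension: %.4f s" % list_time)
print("generator:          %.4f s" % gen_time)
```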