[Python-Dev] BDFL-Delegate appointments for several PEPs


Hi Petr,

On 27/03/2019 1:50 pm, Petr Viktorin wrote:
> On Sun, Mar 24, 2019 at 4:22 PM Mark Shannon <mark at hotpy.org> wrote:
>>
>> Hi Petr,
>>
>> Regarding PEPs 576 and 580.
>> Over the new year, I did a thorough analysis of possible calling
>> conventions for use in the CPython ecosystem and came
>> up with a new PEP.
>> The draft can be found here:
>> https://github.com/markshannon/peps/blob/new-calling-convention/pep-9999.rst
>>
>> I was hoping to profile a branch with the various experimental changes
>> cherry-picked together, but don't seem to have found the time :(
>>
>> I'd like to have a testable branch before formally submitting the PEP,
>> but I thought you should be aware of the PEP.
>>
>> Cheers,
>> Mark.
> 
> Hello Mark,
> Thank you for letting me know! I wish I had known of this back in January,
> when you committed the first draft. This is unfair to the competing
> PEP, which is ready and was waiting for the new governance. We have
> lost three months that could be spent pondering the ideas in the
> pre-PEP.

I realize this is less than ideal. I had planned to publish this in 
December, but life intervened. Nothing bad, just too busy.

> Do you think you will find the time to piece things together? Is there
> anything that you already know should be changed?

I've submitted the final PEP and a minimal implementation:
https://github.com/python/peps/pull/960
https://github.com/python/cpython/compare/master...markshannon:vectorcall-minimal

> 
> Do you have any comments on [Jeroen's comparison]?

It is rather out of date, but I have two comments.
1. `_PyObject_FastCallKeywords()` is used as an example of a call in 
CPython. It is an internal implementation detail and not a common path.
2. The claim that PEP 580 allows "certain optimizations because other 
code can make assumptions" is flawed. In general, the caller cannot make 
assumptions about the callee or vice-versa. Python is a dynamic language.

> 
> The pre-PEP is simpler than PEP 580, because it solves simpler issues.

The fundamental issue being addressed is the same, and it is this:
Currently third-party C code can either be called quickly or have access 
to the callable object, not both. Both PEPs address this.

> I'll need to confirm that it won't paint us into a corner -- that
> there's a way to address all the issues in PEP 579 in the future.

PEP 579 is mainly a list of supposed flaws with the 
'builtin_function_or_method' class.
The general thrust of PEP 579 seems to be that builtin-functions and 
builtin-methods should be more flexible and extensible than they are. I 
don't agree. If you want different behaviour, then use a different 
object. Don't try to cram all this extra behaviour into a pre-existing 
object.

However, if we assume that we are talking about callables implemented in 
C, in general, then there are 3 key issues covered by PEP 579.

1. Inspection and documentation; it is hard for extensions to have 
docstrings and signatures. Worth addressing, but completely orthogonal 
to PEP 590.
2. Extensibility and performance; extensions should have the power of 
Python functions without suffering slow calls. Allowing the C code 
access to the callable object is a general solution to this problem. 
Both PEP 580 and PEP 590 do this.
3. Exposing the underlying implementation and signature of the C code, 
so that optimisers can avoid unnecessary boxing. This may be worth 
doing, but until we have an adaptive optimiser capable of exploiting 
this information, this is premature. Neither PEP 580 nor PEP 590 
explicitly allows or prevents this.

> 
> The pre-PEP claims speedups of 2% in initial experiments, with
> expected overall performance gain of 4% for the standard benchmark
> suite. That's pretty big.

That's because there is a lot of code around calls in CPython, and it 
has grown in a rather haphazard fashion. Victor's work to add the 
"FASTCALL" protocol has helped. PEP 590 seeks to formalise and extend 
that, so that it can be used more consistently and efficiently.

> As far as I can see, PEP 580 claims not much improvement in CPython,
> but rather large improvements for extensions (Mistune with Cython).

Calls to and from extension code are slow because they have to use the 
`tp_call` calling convention (or lose access to the callable object).
With a calling convention that does not have any special cases,
extensions can be as fast as builtin functions. Both PEP 580 and PEP 590 
attempt to do this, but PEP 590 is more efficient.

> 
> The pre-PEP has a complication around offsetting arguments by 1 to
> allow bound methods to forward calls cheaply. I fear that this optimizes
> for current usage with its limitations.

It's optimising for the common case, while still allowing the less common ones.
Bound methods and classes need to add one additional argument. Other 
rarer cases, like `partial`, may need to allocate memory, but can still 
add or remove any number of arguments.
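
To make the offset idea concrete, here is a toy Python model. It is only a
sketch, not the C API: a Python list plus an `offset` index stands in for the
C argument array, and the names `ToyBoundMethod`, `toy_vectorcall` and
`add_all` are hypothetical. It assumes the caller may leave one spare slot in
front of the arguments, which the bound method borrows for `self`.

```python
def add_all(buf, offset, nargs):
    """Hypothetical callee: sums the nargs arguments starting at buf[offset]."""
    return sum(buf[offset:offset + nargs])


class ToyBoundMethod:
    """Toy stand-in for a bound method that forwards calls to `func`."""

    def __init__(self, func, self_obj):
        self.func = func
        self.self_obj = self_obj

    def toy_vectorcall(self, buf, offset, nargs):
        if offset >= 1:
            # Common case: the caller left a spare slot before the arguments,
            # so `self` can be written there without allocating anything.
            saved = buf[offset - 1]
            buf[offset - 1] = self.self_obj
            try:
                return self.func(buf, offset - 1, nargs + 1)
            finally:
                # Restore the slot so the caller's buffer is unchanged.
                buf[offset - 1] = saved
        # Less common case: no spare slot, so build a new argument vector.
        new_buf = [self.self_obj] + buf[offset:offset + nargs]
        return self.func(new_buf, 0, nargs + 1)


bound = ToyBoundMethod(add_all, 10)
caller_buf = [None, 2, 3]          # one spare slot, then two real arguments
fast_result = bound.toy_vectorcall(caller_buf, 1, 2)
slow_result = bound.toy_vectorcall([2, 3], 0, 2)
```

Both paths give the same result; only the second has to allocate.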

> PEP 580's cc_parent allows bound methods to have access to the class,
> and through that, the module object where they are defined and the
> corresponding module state. To support this, vector calls would need a
> two-argument offset.

Not true. The first argument in the vector call is the callable itself. 
Through that, any callable can access its class, its module or any 
other object it wants.
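
As a rough illustration in Python rather than C (the names `ToyCallable`,
`counting_impl` and `module_state` are made up for this sketch), the
implementation function receives the callable as its first argument and can
reach whatever it needs from there:

```python
class ToyCallable:
    """Toy callable that hands itself to its implementation function."""

    def __init__(self, impl, module_state):
        self.impl = impl
        self.module_state = module_state  # hypothetical per-module state

    def toy_vectorcall(self, args):
        # The callable object itself is always the first argument.
        return self.impl(self, args)


def counting_impl(callable_obj, args):
    # From the callable, the implementation can reach its class and any
    # state attached to it, with no extra argument slot required.
    state = callable_obj.module_state
    state["calls"] += 1
    return (type(callable_obj).__name__, state["calls"], len(args))


shared_state = {"calls": 0}
counter = ToyCallable(counting_impl, shared_state)
first = counter.toy_vectorcall([1, 2])
second = counter.toy_vectorcall([])
```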

> (That seems to illustrate the main difference between the motivations
> of the two PEPs: one focuses on extensibility; the other on optimizing
> existing use cases.)

I'll reiterate that PEP 590 is more general than PEP 580 and that once 
the callable's code has access to the callable object (as both PEPs 
allow) then anything is possible. You can't get more extensible than 
that.

> 
> The pre-PEP's "any third-party class implementing the new call
> interface will not be usable as a base class" looks quite limiting.

PEP 580 has the same limitation for the same reasons. The limitation is 
necessary for correctness if an object supports calls via `__call__` and 
through another calling convention.
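
A toy Python sketch of the hazard, under the assumption that the fast entry
point is stored on the object and a subclass overrides `__call__` (the names
`FastBase`, `Sub`, `fast_entry` and `fast_call` are hypothetical):

```python
class FastBase:
    """Toy class that is callable both via __call__ and via a fast entry."""

    def __init__(self):
        # Toy stand-in for a function pointer stored on the object.
        self.fast_entry = self._impl

    def _impl(self, *args):
        return "base"

    def __call__(self, *args):
        return self._impl(*args)


class Sub(FastBase):
    # The subclass changes what calling the object means...
    def __call__(self, *args):
        return "sub"


def fast_call(obj, *args):
    # ...but a caller using the fast entry never sees the override.
    return obj.fast_entry(*args)


obj = Sub()
via_dunder_call = obj()
via_fast_entry = fast_call(obj)
```

The two call paths disagree for the subclass, which is the inconsistency that
forbidding subclassing avoids.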

> 
> 
> 
> [Jeroen's comparison]:
> https://mail.python.org/pipermail/python-dev/2018-July/154238.html
> 


Cheers,
Mark.