bpo-47067: Add vectorcall for gaobject by penguin-wwy · Pull Request #31996 · python/cpython

penguin-wwy · 2022-03-19T15:32:42Z

https://bugs.python.org/issue47067

Objects/genericaliasobject.c

corona10 · 2022-03-20T03:50:08Z

FYI, we decided to add a caching mechanism instead of applying vectorcall. cc @sweeneyde
See #19677 (comment).

corona10 · 2022-03-20T16:38:47Z

@sweeneyde
To follow @gvanrossum's decision, I will assign the PR to you :)
I am +1 on improving CPython performance where possible, so I share Guido's opinion.

ref: https://bugs.python.org/issue47067#msg415616

Objects/genericaliasobject.c

sweeneyde · 2022-03-20T19:24:01Z

BTW, I replicated some speedup with pyperf commands like

./python -m pyperf timeit -s "D = dict[str, int]" "D(a=1, b=2)" --duplicate 10 -o "dict[str, int](a=1, b=2).json"

Some results:

dict[str, int](a=1, b=2)      # 359 ns +- 6 ns -> 280 ns +- 3 ns: 1.29x faster
list[int](())                 # 259 ns +- 3 ns -> 246 ns +- 5 ns: 1.05x faster
MappingProxyType[str, int](d) # 273 ns +- 13 ns -> 261 ns +- 4 ns: 1.05x faster

class A:
    def __init__(self, a, b):
        pass
G = GenericAlias(A, int)
G(1, 2)                       # 198 ns +- 6 ns -> 190 ns +- 4 ns: 1.04x faster

sweeneyde

Some small code style / PEP 7 nits

Objects/genericaliasobject.c

bedevere-bot · 2022-03-20T19:32:41Z

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

gvanrossum · 2022-03-20T19:54:25Z

> BTW, I replicated some speedup with pyperf commands like
>
> ./python -m pyperf timeit -s "D = dict[str, int]" "D(a=1, b=2)" --duplicate 10 -o "dict[str, int](a=1, b=2).json"
>
> Some results:
>
> dict[str, int](a=1, b=2)      # 359 ns +- 6 ns -> 280 ns +- 3 ns: 1.29x faster
> list[int](())                 # 259 ns +- 3 ns -> 246 ns +- 5 ns: 1.05x faster
> MappingProxyType[str, int](d) # 273 ns +- 13 ns -> 261 ns +- 4 ns: 1.05x faster
>
> class A:
>     def __init__(self, a, b):
>         pass
> G = GenericAlias(A, int)
> G(1, 2)                       # 198 ns +- 6 ns -> 190 ns +- 4 ns: 1.04x faster

So what does this speed up -- the dict[str, int] part, or the ...(a=1, b=2) part? (Also, the latter is probably not the fastest way to create a dict.) The 5% faster results are more in line with my expectations.

sweeneyde · 2022-03-20T19:56:33Z

> ...(a=1, b=2)

The D(a=1, b=2) part is what I benchmarked, for all of those types.
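
To make the distinction concrete, here is a minimal sketch using the standard-library timeit module instead of pyperf (an assumption made for portability; the numbers in this thread came from pyperf). The alias is built once in the setup string, so the first measurement times only the call, while the second repeats the subscription on every loop. The helper name time_ns_per_loop is hypothetical.

```python
import timeit

def time_ns_per_loop(stmt, setup="pass", number=100_000, repeat=3):
    """Return the best-of-`repeat` time per loop in nanoseconds."""
    best = min(timeit.repeat(stmt, setup=setup, number=number, repeat=repeat))
    return best / number * 1e9

# Only the call D(a=1, b=2) is timed; dict[str, int] is created once in setup.
call_only = time_ns_per_loop("D(a=1, b=2)", setup="D = dict[str, int]")

# Here the subscription dict[str, int] is repeated on every loop as well.
subscript_and_call = time_ns_per_loop("dict[str, int](a=1, b=2)")

print(f"call only:        {call_only:.0f} ns")
print(f"subscript + call: {subscript_and_call:.0f} ns")
```

Absolute numbers from timeit will be noisier than pyperf's and should not be compared directly with the results quoted here.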

Makefile.pre.in

gvanrossum · 2022-03-21T04:37:52Z

If we’re now changing Makefile this PR seems to have strayed from the topic?

sweeneyde · 2022-03-21T04:43:10Z

> If we’re now changing Makefile this PR seems to have strayed from the topic?

I agree, we should probably keep the vectorcall changes for now, then explore global strings in another PR.

kumaraditya303 · 2022-03-21T04:51:04Z

Just FYI, the order of execution of deep freeze and the global string generator matters, so don't change it in this PR.

penguin-wwy · 2022-03-21T05:09:26Z

I have made the requested changes; please review again :)

bedevere-bot · 2022-03-21T05:09:29Z

Thanks for making the requested changes!

@sweeneyde: please review the changes made to this pull request.

sweeneyde · 2022-03-21T05:46:46Z

Thanks for the changes, this is looking close.

I am still not convinced that adding ga_make_tp_call is worth it:

- It eliminates a few branches/C-calls, but doesn't eliminate an allocation or much data movement like ga_vectorcall does.
- In most other places in the code base, there is at most one vectorcall function per type, which is either present or NULL, so there's an argument to maintain consistency.
- This is not a particularly hot function: list[int], dict[str, int], set[int], etc., all use the ga_vectorcall codepath, and GenericAlias(MyPythonClass, some_type)() will generally have lots of other overhead already.
- It requires using the private function _PyObject_MakeTpCall and grabbing the ThreadState, making the implementation less obvious and more tightly coupled.
- Three code paths are more complex than two.
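
For readers less familiar with what calling a generic alias does, whichever C code path is taken, the observable behaviour can be modelled roughly in Python. This is a sketch based on my understanding of the C implementation, not the implementation itself: the function name call_alias is hypothetical, and the set of swallowed exceptions is an assumption about set_orig_class.

```python
from types import GenericAlias

def call_alias(alias, *args, **kwargs):
    """Rough Python model of calling a generic alias (not the C code)."""
    # Forward the call to the underlying class.
    obj = alias.__origin__(*args, **kwargs)
    try:
        # Record which alias created the instance.
        obj.__orig_class__ = alias
    except (AttributeError, TypeError):
        # Instances that reject new attributes (e.g. dict, list) skip this step.
        pass
    return obj

class A:
    def __init__(self, a, b):
        pass

G = GenericAlias(A, int)
modelled = call_alias(G, 1, 2)
real = G(1, 2)
print(modelled.__orig_class__ == real.__orig_class__ == G)

d = call_alias(dict[str, int], a=1, b=2)
print(d, hasattr(d, "__orig_class__"))
```

The vectorcall path discussed here changes how the arguments reach the origin class, not this visible behaviour.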

penguin-wwy · 2022-03-21T06:19:40Z

In my mind, ga_make_tp_call has a meaningful effect on the tp_call path, because the arguments still only need to be packed once.

> It requires using the private function _PyObject_MakeTpCall and grabbing the ThreadState, making the implementation less obvious and more tightly coupled.

For the purpose of simplicity and decoupling, it is appropriate to delete it :)

sweeneyde

Very close! One last thing.

Misc/NEWS.d/next/Library/2022-03-20-17-15-56.bpo-47067.XXLnje.rst

sweeneyde

Thanks!

Misc/NEWS.d/next/Library/2022-03-20-17-15-56.bpo-47067.XXLnje.rst

sweeneyde · 2022-03-21T20:33:21Z

The string change seems to have really improved the microbenchmarks. Before/after this PR:

./python -m pyperf timeit -s "D = dict[str, int]" "D(a=1, b=2)" --duplicate 10 -o "dictbench.json" --rigorous
Mean +- std dev: [./dictbench.json] 366 ns +- 6 ns -> [../ga_vectorcall/dictbench.json] 202 ns +- 2 ns: 1.81x faster

./python -m pyperf timeit -s "L = list[int]" "L(())" --duplicate 10 -o "listbench.json" --rigorous
Mean +- std dev: [./listbench.json] 258 ns +- 5 ns -> [../ga_vectorcall/listbench.json] 168 ns +- 4 ns: 1.54x faster

./python -m pyperf timeit -s "from types import GenericAlias; PyClass = GenericAlias(type('A', (), {}), int)" "PyClass()" --duplicate 10 -o "pyclassbench.json" --rigorous
Mean +- std dev: [./pyclassbench.json] 153 ns +- 3 ns -> [../ga_vectorcall/pyclassbench.json] 75.9 ns +- 0.9 ns: 2.01x faster
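
As a quick sanity check, the reported speedups can be recomputed from the rounded means printed above. This is only a sketch: pyperf derives its ratio from the unrounded means, so the last digit may differ slightly from a ratio computed this way. The dictionary keys are shorthand labels, not pyperf benchmark names.

```python
# (before_ns, after_ns, reported_speedup) copied from the pyperf output above.
results = {
    "dict[str, int](a=1, b=2)": (366.0, 202.0, 1.81),
    "list[int](())": (258.0, 168.0, 1.54),
    "GenericAlias(A, int)()": (153.0, 75.9, 2.01),
}

# Speedup is the mean time before the change divided by the mean time after.
ratios = {name: before / after for name, (before, after, _) in results.items()}

for name, (before, after, reported) in results.items():
    print(f"{name}: {ratios[name]:.2f}x (reported {reported}x)")
```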

I also checked for refleaks just for fun, and ./python -m test test_types test_typing test_genericalias -R3:3 passed.

Thanks for the contribution!

gvanrossum · 2022-03-21T20:53:54Z

Great results! In the end, how was the "make regen-all" issue resolved?

sweeneyde · 2022-03-21T21:00:16Z

> Great results! In the end, how was the "make regen-all" issue resolved?

I don't think it was solved, but ignored for the sake of this particular PR. I added a comment to https://bugs.python.org/issue46712 inquiring about it.

Add vectorcall for gaobject

a1de38b

bedevere-bot added the awaiting review label Mar 19, 2022

the-knights-who-say-ni added the CLA signed label Mar 19, 2022

sweeneyde reviewed Mar 20, 2022

Objects/genericaliasobject.c

corona10 assigned sweeneyde Mar 20, 2022

blurb-it bot and others added 3 commits March 20, 2022 17:15

📜🤖 Added by blurb_it.

b5ed1ed

Update set_orig_class function for gaobject

74c135c

Fix code format

9496dab

penguin-wwy force-pushed the ga_vectorcall branch from effb099 to 9496dab Compare March 20, 2022 17:26

sweeneyde reviewed Mar 20, 2022

Objects/genericaliasobject.c

sweeneyde requested changes Mar 20, 2022

Objects/genericaliasobject.c (four review comments)

bedevere-bot removed the awaiting review label Mar 20, 2022

bedevere-bot added the awaiting changes label Mar 20, 2022

sweeneyde reviewed Mar 21, 2022

Makefile.pre.in

Fix code style and add __orig_class__ to the global strings

5fd5165

penguin-wwy force-pushed the ga_vectorcall branch from 0b7c47c to 5fd5165 Compare March 21, 2022 05:05

bedevere-bot added awaiting change review and removed awaiting changes labels Mar 21, 2022

bedevere-bot requested a review from sweeneyde March 21, 2022 05:09

Remove gaobject make_tp_call

72b48ae

sweeneyde requested changes Mar 21, 2022

Misc/NEWS.d/next/Library/2022-03-20-17-15-56.bpo-47067.XXLnje.rst

bedevere-bot added awaiting changes and removed awaiting change review labels Mar 21, 2022

sweeneyde approved these changes Mar 21, 2022

bedevere-bot added awaiting merge and removed awaiting changes labels Mar 21, 2022

kumaraditya303 reviewed Mar 21, 2022

Misc/NEWS.d/next/Library/2022-03-20-17-15-56.bpo-47067.XXLnje.rst

Update NEWS.d

2355648

penguin-wwy force-pushed the ga_vectorcall branch from e4e5065 to 2355648 Compare March 21, 2022 06:48

kumaraditya303 approved these changes Mar 21, 2022

sweeneyde merged commit 1ea055b into python:main Mar 21, 2022

bedevere-bot removed the awaiting merge label Mar 21, 2022

penguin-wwy mannequin mentioned this pull request Apr 20, 2022

Add vectorcall for generic alias object #91223

Closed
