-
-
Notifications
You must be signed in to change notification settings - Fork 32.2k
gh-134584: Eliminate redundant refcounting from _CALL_TYPE_1
#135818
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…nner/cpython into eliminate_store_fast_refcount
19ed708
to
184d8f1
Compare
POP_TOP + | ||
POP_TOP; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are for callable
and arg
. I'm not sure if it even makes sense to specialize it for a callable
or type()
directly?
Also, arg
can be anything so I don't think we can do much here. The JIT can still specialize POP_TOP
for arg
if its type is known.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right it doesn't make sense to specialize it by type. But we can still do useful elimination. Consider the following code: sin(x)
. If x
is a local, it will emit a LOAD_FAST_BORROW, which will in turn cause a POP_TOP_NOP (ie, a no-op once my PR lands with TOS caching!). So you managed to eliminate one tagged reference count decref from your PR.
This roughly translates to removing one branch and one bitwise and in the JIT. From my own experience, branches are quite costly in the JIT, so this is great work!
In the future, we can also do the following optimization: for a sin(x)
, once we promote it and emit _LOAD_CONST_INLINE
, we can then check for any intercepting escaping calls between the _LOAD_CONST_INLINE
and usage. If there are no escaping calls, we can just emit a _LOAD_CONST_INLINE_BORROW
, allowing even the callable's reference count to be eliminated completely!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks mostly good, just 2 comments that also explain the CI failure.
@@ -949,7 +970,7 @@ dummy_func(void) { | |||
next = sym_new_type(ctx, &PyLong_Type); | |||
} | |||
|
|||
op(_CALL_TYPE_1, (unused, unused, arg -- res)) { | |||
op(_CALL_TYPE_1, (callable, null, arg -- res, c, a)) { | |||
PyObject* type = (PyObject *)sym_get_type(arg); | |||
if (type) { | |||
res = sym_new_const(ctx, type); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to replace the constant propagation code here, it's no longer _POP_CALL_ONE_LOAD_CONST_INLINE_BORROW
!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah that explains the failure. I was actually wondering if I need to update this optimization. Thanks!
} | ||
|
||
macro(CALL_TYPE_1) = | ||
unused/1 + | ||
unused/2 + | ||
_GUARD_NOS_NULL + | ||
_GUARD_CALLABLE_TYPE_1 + | ||
_CALL_TYPE_1; | ||
_CALL_TYPE_1 + | ||
POP_TOP + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be a _POP_TOP_NOP
as type
is immortal. You can add a comment there explaining this beside the uop append.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
huh I had _POP_TOP_NOP
initially but I hit some assertion related to borrowed references/immortality. I'll try again!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I realised this can be even simpler, The signature should be
_CALL_TYPE_1, (callable, null, arg -- res, a))
as you saw callable doesn't need to have anything done on it.
Thanks for the review! I'll continue working on it tomorrow (Monday) 🙂 |
Depends on #135761