[SOT][3.11] Remove trailing PRECALL to avoid wrong specialization #59860
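Posting my analysis notes for the record.

Symptom

The problem shows up with the following code:

    import paddle

    def foo():
        print(1)

    for i in range(10):
        paddle.jit.to_static(foo)()

This is a very simple reproducer: it contains a graph break and triggers the failure.

Python warms up once on every RESUME. On the 8th warmup it triggers quickening, which replaces bytecode with superinstructions or adaptive instructions. The crash here clearly happens during or after quickening. This was confirmed by removing one RESUME in a pass, after which the crash only occurs on the 8th iteration.

Replacing print with an equivalent non-builtin API that also triggers a graph break does not reproduce the problem. So the culprit is most likely a CALL-related specialized instruction: it presumably relies on some assumption that our generated code does not satisfy.

Debugging with a self-built Python

The symptom is now mostly clear: things go wrong the first time the adaptive instruction specializes. The next step is to build Python and print debug information. The debug build immediately reports a much clearer error: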
Clearly, our generated code drove the VM into a state that should be unreachable. Adding debug prints to the dispatch code:

    #define DISPATCH_GOTO() \
        do { \
            if (should_print) { \
                printf("[opcode_targets] %d\n", opcode); \
            } \
            goto *opcode_targets[opcode]; \
        } while (0);
    #else
    ...
            TARGET(STORE_SUBSCR_ADAPTIVE) {
                if (should_print) {
                    printf("STORE_SUBSCR_ADAPTIVE %d\n", oparg);
                }
The log shows which instruction execution ran into.
It is now clear that the problem lies in this instruction:

    TARGET(PRECALL_BUILTIN_FAST_WITH_KEYWORDS) {
        assert(cframe.use_tracing == 0);
        /* Builtin METH_FASTCALL | METH_KEYWORDS functions */
        int is_meth = is_method(stack_pointer, oparg);
        int total_args = oparg + is_meth;
        PyObject *callable = PEEK(total_args + 1);
        DEOPT_IF(!PyCFunction_CheckExact(callable), PRECALL);
        DEOPT_IF(PyCFunction_GET_FLAGS(callable) !=
            (METH_FASTCALL | METH_KEYWORDS), PRECALL);
        STAT_INC(PRECALL, hit);
        SKIP_CALL();
        STACK_SHRINK(total_args);
        /* res = func(self, args, nargs, kwnames) */
        _PyCFunctionFastWithKeywords cfunc =
            (_PyCFunctionFastWithKeywords)(void(*)(void))
            PyCFunction_GET_FUNCTION(callable);
        PyObject *res = cfunc(
            PyCFunction_GET_SELF(callable),
            stack_pointer,
            total_args - KWNAMES_LEN(),
            call_shape.kwnames
        );
        assert((res != NULL) ^ (_PyErr_Occurred(tstate) != NULL));
        call_shape.kwnames = NULL;
        /* Free the arguments. */
        for (int i = 0; i < total_args; i++) {
            Py_DECREF(stack_pointer[i]);
        }
        STACK_SHRINK(2-is_meth);
        PUSH(res);
        Py_DECREF(callable);
        if (res == NULL) {
            goto error;
        }
        CHECK_EVAL_BREAKER();
        DISPATCH();
    }

Nothing obviously wrong stands out at first glance. SKIP_CALL, however, is defined as:

    // Skip from a PRECALL over a CALL to the next instruction:
    #define SKIP_CALL() \
        JUMPBY(INLINE_CACHE_ENTRIES_PRECALL + 1 + INLINE_CACHE_ENTRIES_CALL)

Ah... it skips not only its own inline cache entries, but also the CALL instruction that follows, together with that instruction's cache entries.
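To make the effect of `SKIP_CALL()` concrete, here is a toy model in Python. It is only a sketch, not CPython's implementation: it lays instructions out as code units and computes where a specialized PRECALL lands after the jump. The cache sizes (1 for PRECALL, 4 for CALL) are what I believe CPython 3.11 uses and should be treated as assumptions. All other opcodes are modeled without caches, and `assemble`, `landing_after_skip_call`, and both instruction sequences are hypothetical.

```python
# Cache sizes in code units. Assumed to match CPython 3.11; not verified here.
CACHE_ENTRIES = {"PRECALL": 1, "CALL": 4}


def assemble(opnames):
    """Lay out opcodes as code units, each followed by its CACHE slots."""
    units = []
    for name in opnames:
        units.append(name)
        units.extend(["CACHE"] * CACHE_ENTRIES.get(name, 0))
    return units


def landing_after_skip_call(units, precall_index):
    """Index reached when a specialized PRECALL executes SKIP_CALL().

    Mirrors JUMPBY(INLINE_CACHE_ENTRIES_PRECALL + 1 + INLINE_CACHE_ENTRIES_CALL),
    applied from the unit right after the PRECALL opcode itself.
    """
    assert units[precall_index] == "PRECALL"
    next_instr = precall_index + 1
    return next_instr + CACHE_ENTRIES["PRECALL"] + 1 + CACHE_ENTRIES["CALL"]


# Well-formed: PRECALL is immediately followed by CALL.
good = assemble([
    "PUSH_NULL", "LOAD_NAME", "LOAD_CONST",
    "PRECALL", "CALL",
    "POP_TOP", "LOAD_CONST", "RETURN_VALUE",
])

# Hypothetical malformed sequence: a PRECALL with no CALL after it.
bad = assemble([
    "PUSH_NULL", "LOAD_NAME", "LOAD_CONST",
    "PRECALL",
    "POP_TOP", "LOAD_CONST", "STORE_NAME", "LOAD_NAME", "POP_TOP",
    "LOAD_CONST", "RETURN_VALUE",
])

good_landing = landing_after_skip_call(good, good.index("PRECALL"))
bad_landing = landing_after_skip_call(bad, bad.index("PRECALL"))

# Units jumped over after the PRECALL's own cache slot.
good_skipped = good[good.index("PRECALL") + 2:good_landing]
bad_skipped = bad[bad.index("PRECALL") + 2:bad_landing]

print("good lands on:", good[good_landing], "after skipping", good_skipped)
print("bad lands on:", bad[bad_landing], "after skipping", bad_skipped)
```

In the well-formed sequence the jump passes over exactly the CALL and its caches. In the malformed one the same fixed offset silently jumps over unrelated instructions, which is how the VM can end up in a state that should be unreachable.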
Solution: the PRECALL that appears on its own here is redundant and should be removed.
LGTM
LGTM
Rename unit test
PR types
Bug fixes
PR changes
Others
Description
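When Python 3.11 specializes the PRECALL instruction for builtins, it assumes that the next bytecode is always CALL. The bytecode we generate after COPY does not satisfy this, so execution jumps to the wrong location. The redundant PRECALL therefore has to be removed.
Merge this PR after the release branch is cut. Before the release, prioritize merging #59855.
This PR also cleans up the trick added in #59855 to safeguard the release, as well as the fix from #59816, which was based on a wrong diagnosis of the problem.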
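As an illustration of the idea only (this is not the code in this PR), a pass that drops any PRECALL not immediately followed by CALL could be sketched as below. The `(opname, oparg)` tuple representation and the name `remove_dangling_precall` are hypothetical. A real pass would also have to keep jump targets and the exception table consistent.

```python
from typing import List, Tuple

# Hypothetical instruction representation: (opname, oparg).
Instr = Tuple[str, int]


def remove_dangling_precall(instrs: List[Instr]) -> List[Instr]:
    """Return a copy of instrs without PRECALLs that lack a following CALL."""
    kept: List[Instr] = []
    for i, (opname, oparg) in enumerate(instrs):
        followed_by_call = i + 1 < len(instrs) and instrs[i + 1][0] == "CALL"
        if opname == "PRECALL" and not followed_by_call:
            # A specialized PRECALL would jump over whatever comes next,
            # so a PRECALL without a CALL must not survive.
            continue
        kept.append((opname, oparg))
    return kept


example = [
    ("LOAD_CONST", 0),
    ("PRECALL", 1),
    ("CALL", 1),
    ("PRECALL", 0),   # dangling: next instruction is not CALL
    ("POP_TOP", 0),
    ("PRECALL", 0),   # dangling: last instruction
]
cleaned = remove_dangling_precall(example)
print(cleaned)
```

The valid PRECALL/CALL pair is left untouched, so ordinary call sites keep their specialization behavior.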
PCard-66972