[SOT][Faster Guard] Implement more `guard_tree_expr_node` and `TensorDtypeVariable.make_faster_guard` #72463
Your PR was submitted successfully. Thank you for contributing to this open-source project!
Pull Request Overview
This PR implements additional functionality for guard expression nodes and integrates a faster, specialized guard branch for tensor dtypes. Key changes include:
- Adding implementations of guard_tree_expr_node in virtual frame, function closure tracker, and inline executor.
- Implementing TensorDtypeVariable.make_faster_guard with a new branch for GetAttrTracker-based tensor variables.
- Updating guard_tree_expr_node return types across various tracker classes and modifying the executor cache and C++ guard handling (now throwing on an empty guard chain).
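To make the closure-cell case concrete, here is a minimal pure-Python sketch of how attribute and item expression nodes could compose to reach a closure cell. The class names and the `eval` method are illustrative stand-ins, not Paddle's C++ expression nodes, and `LocalVarExprNode` is a hypothetical helper introduced only for this example.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


class ExprNodeBase:
    """Hypothetical base class: evaluates to a value given a frame-like dict."""

    def eval(self, frame: dict[str, Any]) -> Any:
        raise NotImplementedError


@dataclass
class ConstantExprNode(ExprNodeBase):
    value: Any

    def eval(self, frame):
        return self.value


@dataclass
class LocalVarExprNode(ExprNodeBase):
    name: str

    def eval(self, frame):
        return frame[self.name]


@dataclass
class AttributeExprNode(ExprNodeBase):
    base: ExprNodeBase
    attr: str

    def eval(self, frame):
        return getattr(self.base.eval(frame), self.attr)


@dataclass
class ItemExprNode(ExprNodeBase):
    base: ExprNodeBase
    key: ExprNodeBase

    def eval(self, frame):
        return self.base.eval(frame)[self.key.eval(frame)]


def make_adder(n):
    def add(x):
        return x + n

    return add


# Expression equivalent to: fn.__closure__[0].cell_contents
closure_cell_expr = AttributeExprNode(
    ItemExprNode(
        AttributeExprNode(LocalVarExprNode("fn"), "__closure__"),
        ConstantExprNode(0),
    ),
    "cell_contents",
)
cell_value = closure_cell_expr.eval({"fn": make_adder(3)})
```

Building the path out of small nodes means a guard can re-evaluate the same access on a later call without going back through the tracker that produced it.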
Reviewed Changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
File | Description |
---|---|
python/paddle/jit/sot/opcode_translator/executor/virtual_frame.py | Added guard_tree_expr_node method for closure cells using attribute and item expression nodes. |
python/paddle/jit/sot/opcode_translator/executor/variables/basic.py | Implemented TensorDtypeVariable.make_faster_guard with a new branch for GetAttrTracker-based cases. |
python/paddle/jit/sot/opcode_translator/executor/tracker.py & opcode_inline_executor.py | Updated return type of guard_tree_expr_node to ExprNodeBase and implemented similar changes in inline executor. |
python/paddle/jit/sot/opcode_translator/executor/function_graph.py | Modified the strategy for handling empty guard chains via a dummy guard chain. |
paddle/fluid/pybind/sot/guards.cc | Changed empty guard chain handling to throw a runtime error. |
python/paddle/jit/sot/opcode_translator/executor/executor_cache.py | Adjusted internal caching tuple structure and aligned guard index assertions. |
Comments suppressed due to low confidence (3)
python/paddle/jit/sot/opcode_translator/executor/function_graph.py:331
- Since the C++ side now throws a runtime error when the guard chain is empty, please verify that generating a dummy guard chain here consistently prevents empty guard chains in all execution scenarios.
guard_chain: GuardChain = [
    paddle.framework.core.GuardNode(
        paddle.framework.core.DummyGuard(),
        [paddle.framework.core.ConstantExprNode(True)],
    )
]
paddle/fluid/pybind/sot/guards.cc:405
- The updated behavior now throws an exception when the guard chain is empty. Please verify that all cases leading to an empty guard chain are either updated to supply a valid dummy guard or handled appropriately to prevent unexpected runtime failures.
throw std::runtime_error("Empty guard chain");
python/paddle/jit/sot/opcode_translator/executor/variables/basic.py:342
- Confirm that the PIR API check correctly reflects the runtime environment. If PIR is not enabled, ensure that falling back to object_equal_faster_guard still provides adequate guard coverage.
assert paddle.framework.use_pir_api(), "Only support PIR"
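The interaction between the dummy guard chain and the new empty-chain error can be sketched in plain Python. This is a simplified model under stated assumptions, not the code in `guards.cc`: the class names mirror the ones quoted above, but the constructor check, the `passes` method, and the use of plain callables as expression nodes are all hypothetical.

```python
class DummyGuard:
    """Stand-in guard that accepts any value."""

    def check(self, value):
        return True


class GuardNode:
    """Pairs a guard with the expressions whose values it checks."""

    def __init__(self, guard, exprs):
        self.guard = guard
        self.exprs = exprs  # assumption: callables taking a frame-like dict

    def passes(self, frame):
        return self.guard.check(*[expr(frame) for expr in self.exprs])


class GuardTree:
    """Assumed behavior: reject empty chains up front, then match in order."""

    def __init__(self, chains):
        for chain in chains:
            if not chain:
                raise RuntimeError("Empty guard chain")
        self.chains = list(chains)

    def lookup(self, frame):
        # Return the index of the first chain whose nodes all pass.
        for index, chain in enumerate(self.chains):
            if all(node.passes(frame) for node in chain):
                return index
        return None


# A chain with a single always-true node is never empty, so it is accepted.
dummy_chain = [GuardNode(DummyGuard(), [lambda frame: True])]
tree = GuardTree([dummy_chain])
matched_index = tree.lookup({})
```

In this model the dummy chain always matches, so it only keeps the tree well-formed; it adds no real guard coverage, which is why the review asks for every path that builds a chain to be checked.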
python/paddle/jit/sot/opcode_translator/executor/opcode_inline_executor.py (outdated, resolved)
// TODO(zrr1999): empty guard nodes means that some
// tracker.make_faster_guard is not implemented.
return;
throw std::runtime_error("Empty guard chain");
Can we use PADDLE_THROW directly here?
self.cache[code] = [
    (new_custom_code, guard_fn)
], paddle.framework.core.GuardTree([guard_chain])
self.cache[code] = (
Did ruff do this?
# ), f"cache_index({cache_index}) is not equal to index({index})"
assert (
    cache_index is None or index == cache_index
), f"cache_index({cache_index}) is not equal to index({index})"
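Is all of this actually exercised on CI now? No problems at all? The Tensor side is not finished yet.
Please confirm this. If there is a problem with the testing setup, moving forward later will be risky.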
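The assertion under discussion can be read as a cross-check between two lookup paths. The sketch below is one plausible reading, not the code in `executor_cache.py`: it assumes `cache_index` comes from the guard tree and `index` from iterating the per-entry Python guard functions, and `lookup_code`, `entries`, and `tree_lookup` are hypothetical names.

```python
def lookup_code(entries, tree_lookup, frame):
    """Return the cached code whose guard function accepts `frame`.

    entries: list of (code, guard_fn) pairs, checked in order.
    tree_lookup: callable returning the guard tree's matching index, or None.
    """
    cache_index = tree_lookup(frame)
    for index, (code, guard_fn) in enumerate(entries):
        if guard_fn(frame):
            # The faster path may decline to answer (None), but when it does
            # answer it must agree with the reference Python guard.
            assert (
                cache_index is None or index == cache_index
            ), f"cache_index({cache_index}) is not equal to index({index})"
            return code
    return None


entries = [
    ("code_a", lambda frame: frame["x"] == 1),
    ("code_b", lambda frame: frame["x"] == 2),
]


def consistent_tree(frame):
    # Agrees with the guard functions above; None when nothing matches.
    return {1: 0, 2: 1}.get(frame["x"])


hit = lookup_code(entries, consistent_tree, {"x": 2})
miss = lookup_code(entries, consistent_tree, {"x": 3})
```

A check of this shape only fires when a test reaches a frame on which the two paths disagree, so its value depends on how much of the guard logic CI actually covers.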
…typeVariable.make_faster_guard` (PaddlePaddle#72463)
PR Category
Execute Infrastructure
PR Types
Performance
Description
TODO
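- Remove the concept of GuardBase and turn every guard into a class based on GuardNodeBase. That way we no longer need to worry about how many arguments check takes, because each class has its own lookup. GuardNodeBase subclasses are split into UnaryGuardNodeBase (similar in function to the current GuardNode) and BinaryGuardNodeBase (which can be built on #72327). The existing ExprGuardNode can be implemented on top of UnaryGuardNodeBase for reuse.
- Implement DummyGuardNode.
- Manage FunctionGlobal and FunctionClosure together in tracker.py.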
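A rough Python sketch of the class layout this TODO proposes is shown below. It is only an illustration of the idea that each node owns its `lookup`: the real classes would live on the C++ side, the method signatures are assumptions, expression nodes are modeled as plain callables, and `SameObjectGuardNode` is a hypothetical example of a binary guard.

```python
from abc import ABC, abstractmethod


class GuardNodeBase(ABC):
    """Every guard node decides for itself how to evaluate against a frame."""

    @abstractmethod
    def lookup(self, frame) -> bool: ...


class UnaryGuardNodeBase(GuardNodeBase):
    """Checks a single value produced by one expression."""

    def __init__(self, expr):
        self.expr = expr

    @abstractmethod
    def check(self, value) -> bool: ...

    def lookup(self, frame):
        return self.check(self.expr(frame))


class BinaryGuardNodeBase(GuardNodeBase):
    """Checks a relation between the values of two expressions."""

    def __init__(self, lhs, rhs):
        self.lhs = lhs
        self.rhs = rhs

    @abstractmethod
    def check(self, lhs_value, rhs_value) -> bool: ...

    def lookup(self, frame):
        return self.check(self.lhs(frame), self.rhs(frame))


class DummyGuardNode(GuardNodeBase):
    """Always passes; needs no expression at all."""

    def lookup(self, frame):
        return True


class ExprGuardNode(UnaryGuardNodeBase):
    """Passes when the expression itself evaluates to a truthy value."""

    def check(self, value):
        return bool(value)


class SameObjectGuardNode(BinaryGuardNodeBase):
    """Hypothetical binary guard: both expressions yield the same object."""

    def check(self, lhs_value, rhs_value):
        return lhs_value is rhs_value


shared = object()
chain = [
    DummyGuardNode(),
    ExprGuardNode(lambda frame: frame["n"] > 0),
    SameObjectGuardNode(lambda frame: frame["a"], lambda frame: frame["b"]),
]
chain_passes = all(node.lookup({"n": 1, "a": shared, "b": shared}) for node in chain)
```

Because the caller only ever invokes `lookup(frame)`, it never has to know whether a node consumes zero, one, or two values, which is the argument-count problem the TODO wants to eliminate.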