1. Unify FLAGS_cinn_parallel_compile_size and FLAGS_cinn_parallel_compile_thread
2. Add more flags to dump more compile info
3. Support dumping lower_func, source_code, ptx and instruction
4. Return more compile info from parallel compiler to graph compiler
5. Refactor graph visualization
6. Refactor parallel compiler's task split
Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.
PR types
Others
PR changes
Others
Description
Pcard-72511
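- `Parallel Compiler`: unified `FLAGS_cinn_parallel_compile_size` and `FLAGS_cinn_parallel_compile_thread`. Setting `FLAGS_cinn_parallel_compile_thread` alone now specifies the number of threads used for compilation, and all `fusion_groups` are distributed evenly across the available threads.
- In addition to `instruction`, the parallel compiler now returns `lowered_function`, `source_code`, and `source_ptx` so that upper layers can use them further.
- Added `FLAGS_cinn_dump_group_lowered_func`, `FLAGS_cinn_dump_group_source_code`, `FLAGS_cinn_dump_group_ptx`, and `FLAGS_cinn_dump_group_instruction`, which store the intermediate code of each compilation stage, grouped by `fusion_groups`.
- Refactored `graph_visualization` so that all visualization graphs and unit-test code are stored correctly by group.
- Fixed the issue where `MakeDirectory` could not create directories correctly.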
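
To illustrate what distributing `fusion_groups` evenly across threads can look like, here is a minimal Python sketch. It is not the code from this PR: the function name `split_groups` and the contiguous-range strategy are assumptions made for illustration, and the actual task split in CINN's parallel compiler may differ.

```python
from typing import List, Tuple


def split_groups(num_groups: int, num_threads: int) -> List[Tuple[int, int]]:
    """Split group indices [0, num_groups) into contiguous half-open ranges.

    Hypothetical helper, not part of CINN. Range sizes differ by at most
    one, and no more ranges are produced than there are groups.
    """
    if num_groups < 0 or num_threads <= 0:
        raise ValueError("num_groups must be >= 0 and num_threads must be > 0")

    # Never start more workers than there are groups to compile.
    workers = min(num_groups, num_threads)
    if workers == 0:
        return []

    # Every worker gets `base` groups; the first `extra` workers get one more.
    base, extra = divmod(num_groups, workers)
    ranges = []
    start = 0
    for worker in range(workers):
        size = base + (1 if worker < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


if __name__ == "__main__":
    # 10 fusion groups on 4 threads.
    print(split_groups(10, 4))
```

With this scheme the thread count is the only setting needed, which matches the intent of keeping `FLAGS_cinn_parallel_compile_thread` as the single control.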