Support a flexible model layer allocation strategy for non-uniform VPP scheduling #70230
Your PR was submitted successfully. Thank you for contributing to this open-source project!
Please add an explanation of the implementation approach.
# Step2: analysis whether the pp_stage is non-decreasing among segments
# 1. if non_use_custom_mesh is True, the ops' process_mesh will be changed by vpp strategy
# 2. if non_use_custom_mesh is False, the ops's process_mesh will not be changed.
non_use_custom_mesh = _analyze_use_custom_mesh(ops, seg_method, pp_degree)

# Step3: Get op index boundary, pp_stage, chunk_id, struct_names of each segment
if len(seg_struct_names) % num_chunks == 0:
If the user marks a different number of layers for different stages, could `len(seg_struct_names) % num_chunks == 0` still hold?
Updated; it now handles all cases.
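To make the concern above concrete, here is a minimal sketch in plain Python (not Paddle code) showing that the divisibility check can pass even when the per-stage layer counts are non-uniform. The helper `is_uniform_split`, the list `stage_ids`, and the relation `num_chunks = pp_degree * vpp_degree` are assumptions for illustration; the layer counts come from the example in the PR description (5 layers on stage 0, 3 on stage 1).

```python
from collections import Counter


def is_uniform_split(stage_ids):
    """Return True if every pp_stage owns the same number of layers."""
    counts = Counter(stage_ids)
    return len(set(counts.values())) == 1


# Hypothetical user marking: list index is the layer, value is its pp_stage.
stage_ids = [0, 0, 0, 0, 0, 1, 1, 1]
pp_degree, vpp_degree = 2, 2          # assumed values for illustration
num_chunks = pp_degree * vpp_degree   # assumed: 4 segments in total

divisible = len(stage_ids) % num_chunks == 0
uniform = is_uniform_split(stage_ids)
print(divisible, uniform)  # True False
```

So a check on divisibility alone cannot tell a uniform split apart from a user-specified non-uniform one.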
def get_user_layer_to_mesh(ops, seg_method, pp_degree, segment_nums):
    pp_stage_list = []
    for op in ops:
        if _extract_seg_method(op, seg_method) and "pd_op" in op.name():
Why restrict this to `"pd_op" in op.name()`?
        pp_stage_list.append(
            get_pp_stage_by_process_mesh(op_mesh, pp_degree)
        )
per_segment_op_nums = len(pp_stage_list) // segment_nums
How should this formula be read? Does `per_segment_op_nums` represent the number of ops in each layer?
Yes, it is the number of ops in each layer.

user_layer_to_mesh = [
    sum(pp_stage_list[i : i + per_segment_op_nums])
    // len(pp_stage_list[i : i + per_segment_op_nums])
Isn't `len(pp_stage_list[i : i + per_segment_op_nums])` equivalent to `per_segment_op_nums`?
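A small sketch of the formula under discussion may help here. It assumes `pp_stage_list` holds one pp_stage value per op and that the list is walked in steps of `per_segment_op_nums`; the loop bounds are not visible in the excerpt, so that part, and the function name `average_stage_per_segment`, are assumptions.

```python
def average_stage_per_segment(pp_stage_list, segment_nums):
    """Integer-average the pp_stage of the ops in each fixed-size window."""
    per_segment_op_nums = len(pp_stage_list) // segment_nums
    if per_segment_op_nums == 0:
        raise ValueError("fewer ops than segments")
    result = []
    for i in range(0, len(pp_stage_list), per_segment_op_nums):
        window = pp_stage_list[i : i + per_segment_op_nums]
        # len(window) equals per_segment_op_nums except for a short final
        # window, which only occurs when the list does not divide evenly.
        result.append(sum(window) // len(window))
    return result


print(average_stage_per_segment([0, 0, 0, 1, 1, 1], 2))     # [0, 1]
print(average_stage_per_segment([0, 0, 0, 1, 1, 1, 1], 2))  # [0, 1, 1]
```

Under these assumptions the two expressions agree whenever `len(pp_stage_list)` is a multiple of `per_segment_op_nums`; the second call shows the leftover window where they differ.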
user_layer_to_stage_id = (
    []
)  # User intent - a list corresponding to the model layer and the device. The key of the list is the number of the layer, and the value is the corresponding pp_stage.
Suggested change:
- user_layer_to_stage_id = (
-     []
- )  # User intent - a list corresponding to the model layer and the device. The key of the list is the number of the layer, and the value is the corresponding pp_stage.
+ stage_ids=[]
`stage_ids[i]` is the stage ID assigned to layer i. Would this be more concise?
Done
for idx, op in enumerate(ops):
    if len(seg_parts) == len(seg_struct_names):
        break
    struct_name = _extract_seg_method(op, seg_method)
    if (
        "pd_op" in op.name() and last_struct_name != struct_name
`if op.name() not in skip_op_list` would be more readable.
Done
        "pd_op" in op.name() and last_struct_name != struct_name
    ):  # When traversing the ops, filter out the ops that need to be skipped. At the same time, according to the struct_name, ensure that the pp_stage of each layer is only recorded once.
        last_struct_name = struct_name
        user_layer_to_stage_id.extend(op.dist_attr.process_mesh.process_ids)
Why isn't this an auto-incrementing stage_id?
Done
    pp_stage_layer_num[i] = pp_stage_layer_num[i] + 1
assert all(
    value >= vpp_degree for value in pp_stage_layer_num
), "Make sure each segment is not empty"
This error message does not match the condition being checked.
Done
pp_stage_layer_nums = pp_stage_layer_num[pp_stage]
for i in range(
    0, pp_stage_layer_nums
):  # The pp_stage uses a Round robin scheduling algorithm to allocate layers one by one.
Put longer comments on their own line.
Done
Sorry to inform you that 4f7e586's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.
for idx, op in enumerate(ops):
    if len(seg_parts) == len(seg_struct_names):
        break
    struct_name = _extract_seg_method(op, seg_method)
    if op.dist_attr is not None and last_struct_name != struct_name:
        # When traversing the operations, filter out those without any ops where `has_attr` is `None`. At the same time, ensure that the `pp_stage` of each layer is recorded only once according to the `struct_name`.
This comment merely restates the code and is unnecessary. Comments should convey information that cannot be read directly from the code.
Removed.
if op.dist_attr is not None and last_struct_name != struct_name:
    # When traversing the operations, filter out those without any ops where `has_attr` is `None`. At the same time, ensure that the `pp_stage` of each layer is recorded only once according to the `struct_name`.
    if (
        get_pp_stage_by_process_mesh(
Written this way, the `get_pp_stage_by_process_mesh` logic gets called twice.
Done
for pp_stage in range(
    0, pp_degree
):  # Each pp_stage is assigned a number of tiers based on user intent.
    pp_stage_layer_nums = pp_stage_layer_num[pp_stage]
Suggested change:
- pp_stage_layer_nums = pp_stage_layer_num[pp_stage]
+ pp_stage_layer_num = pp_stage_layer_nums[pp_stage]
Variable names should be grammatically consistent, for example adding an "s" for plurals.
Done
assert all(
    value >= vpp_degree for value in pp_stage_layer_num
), "The number of layers on each pp_stage must not be less than the vpp_degree in the pp_stage to ensure that each chunk contains at least one layer."
seg_layer_num = [0] * num_chunks
Suggested change:
- seg_layer_num = [0] * num_chunks
+ # Each chunk is assigned a number of layers based on user intent.
+ seg_layer_num = [0] * num_chunks
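Users of VPP generally do not go down to the chunk level, so the number of layers in each chunk is not assigned according to user intent. Instead, the number of layers in each pp_stage follows user intent, and within each pp_stage the underlying code automatically distributes the layers across chunks using a round-robin algorithm.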
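The round-robin step described above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the PR's code: the function name `layers_per_segment` is hypothetical, and the mapping `r_chunk_id = v_chunk_id * pp_degree + pp_stage` is an assumed layout in which segments are ordered chunk by chunk, with one segment per pp_stage inside each chunk.

```python
def layers_per_segment(pp_stage_layer_nums, vpp_degree):
    """Distribute each pp_stage's layers over its virtual chunks, one by one."""
    pp_degree = len(pp_stage_layer_nums)
    assert all(n >= vpp_degree for n in pp_stage_layer_nums), (
        "each pp_stage needs at least vpp_degree layers so that "
        "every chunk contains at least one layer"
    )
    seg_layer_num = [0] * (pp_degree * vpp_degree)
    for pp_stage, layer_num in enumerate(pp_stage_layer_nums):
        for i in range(layer_num):
            v_chunk_id = i % vpp_degree                      # virtual chunk within the stage
            r_chunk_id = v_chunk_id * pp_degree + pp_stage   # assumed global segment index
            seg_layer_num[r_chunk_id] += 1
    return seg_layer_num


# 5 layers on stage 0 and 3 layers on stage 1, with vpp_degree=2.
print(layers_per_segment([5, 3], vpp_degree=2))  # [3, 2, 2, 1]
```

With the 5/3 split from the PR description, stage 0 contributes 3 and 2 layers to its two chunks and stage 1 contributes 2 and 1.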
pp_stage_layer_nums = pp_stage_layer_num[pp_stage]
for i in range(0, pp_stage_layer_nums):
    # The pp_stage uses a Round robin scheduling algorithm to allocate layers one by one.
    v_chunk_id = i % vpp_degree
What do `v` and `r` stand for?
`v` stands for virtual (the virtual chunk_id) and `r` for real (the real chunk_id).
LGTM
PR Category
Auto Parallel
PR Types
Others
Description
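Here we choose hidden_layer=8 and layer_to_mesh=[mesh0,mesh0,mesh0,mesh0,mesh0,mesh1,mesh1,mesh1] (that is, 5 layers on device 0 and 3 layers on device 1). Running this shows that VPP scheduling proceeds normally, but the allocation is only nominally flexible, as shown in the figure below:
The model layers are still distributed evenly across the devices. This work therefore optimizes that behavior, with the result shown in the figure below:
Now hidden layers 0, 1, 2 are assigned to device 0, layers 3, 4 to device 1, layers 5, 6 to device 0, and layer 7 to device 1, which is consistent with the user's layer_to_mesh (5 layers on device 0, 3 layers on device 1).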
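The two layouts described above can be reproduced with a short plain-Python sketch. The function `layer_to_device` and the per-segment layer counts passed to it are illustrative assumptions: segments are taken in execution order, and segment `s` is assumed to run on device `s % pp_degree`.

```python
def layer_to_device(seg_layer_num, pp_degree):
    """Expand per-segment layer counts into a device id for every layer."""
    mapping = []
    for seg_id, layer_num in enumerate(seg_layer_num):
        # Assumed layout: consecutive segments cycle through the pp stages.
        mapping.extend([seg_id % pp_degree] * layer_num)
    return mapping


# Non-uniform split after this PR: 5 layers on device 0, 3 on device 1.
print(layer_to_device([3, 2, 2, 1], pp_degree=2))  # [0, 0, 0, 1, 1, 0, 0, 1]
# Uniform split (the earlier behavior): 4 layers on each device.
print(layer_to_device([2, 2, 2, 2], pp_degree=2))  # [0, 0, 1, 1, 0, 0, 1, 1]
```

The first output matches the assignment listed in the description (layers 0 to 2 on device 0, 3 and 4 on device 1, 5 and 6 on device 0, 7 on device 1).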