[Auto Parallel] add pipeline_parallel api #69217
Conversation
Your PR was submitted successfully. Thank you for contributing to this open-source project!
@@ -233,7 +250,7 @@ def run_llama(self, to_static=0):
         model, optimizer = self.parallel_model(model, optimizer)

         criterion = LlamaPretrainingCriterion(self.config)
-        criterion = self.parallel_model(criterion)
+        # criterion = self.parallel_model(criterion)
Why was this commented out? Is there a conflict, or is it just not useful at the moment?
It's not useful; the loss_func doesn't need to be parallelized.
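For context, a minimal sketch of the pattern being discussed, with toy layers instead of Llama and parallelize_fn standing in for the test's self.parallel_model helper (everything except that pattern is an illustrative assumption): only the model and optimizer are handed to the parallelization helper, while the loss layer is built and used as a plain paddle.nn.Layer.

import paddle

def build_training_objects(parallelize_fn):
    # Toy stand-ins: a Linear layer instead of the Llama model,
    # CrossEntropyLoss instead of LlamaPretrainingCriterion.
    model = paddle.nn.Linear(16, 16)
    optimizer = paddle.optimizer.AdamW(parameters=model.parameters())
    # Only the model and optimizer go through the parallelization helper.
    model, optimizer = parallelize_fn(model, optimizer)
    # The loss layer is constructed normally and used as-is; it is not parallelized.
    criterion = paddle.nn.CrossEntropyLoss()
    return model, optimizer, criterion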
), f"layer name:{name} not in the model, please check the split_spec" | ||
return name_to_layer[name] | ||
|
||
def get_mesh(pp_index): |
Could this function be moved into the base class? MP may need to call it too.
Done
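For illustration, a hedged sketch of what a shared get_mesh helper could look like once it lives in the base class (the global process grid, its 2x4 shape, and the dim names here are assumptions, not this PR's actual code): it returns the sub-mesh that owns a given pipeline stage, which an MP path could reuse in the same way.

import numpy as np
import paddle.distributed as dist

# Assumed layout: 2 pipeline stages x 4 ranks per stage (8 ranks total).
_GLOBAL_PROCESS_GRID = np.arange(8).reshape(2, 4)

def get_mesh(pp_index):
    # Return the ProcessMesh covering pipeline stage `pp_index`;
    # e.g. get_mesh(0) covers ranks [0, 1, 2, 3].
    return dist.ProcessMesh(_GLOBAL_PROCESS_GRID[pp_index].tolist(), dim_names=["dp"])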
Args:
    model (paddle.nn.Layer): A single card model to be distributed
    optimizer (paddle.optimizer.Optimizer): An optimizer to be distributed
    split_spec (OrderedDict): Pipeline parallel split point, the order of the keys \u200bis the order of the pipeline stage
What is the purpose of \u200b?
That was a typo; I'll fix it.
Done
split_spec = OrderedDict(
    [
        (
            f"llama.layers.{i * decoders_per_rank - 1}",
            SplitPoint.END,
        )
        for i in range(1, self.pp)
    ]
)
Maybe it's better to use a tuple or list; we can discuss it later.
OK
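To make the split points concrete, here is an illustrative expansion of that comprehension (the pp and decoders_per_rank values are made up for the example, and the enum is shown as a string so the snippet stands alone): the ordered keys name the last decoder layer owned by each of the first pp - 1 stages, and the key order is the stage order.

from collections import OrderedDict

pp, decoders_per_rank = 4, 2   # illustrative values: 4 stages, 8 decoder layers total
split_spec = OrderedDict(
    (f"llama.layers.{i * decoders_per_rank - 1}", "SplitPoint.END")  # enum shown as a string here
    for i in range(1, pp)
)
# keys: llama.layers.1, llama.layers.3, llama.layers.5
# -> stage 0 owns layers 0-1, stage 1 owns 2-3, stage 2 owns 4-5, stage 3 owns 6-7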
LGTM
PR Category
Auto Parallel
PR Types
Others
Description
Pcard-73145
Add the intermediate-level API for pipeline auto parallelism.
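For readers of the description, a hedged end-to-end sketch of how the new API is meant to be called, pieced together from the docstring and test snippets above. The entry point and SplitPoint are passed in as parameters because this conversation does not show their import path, so treat those names and the return value as assumptions.

from collections import OrderedDict

def shard_for_pipeline(model, optimizer, pipeline_parallel, SplitPoint, pp, decoders_per_rank):
    # Build the ordered split points exactly as the test above does: one SplitPoint.END
    # per stage boundary, keyed by the last decoder layer of that stage.
    split_spec = OrderedDict(
        (f"llama.layers.{i * decoders_per_rank - 1}", SplitPoint.END)
        for i in range(1, pp)
    )
    # Per the docstring, the API takes the single-card model, its optimizer, and the
    # split_spec; what it returns is assumed here.
    return pipeline_parallel(model, optimizer, split_spec)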