support llama2-7b model run in sharding case. #65606
Conversation
Your PR was submitted successfully. Thank you for your contribution to the open-source project!
#include "paddle/phi/infermeta/spmd_rules/utils.h"

namespace phi::distributed {
SpmdInfo AddNInferSpmd(
Variadic inputs already have a corresponding template implementation in paddle/phi/infermeta/spmd_rules/utils.h. Could you check whether it applies here?
The inputs of add_n are not variadic; it takes a TensorList. At the moment no other operator seems to use this spmd function, so implementing it as a variadic template doesn't seem very meaningful.
LGTM
const std::vector<phi::distributed::DistMetaTensor>& inputs) {
  auto N = inputs.size();
  PADDLE_ENFORCE_GT(
      N, 0, phi::errors::InvalidArgument("The inputs tensor's size of AddNOp"));
This error message is incomplete.
done, thanks!
LGTM
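For context on what "complete" means here: a good enforce message states the expected condition and the value actually received, rather than trailing off mid-sentence. The snippet below is a hypothetical, dependency-free stand-in for the `PADDLE_ENFORCE_GT` check above (the function name and plain exception are illustrative, not Paddle's macro machinery).

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the PADDLE_ENFORCE_GT(N, 0, ...) check above,
// with a message that names the expected condition and the received value.
void CheckNonEmptyInputs(size_t n) {
  if (!(n > 0)) {
    throw std::invalid_argument(
        "The inputs tensor's size of AddNOp must be greater than 0, "
        "but received " + std::to_string(n) + ".");
  }
}
```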
bool AddN_Op::InferSymbolicShape(
    pir::InferSymbolicShapeContext *infer_context) {
  return details::AddNOpInferSymbolicShape(this->operation(), infer_context);
  return AddNOpInferSymbolicShape(this->operation(), infer_context);
}
Why was AddN_ left behind?
The inplace behavior of AddN_ cannot be described by the current yaml system, so the yaml only defines AddN, and AddN_ remains manual.
@@ -110,6 +110,45 @@ bool AddOpInferSymbolicShape(pir::Operation *op,
      [](const symbol::DimExpr &x, const symbol::DimExpr &y) { return x + y; });
}

bool AddNOpInferSymbolicShape(pir::Operation *op,
The file this function is placed in doesn't quite match: AddN is not a binary elementwise op, it is closer to a multinary (n-ary) op.
done, thanks!
LGTM
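The binary-vs-multinary distinction discussed above can be shown with a minimal sketch: a binary elementwise rule combines exactly two shapes, while an add_n-style rule folds over N inputs that must all share one shape. This is a self-contained illustration under that assumption, not Paddle's `AddNOpInferSymbolicShape`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Sketch of an n-ary shape rule (assumed add_n semantics, not Paddle's code):
// all N input shapes must be identical, and the output reuses that shape.
std::vector<int64_t> InferAddNShape(
    const std::vector<std::vector<int64_t>>& in_shapes) {
  assert(!in_shapes.empty());
  for (const auto& s : in_shapes) {
    assert(s == in_shapes.front());  // add_n requires identical shapes
  }
  return in_shapes.front();
}
```

Because the rule is parameterized over N rather than over a left/right pair, it fits better next to other multinary rules than in a binary elementwise file, which is the reviewer's point.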
PR Category
Auto Parallel
PR Types
Bug fixes
Description
Support running the llama2-7b model in the sharding case.
Other
Pcard-67164