[API] fix paddle.diag with big tensor #72638
Conversation
Your PR was submitted successfully. Thank you for your contribution to the open-source project!
Force-pushed cf9638b to a4b14a4
Force-pushed 80aec86 to b315d45
Force-pushed b315d45 to a4a9598
    int64_t x_length,
    const int64_t sumStride,
    const int64_t xStride) {
  for (int64_t idx = blockIdx.x * blockDim.x + threadIdx.x; idx < x_length;
Could `blockIdx.x * blockDim.x` here exceed the uint32 range?
`blockIdx.x * blockDim.x` stays within the uint32 range, but in the big-tensor case `idx < x_length` can go beyond it, so the loop index needs the int64_t value range.
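As a minimal sketch of the pattern under discussion (the kernel name, template parameter, and stride arguments below are illustrative, not the actual Paddle kernel), the grid-stride loop keeps its index in `int64_t` so the comparison against `x_length` still works when the diagonal has more than 2^32 elements:

```cuda
// Illustrative sketch only, not the Paddle kernel itself.
template <typename T>
__global__ void DiagCopySketch(const T* x,
                               T* out,
                               int64_t x_length,
                               const int64_t out_stride,
                               const int64_t x_stride) {
  // The per-thread starting offset fits in uint32, but idx must be 64-bit
  // because x_length may exceed the uint32 range for big tensors.
  for (int64_t idx = blockIdx.x * blockDim.x + threadIdx.x; idx < x_length;
       idx += static_cast<int64_t>(gridDim.x) * blockDim.x) {
    out[idx * out_stride] = x[idx * x_stride];
  }
}
```

The `static_cast<int64_t>` on the stride increment is a defensive extra in this sketch; the point raised above is only that `idx` and the `idx < x_length` comparison must be 64-bit.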
@@ -931,10 +931,10 @@ void DiagInferMeta(const MetaTensor& x,
   auto x_dims = x.dims();

   if (x_dims.size() <= 1) {
-    int64_t size_ = (x_dims.size() == 1UL ? x_dims[0] : 1) + std::abs(offset);
+    int64_t size_ = (x_dims.size() == 1ULL ? x_dims[0] : 1) + std::abs(offset);
What is the difference here between writing `1ULL` and just writing `1`?
There is essentially no difference; the change just uses a literal with the unsigned long long int value range.
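To show the point in isolation (the `rank` variable below is only a stand-in for `x_dims.size()`; this is not Paddle code), the suffix only changes the type in which the comparison is performed, and for a small non-negative rank the result is the same:

```cpp
#include <cstdint>
#include <iostream>

int main() {
  // Stand-in for x_dims.size(); a tensor rank is a small non-negative
  // number, so all three comparisons below agree.
  int64_t rank = 1;
  std::cout << (rank == 1) << "\n";    // compared as signed integers
  std::cout << (rank == 1UL) << "\n";  // converted per the usual arithmetic conversions
  std::cout << (rank == 1ULL) << "\n"; // comparison performed in unsigned long long
  return 0;
}
```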
* refine forrange (#72360)
* refine forrange
* refine forrange
* reduce support big tensor (#71970)
* reduce support big tensor
* [PHI] Fix gridDim limit for reduce kernel (#72507)
* [API] isclose support bigtensor (#72516)
* isclose support bigtensor
* refine
* [API] isnan isinf isfinite support bigtensor (#72517)
* isnan isinf isfinite support bigtensor
* refine
* [PHI] Fix cum kernel for big tensor (#72562)
* [PHI] Preliminary fix for elementwise broadcast int32 shape overflow (#72584)
* [PHI] Align linalg.solve kernel with torch (#72608)
* Update strided copy kernel (#72662)
* [PHI] Fix grid sample kernel for big tensor (#72628)
* [PHI] Fix argsort big tensor bug (#72712)
* [PHI] Fixed argsort big tensor bug
* [PHI] Fixed shape mismatch problem.
* [PHI] Fix contiguous kernel for big tensor (#72705)
* [PHI] Fix flatten and split kernel for big tensor (#72634)
* [PHI] Fix out-of-bound issue of paddle.take_along_axis (#72757)
* [PHI] fix paddle.diag with big tensor (#72638)
* [API] fix paddle.cross with big tensor (#72652)
* [PHI] Fix paddle.where api for big tensor (#72717)
* [PHI] Fix bincount kernel for big tensor (#72706)
* fix bincount kernel for big tensor
* use HostAlloc to alloc memory
* add cpu test case
* [PHI] Fix full_like kernel for big tensor (#72831)
* [API] Fix int overflow and float16 support for paddle.frac (#72815)
* [PHI] Align paddle.inner with torch in matmul logic (#72843)
* [PHI] Fix paddle.var & paddle.std float16 overflow (#72650)
* [PHI] Fix logsumexp precision problem (#72681)
* [PHI] Debug for logsumexp, bug source found
* [PHI] Removed GetNumBlocks func to get correct logsumexp
* [PHI] Removed redundant debug VLOG
* [PHI] Elegant grid bounded solution
* [Accuracy diff No.55-56、76-77] Fix accuracy diff for var&std API (#72879)
* [Accuracy diff No.21] Fix accuracy diff for heaviside API (#72894)

---------

Co-authored-by: Shuhao Liang <50269654+lshpku@users.noreply.github.com>
Co-authored-by: Qianyue He <46109954+Enigmatisms@users.noreply.github.com>
Co-authored-by: Lei Ding <69283446+Dmovic@users.noreply.github.com>
Co-authored-by: ggggxm <66855582+ggggxm@users.noreply.github.com>
Co-authored-by: xkkkkkk23 <xiekeke@baidu.com>
Co-authored-by: Zx <zhangxiao35@baidu.com>
Co-authored-by: huangjiyi <43315610+huangjiyi@users.noreply.github.com>
Co-authored-by: ooo oo <106524776+ooooo-create@users.noreply.github.com>
PR Category
Execute Infrastructure
PR Types
Bug fixes
Description
Make paddle.diag support big tensor shapes.
Pcard-88155
Performance regression: 1%.