[XPU] support flash attention for xpu when causal is false #63644

runzhech · 2024-04-18T02:53:11Z

PR Category

Operator Mechanism

PR Types

New features

Description

Support flash attention on XPU when causal is false, and allow attn_mask to be passed in as bias.

paddle-bot · 2024-04-18T02:53:16Z

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

runzhech · 2024-04-22T08:37:46Z

paddle/phi/kernels/xpu/flash_attn_grad_kernel.cc

-
+  const float* bias_data = nullptr;
+  if (attn_mask.get_ptr() != nullptr) {
+    bias_data = attn_mask->data<float>();


The fa interface does not support attn_mask yet, so the data is read from attn_mask and passed in as bias.
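To illustrate the pattern in the hunk above, here is a minimal standalone sketch of turning an optional mask into a nullable bias pointer. `FakeTensor` and `BiasFromMask` are hypothetical stand-ins invented for this note, not Paddle types or functions, and `std::optional` only approximates the optional tensor argument used in the kernel.

```cpp
#include <optional>
#include <vector>

// Hypothetical stand-in for a dense float tensor; not a Paddle type.
struct FakeTensor {
  std::vector<float> values;
  const float* data() const { return values.data(); }
};

// Returns the mask buffer to hand to the attention call as bias,
// or nullptr when the caller supplied no mask at all.
inline const float* BiasFromMask(const std::optional<FakeTensor>& attn_mask) {
  const float* bias_data = nullptr;
  if (attn_mask.has_value()) {
    bias_data = attn_mask->data();
  }
  return bias_data;
}
```

A null pointer then means "no bias", so the same call site covers both the masked and the unmasked case.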

runzhech · 2024-04-22T08:37:59Z

paddle/phi/kernels/xpu/flash_attn_kernel.cc

@@ -246,6 +243,10 @@ void FlashAttnKernel(const Context& ctx,
  XPUType* out_data = reinterpret_cast<XPUType*>(out->data<T>());
  float* softmax_lse_data = softmax_lse->data<float>();

+  const float* bias_data = nullptr;
+  if (attn_mask.get_ptr() != nullptr) {
+    bias_data = attn_mask->data<float>();


runzhech · 2024-04-22T08:38:33Z

paddle/phi/kernels/xpu/flash_attn_grad_kernel.cc

@@ -39,9 +39,6 @@ void FlashAttnGradKernel(const Context& ctx,
                         DenseTensor* dk,
                         DenseTensor* dv) {
 #ifdef PADDLE_WITH_XPU_XHPC


fa now supports is_causal=false.
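For readers unfamiliar with the flag, the following unfused reference sketch shows what is_causal changes and how an additive bias enters the scores. It is not the XPU kernel: `AttentionProbs` is a hypothetical helper for a single head, and it assumes seq_q == seq_k with the causal mask aligned to the top-left corner.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Reference softmax(QK^T * scale + bias) for one head (hypothetical helper).
// q is seq_q x head_dim and k is seq_k x head_dim, both row-major.
// bias may be nullptr; otherwise it is seq_q x seq_k, added to the scaled scores.
inline std::vector<float> AttentionProbs(const std::vector<float>& q,
                                         const std::vector<float>& k,
                                         int seq_q, int seq_k, int head_dim,
                                         const float* bias, bool is_causal) {
  const float softmax_scale = 1.0f / std::sqrt(static_cast<float>(head_dim));
  const float neg_inf = -std::numeric_limits<float>::infinity();
  std::vector<float> probs(static_cast<std::size_t>(seq_q) * seq_k, 0.0f);
  for (int i = 0; i < seq_q; ++i) {
    float row_max = neg_inf;
    for (int j = 0; j < seq_k; ++j) {
      float s = neg_inf;  // positions hidden by the causal mask stay at -inf
      if (!is_causal || j <= i) {
        s = 0.0f;
        for (int d = 0; d < head_dim; ++d) {
          s += q[i * head_dim + d] * k[j * head_dim + d];
        }
        s *= softmax_scale;
        if (bias != nullptr) s += bias[i * seq_k + j];
      }
      probs[i * seq_k + j] = s;
      row_max = std::max(row_max, s);
    }
    // Numerically stable softmax over the row.
    float denom = 0.0f;
    for (int j = 0; j < seq_k; ++j) {
      float e = std::exp(probs[i * seq_k + j] - row_max);
      probs[i * seq_k + j] = e;
      denom += e;
    }
    for (int j = 0; j < seq_k; ++j) probs[i * seq_k + j] /= denom;
  }
  return probs;
}
```

With is_causal=false every query attends to every key, and a bias holding large negative values at selected positions can still suppress them, which is why passing attn_mask as bias is enough for custom masks.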

ZibinGuo

LGTM

XiaociZhang

LGTM

XiaociZhang · 2024-04-22T10:33:11Z

paddle/phi/kernels/xpu/flash_attn_grad_kernel.cc

@@ -121,7 +121,11 @@ void FlashAttnGradKernel(const Context& ctx,
      num_heads_k,                  // head_num_k
      head_size,                    // head_dim
      1.0f / std::sqrt(head_size),  // softmax_scale
-      dropout                       // p_dropout
+      dropout,                      // p_dropout
+      0x45678901,                   // seed


Could we provide a way to control the seed?

Added this, following the GPU implementation.
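One plausible shape for "seed from input params" is sketched below, purely as an illustration. It assumes the caller may supply an explicit (seed, offset) pair and that the kernel otherwise asks a generator for one. `FakeGenerator`, `ChooseSeedOffset`, and the `fixed_seed_offset` parameter name are hypothetical here; only the method name `IncrementOffset` is taken from the follow-up PR title, and its behavior in this sketch is an assumption.

```cpp
#include <array>
#include <cstdint>
#include <optional>
#include <utility>

// Hypothetical generator with a fixed seed and a running offset; not Paddle's class.
struct FakeGenerator {
  uint64_t seed;
  uint64_t offset;
  // Returns (seed, offset before the increment), then advances the offset.
  std::pair<uint64_t, uint64_t> IncrementOffset(uint64_t inc) {
    uint64_t current = offset;
    offset += inc;
    return {seed, current};
  }
};

// Picks the (seed, offset) for dropout: an explicit pair from the caller wins,
// otherwise the generator supplies one and moves its offset forward.
inline std::pair<uint64_t, uint64_t> ChooseSeedOffset(
    const std::optional<std::array<int64_t, 2>>& fixed_seed_offset,
    FakeGenerator* gen, uint64_t inc) {
  if (fixed_seed_offset.has_value()) {
    return {static_cast<uint64_t>((*fixed_seed_offset)[0]),
            static_cast<uint64_t>((*fixed_seed_offset)[1])};
  }
  return gen->IncrementOffset(inc);
}
```

Compared with a hard-coded constant such as 0x45678901, this lets tests pin the seed while ordinary runs draw a fresh offset on each call.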

houj04 · 2024-04-23T02:36:34Z

paddle/phi/kernels/xpu/flash_attn_kernel.cc

 #include "paddle/phi/backends/xpu/enforce_xpu.h"
+#include "paddle/phi/core/enforce.h"


This isn't needed, is it? paddle/phi/backends/xpu/enforce_xpu.h already includes paddle/phi/core/enforce.h.

houj04

To do in the next version:
1. Remove the unneeded includes.
2. Add the corresponding unit tests.

runzhech force-pushed the bind_xfa branch from 2d384f0 to d548886 Compare April 18, 2024 03:10

runzhech closed this Apr 22, 2024

runzhech reopened this Apr 22, 2024

[XPU] support flash attention when causal=false

d9e054d

runzhech force-pushed the bind_xfa branch from 46f767a to d9e054d Compare April 22, 2024 07:33

runzhech commented Apr 22, 2024

ZibinGuo approved these changes Apr 22, 2024

XiaociZhang reviewed Apr 22, 2024

support generate seed from input params like gpu

4a595d1

houj04 reviewed Apr 23, 2024

houj04 approved these changes Apr 23, 2024

houj04 merged commit 20481c7 into PaddlePaddle:develop Apr 23, 2024

runzhech deleted the bind_xfa branch April 23, 2024 02:48

houj04 mentioned this pull request Apr 23, 2024

[XPU] enable Generator::IncrementOffset #63792

Merged

co63oc pushed a commit to co63oc/Paddle that referenced this pull request Apr 25, 2024

[XPU] support flash attention for xpu when causal is false (PaddlePad…

5c2f09e

…dle#63644) * [XPU] support flash attention when causal=false * support generate seed from input params like gpu

houj04 added the XPU label Sep 4, 2024
