[XPU] fix bugs of depthwise conv test and change default quant type #70859
    {"XPU_PADDLE_FC_INT32_WITH_LL", XPUFCCalcType::FC_INT32_WITH_LL},
};
#ifdef PADDLE_WITH_XPU_XRE5
  auto default_calc_type = XPUFCCalcType::FC_FLOAT;
With TF32, a large batch of XPU unit tests currently fail, so for now the FP32 type defaults to FP32 quantization.
Please add a TODO comment here in the next PR; otherwise it will be easy to forget later.
LGTM
LGTM
@@ -41,29 +42,60 @@ enum XPUFCCalcType {
  FC_FLOAT16,
};

template <typename T>
XPUFCCalcType FCCalcType() {
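The original code determined the quantization type by walking through this long if/else-if/else chain in order. The new version switches to an unordered_map. Could that change behavior in some very unusual cases (for example, when several environment variables are set at the same time)?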
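Using unordered_map does leave the priority among the environment variables ambiguous. How about I replace the unordered_map with a vector in the next PR, so that the priority among environment variables is explicit?
Also, the previous environment-variable selection logic had problems of its own. For example, the BF16 type could end up selecting int32 or similar quantization types, which BF16 does not support at all, so it could hit unimplemented. So even with a fixed priority, the new version may not behave exactly the same as before.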
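Got it. Taking this opportunity to establish a clear priority and decision policy is a good thing; it does not need to match the previous behavior exactly.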
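The ordering concern in this thread can be illustrated with a small sketch. It is written in Python for brevity (the code under review is C++) and is not the PR's implementation. Only the two environment variable names come from this page; the order of the list, the `SUPPORTED` table, and the `pick_calc_type` helper are assumptions made for illustration.

```python
import os

# Ordered (env var, calc type) pairs: earlier entries win when several are set.
# The two names appear in this PR; the order shown here is an assumption.
PRIORITY = [
    ("XPU_PADDLE_FC_FLOAT", "FC_FLOAT"),
    ("XPU_PADDLE_FC_INT32_WITH_LL", "FC_INT32_WITH_LL"),
]

# Hypothetical support table, for illustration only.
SUPPORTED = {
    "float32": {"FC_FLOAT", "FC_INT32_WITH_LL"},
    "bfloat16": {"FC_FLOAT"},
}


def pick_calc_type(dtype, default, env=None):
    """Return the first requested calc type that dtype supports, else default."""
    env = os.environ if env is None else env
    for name, calc_type in PRIORITY:
        # Skip variables that are unset or that request an unsupported type.
        if env.get(name) is not None and calc_type in SUPPORTED.get(dtype, set()):
            return calc_type
    return default
```

Because the list is scanned in order, setting several variables at once has a deterministic result, and filtering by dtype avoids selecting a quantization type that the dtype does not implement.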
Sorry to inform you that 2fdd753's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.
2fdd753 to ff36639
ff36639 to 9598069
@@ -362,6 +363,20 @@ def wrapper(cls):
    return wrapper


@contextlib.contextmanager
def xpu_matmul_quant_type_guard(dtype):
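I don't quite understand the purpose of this. Why does it only deal with the XPU_PADDLE_FC_FLOAT environment variable? And regarding the other unit test changes: when should this decorator be applied, and when not?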
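This is mainly for unit tests like the FA ones, where FC is only used to compute the baseline and has nothing to do with the precision of the operator under test. Wrapping the baseline computation in this guard makes it always use the highest-precision quantization mode, so the test thresholds do not need to change.
Currently only FP32 quantization is handled; later, depending on need, this guard could also specify other quantization types.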
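As a rough illustration of the guard idea described above, here is a minimal sketch of a context manager that sets an environment variable for the duration of a block and then restores the previous state. It is not the body of `xpu_matmul_quant_type_guard`, which this page does not show; the helper name `env_var_guard` and the value `"1"` are assumptions.

```python
import contextlib
import os


@contextlib.contextmanager
def env_var_guard(name, value):
    """Temporarily set an environment variable (hypothetical helper)."""
    previous = os.environ.get(name)
    os.environ[name] = value
    try:
        yield
    finally:
        # Restore the earlier state even if the guarded block raised.
        if previous is None:
            os.environ.pop(name, None)
        else:
            os.environ[name] = previous


# Usage: compute only the baseline under the guard.
# The value "1" is an assumption about how the variable is read.
with env_var_guard("XPU_PADDLE_FC_FLOAT", "1"):
    pass  # baseline matmul would run here
```

Scoping the override to the baseline computation keeps the operator under test on the default quantization path.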
…addlePaddle#70859)
* [XPU] fix bugs of depthwise conv test and change default quant type
* fix typo
* change default quant to float for fp32
* fp32 use tf32
* fix some ci bugs
* fix more ci bugs
PR Category
Custom Device
PR Types
Bug fixes
Description