Support float8 data type #64735
Your PR was submitted successfully. Thank you for contributing to this open-source project!
Wanglongzhi2001 seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it.
Could the files here be split up and merged separately?
cudaGetDevice(&dev);
if (dev == 0) {
  std::ofstream outfile;
  outfile.open(config_filename_, std::ios::out | std::ios::trunc);
Why is the file written in the destructor?
  infile.close();
}

std::string config_filename_{"/tmp/paddle_cublaslt_cache"};
This uses a single, globally fixed file name. What happens when multiple instances run at the same time?
std::string config_filename_{"/tmp/paddle_cublaslt_cache"};
std::unordered_map<int64_t, cublasLtMatmulAlgo_t> map_;
int search_times_;
const int requested_algo_count_ = 100;
I suggest passing these parameters in through environment variables.
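One way to act on the environment-variable suggestion is sketched below. This is only an illustration, not the code that was merged: the helper names `EnvInt` and `EnvString` and the variable names `PADDLE_CUBLASLT_CACHE_FILE` and `PADDLE_CUBLASLT_ALGO_COUNT` are hypothetical and do not come from the PR. The defaults mirror the hard-coded values in the snippet under review.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: read an integer setting from the environment.
// Falls back to `fallback` when the variable is unset, empty, or not a
// plain base-10 integer.
inline int EnvInt(const char* name, int fallback) {
  const char* raw = std::getenv(name);
  if (raw == nullptr || *raw == '\0') return fallback;
  char* end = nullptr;
  const long value = std::strtol(raw, &end, 10);
  return (*end == '\0') ? static_cast<int>(value) : fallback;
}

// Hypothetical helper: read a string setting from the environment.
// Falls back to `fallback` when the variable is unset or empty.
inline std::string EnvString(const char* name, const std::string& fallback) {
  const char* raw = std::getenv(name);
  return (raw != nullptr && *raw != '\0') ? std::string(raw) : fallback;
}

// Assumed usage inside the cache class (variable names are made up):
//   config_filename_ = EnvString("PADDLE_CUBLASLT_CACHE_FILE",
//                                "/tmp/paddle_cublaslt_cache");
//   requested_algo_count_ = EnvInt("PADDLE_CUBLASLT_ALGO_COUNT", 100);
```

Making the path configurable also gives multi-instance runs a way to point each process at its own cache file, which touches on the naming concern raised earlier in the thread.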
OK, it has been split.
paddle/fluid/pybind/tensor.cc
Outdated
} else if (dst->place() == phi::CPUPlace() && place == phi::GPUPlace()) {
  cudaMemcpy(
      dst->Holder()->ptr(), src.data(), src.size(), cudaMemcpyDeviceToHost);
}
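- Why can't the other TensorCopyFrom interfaces be used?
- An else branch needs to be implemented. If no further else-if branches are needed, change this one to else and check that place is set correctly.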
This API was used for building networks during development. It turned out not to be needed in the end, so it has been removed.
#if CUDNN_VERSION_MIN(8, 6, 0) && CUDA_VERSION >= 11800
  case phi::DataType::FLOAT8_E4M3FN:
    type = CUDNN_DATA_FP8_E4M3;
    break;
Is the float8_e5m2 type not supported here?
https://docs.nvidia.com/deeplearning/cudnn/latest/developer/graph-api.html#id33
E4M3 is the recommended type for ConvolutionFwd.
Please add E5M2 here as well, so that later features can use it.
def config(self):
    self.dtype = 'float8_e4m3fn'
    self.rtol = 0.6
    self.atol = 7.6
Why are rtol and atol so large here?
They have been reduced.
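For context on the tolerance question, the sketch below shows how coarse float8 rounding is. It assumes the test harness applies the usual allclose-style rule `|actual - expected| <= atol + rtol * |expected|`, which this thread does not confirm, and the function names are hypothetical. With 3 explicit mantissa bits (E4M3), round-to-nearest can introduce a relative error of up to 2^-4 on a normal value; with 2 bits (E5M2) the bound is 2^-3.

```cpp
#include <cmath>

// allclose-style comparison (assumed rule, see the note above).
inline bool IsClose(float actual, float expected, float rtol, float atol) {
  return std::fabs(actual - expected) <= atol + rtol * std::fabs(expected);
}

// Worst-case relative rounding error for a normal value stored with
// `mantissa_bits` explicit mantissa bits under round-to-nearest:
// half a unit in the last place, i.e. 2^-(mantissa_bits + 1).
inline float HalfUlpRelative(int mantissa_bits) {
  return std::ldexp(1.0f, -(mantissa_bits + 1));
}

// HalfUlpRelative(3) is the E4M3 bound, HalfUlpRelative(2) the E5M2 bound.
```

A single rounding step therefore justifies an rtol of several percent, while errors accumulated over a long reduction can be larger.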
paddle/fluid/platform/e4m3.h
Outdated
@@ -0,0 +1,24 @@
/* Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.
Isn't this file name a problem? Shouldn't it carry the float8_ prefix, i.e. float8_e4m3fn.h?
OK.
paddle/fluid/platform/e5m2.h
Outdated
@@ -0,0 +1,24 @@
/* Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.
Rename the file to float8_e5m2.h?
OK.
paddle/fluid/pybind/tensor.cc
Outdated
@@ -67,6 +67,7 @@ limitations under the License. */
#include "paddle/fluid/imperative/amp_auto_cast.h"
#include "paddle/fluid/imperative/layer.h"
#include "paddle/fluid/memory/allocation/allocator_strategy.h"
#include "paddle/fluid/memory/memcpy.h"
Is this header really required? It looks like it can be removed.
Yes, it can be removed.
paddle/fluid/pybind/tensor_py.h
Outdated
static std::string format() {
  // Note: "E4M3" represents float8_e4m3fn.
  return "c";
Should this return "c" or "E4M3FN"?
Got it.
paddle/fluid/pybind/tensor_py.h
Outdated
static std::string format() {
  // Note: "E5M2" represents float8_e5m2.
  return "E4M3";
Should this return "E4M3" or "E5M2"?
OK.
paddle/fluid/pybind/tensor_py.h
Outdated
static std::string format() {
  // Note: "E5M2" represents float8_e5m2.
  return "E4M3";
Why is E5M2's format "E4M3"?
Fixed.
@@ -61,6 +61,8 @@ static DataTypeMap* InitDataTypeMap() {
  _ForEachDataType_(RegType);
  // Register pstring individually
  RegType(pstring, proto::VarType::PSTRING);
  RegType(::paddle::platform::float8_e5m2, proto::VarType::FP8_E5M2);
  RegType(::paddle::platform::float8_e4m3fn, proto::VarType::FP8_E4M3FN);
Sorry, what does "FN" stand for?
It indicates whether NaN, zero, and Inf are supported, the same as in the ONNX format: https://onnx.ai/onnx/technical/float8.html#e4m3fn-and-e5m2
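To make the difference between the two formats concrete, here is a minimal decoding sketch based on the layouts described on the linked ONNX page. The function names `DecodeE4M3FN` and `DecodeE5M2` are hypothetical and are not the conversion routines added by this PR. E4M3FN uses 1 sign bit, 4 exponent bits (bias 7), and 3 mantissa bits, has no infinities, and reserves only the all-ones pattern for NaN. E5M2 uses 1 sign bit, 5 exponent bits (bias 15), and 2 mantissa bits, and keeps IEEE-style Inf and NaN.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Decode one float8_e4m3fn byte to float.
inline float DecodeE4M3FN(std::uint8_t bits) {
  const float sign = (bits & 0x80) ? -1.0f : 1.0f;
  const int exponent = (bits >> 3) & 0x0F;
  const int mantissa = bits & 0x07;
  // Only S.1111.111 is NaN; every other pattern is a finite number.
  if (exponent == 0x0F && mantissa == 0x07) {
    return std::numeric_limits<float>::quiet_NaN();
  }
  // Subnormal: (mantissa / 8) * 2^-6 == mantissa * 2^-9.
  if (exponent == 0) {
    return sign * std::ldexp(static_cast<float>(mantissa), -9);
  }
  return sign * std::ldexp(1.0f + mantissa / 8.0f, exponent - 7);
}

// Decode one float8_e5m2 byte to float.
inline float DecodeE5M2(std::uint8_t bits) {
  const float sign = (bits & 0x80) ? -1.0f : 1.0f;
  const int exponent = (bits >> 2) & 0x1F;
  const int mantissa = bits & 0x03;
  // All-ones exponent follows IEEE 754: Inf when mantissa is 0, else NaN.
  if (exponent == 0x1F) {
    return mantissa == 0 ? sign * std::numeric_limits<float>::infinity()
                         : std::numeric_limits<float>::quiet_NaN();
  }
  // Subnormal: (mantissa / 4) * 2^-14 == mantissa * 2^-16.
  if (exponent == 0) {
    return sign * std::ldexp(static_cast<float>(mantissa), -16);
  }
  return sign * std::ldexp(1.0f + mantissa / 4.0f, exponent - 15);
}
```

Under these layouts the largest finite E4M3FN value is 448 (byte 0x7E) and the largest finite E5M2 value is 57344 (byte 0x7B), which is why E4M3FN trades range for one extra bit of precision.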
LGTM for adding more dtypes to the full/cast kernels, but full_grad/cast_grad have not registered these dtypes.
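LGTM. As a follow-up, please describe the newly supported data types in the API docstrings and in the corresponding Chinese API documentation in the docs repository, so that users are aware of them.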
* Support float8 data type
PR Category
Performance Optimization
PR Types
New features
Description
pcard-71500