【Hackathon 6th No.25】[Typing] Functional alignment and enhancement for paddle.histogram -part #63346
Conversation
Your PR has been submitted successfully. Thank you for contributing to the open-source project!
Regarding the compatibility of the weight parameter, do I need to follow that document and add …
@AndPuQing If the new parameter is appended at the end of the original API with a default value, and that default does not change the existing behavior, there is no compatibility issue and nothing else needs to be modified. So while making this change, you need to design for compatibility and make sure previously written code does not fail at runtime (syntax errors, changed results).
paddle/phi/api/yaml/ops.yaml
Outdated
@@ -1351,8 +1351,9 @@
   backward : heaviside_grad

 - op : histogram
-  args : (Tensor input, int64_t bins = 100, int min = 0, int max = 0)
+  args : (Tensor input, Tensor weight, int64_t bins = 100, int min = 0, int max = 0, bool density = false)
This weight parameter is not placed at the end; this is an incompatible way to modify the API.
python/paddle/tensor/linalg.py
Outdated
@@ -2190,21 +2190,26 @@ def bmm(x, y, name=None):
     return out


-def histogram(input, bins=100, min=0, max=0, name=None):
+def histogram(
The weight parameter should go at the end so the change stays backward compatible.
Suddenly inserting a parameter in the middle of an interface shifts the positions of all following parameters, which is an incompatible change.
For example, existing code may call the API positionally without keywords: paddle.histogram(inputs, 4, 0, 3). Inserting a parameter before the existing ones breaks that call immediately, while appending it at the end does not (see the sketch below).
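For illustration, a backward-compatible extension appends the new arguments after the existing ones so that every old call site keeps its meaning (a sketch only; the parameter names follow the PR, but the defaults shown here are assumptions, not the merged code):

# New arguments appended at the end, with defaults that preserve the old behavior.
def histogram(input, bins=100, min=0, max=0, weight=None, density=False, name=None):
    ...

# An old positional call such as paddle.histogram(inputs, 4, 0, 3) still binds
# bins=4, min=0, max=3 exactly as before.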
Yes, op_version.yaml needs to be modified as well.
Sorry to inform you that 4e8b5ac's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.
python/paddle/tensor/linalg.py
Outdated
    else:
        helper = LayerHelper('histogram', **locals())
        check_variable_and_dtype(
            input, 'X', ['int32', 'int64', 'float32', 'float64'], 'histogram'
        )
        out = helper.create_variable_for_type_inference(VarDesc.VarType.INT64)

        if density or weight:
Should this be if weight: ?
Sorry to inform you that ef946d9's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.
python/paddle/tensor/linalg.py
Outdated
@@ -2205,10 +2207,13 @@ def histogram(input, bins=100, min=0, max=0, name=None):
         bins (int, optional): number of histogram bins. Default: 100.
         min (int, optional): lower end of the range (inclusive). Default: 0.
         max (int, optional): upper end of the range (inclusive). Default: 0.
+        weight (Tensor, optional): If provided, it must have the same shape as input.
Besides documenting the shape of weight, the docstring also needs to describe what weight does.
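For example, the weight entry could describe its role along these lines (a suggested wording only, not the merged text):

#   weight (Tensor, optional): Weight for each value in the input tensor.
#       Must have the same shape as input. If provided, each value in input
#       contributes its associated weight to its bin instead of a count of 1.
#       Default: None.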
float gap = static_cast<float>(nbins) /
            static_cast<float>((output_max - output_min)) / *sum_data;
for (int64_t i = 0; i < nbins; i++) {
  out_data[i] *= gap;
Isn't the formula simply a division by sum_data? The way gap is written above makes the formula hard to read; please clean it up.
Here I divide by sum_data up front, so there is no extra division in every loop iteration.
Here I divide by sum_data up front, so there is no extra division in every loop iteration.
Why is it additionally multiplied by static_cast<float>(nbins) / static_cast<float>((output_max - output_min))? I don't quite follow the formula.
Because a continuous probability density function (PDF) has to satisfy

$\int_{-\infty}^{+\infty} f(x)\,dx = 1$

that is, the area enclosed by the density function and the x-axis is 1, where $f(x)$ is the density value.
This is different from the probability mass function (PMF) defined for discrete random variables:

$\sum_{x} p(x) = 1$

The difference between the two is that the PMF gives the probability of each point directly, while the density function has to be integrated over an interval to yield a probability.
So during the computation, after obtaining each bin's (weighted) count, it still has to be divided by the total sum and by the bin width (output_max - output_min) / nbins, which gives the code:
float* sum_data = sum.data<float>();
// bin width = (output_max - output_min) / nbins
float gap = static_cast<float>((output_max - output_min)) /
            static_cast<float>(nbins);
for (int64_t i = 0; i < nbins; i++) {
  out_data[i] /= (gap * *sum_data);
}
Since division is slower than multiplication, I rearranged it a bit:
float* sum_data = sum.data<float>();
// fold the division by the total sum and by the bin width into one multiplier
float gap = static_cast<float>(nbins) /
            static_cast<float>((output_max - output_min)) / *sum_data;
for (int64_t i = 0; i < nbins; i++) {
  out_data[i] *= gap;
}
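For a quick numerical illustration of this normalization (not part of the PR; NumPy is used here only as a reference, and the names are illustrative):

import numpy as np

x = np.random.rand(1000)
# with density=True each bin value equals count / (total * bin_width),
# so density * bin_width sums to 1
hist, edges = np.histogram(x, bins=10, range=(0.0, 1.0), density=True)
bin_width = (1.0 - 0.0) / 10
assert np.isclose((hist * bin_width).sum(), 1.0)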
Please revise it accordingly.
paddle/phi/infermeta/unary.cc
Outdated
@@ -2072,8 +2072,30 @@ void HashInferMeta(const MetaTensor& x,
   out->set_dtype(x.dtype());
 }

-void HistogramInferMeta(
-    const MetaTensor& input, int64_t bins, int min, int max, MetaTensor* out) {
+void HashInferMeta(const MetaTensor& x,
Is this change from the wrong commit? It looks like content from another commit got mixed in here.
    }
  }
  if (density) {
    DenseTensor sum = phi::Sum<float, Context>(
After summing all the elements of output to get sum and then dividing each element of output by sum, isn't the result already the probability density?
Has this output been checked against PyTorch's result?
LGTM
Let's merge it first; we will verify the results again later via PyTorch code conversion.
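A possible way to run that cross-check later (a sketch; it assumes the final Python signature appends weight=None and density=False, and it has not been run against the merged code):

import numpy as np
import paddle
import torch

x = np.random.rand(1000).astype("float32") * 3.0
w = np.random.rand(1000).astype("float32")

# assumed final signature: histogram(input, bins, min, max, weight, density, name)
p_hist = paddle.histogram(
    paddle.to_tensor(x), bins=10, min=0, max=3,
    weight=paddle.to_tensor(w), density=True,
)
t_hist, _ = torch.histogram(
    torch.from_numpy(x), bins=10, range=(0.0, 3.0),
    weight=torch.from_numpy(w), density=True,
)
np.testing.assert_allclose(p_hist.numpy(), t_hist.numpy(), rtol=1e-5)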
paddle/phi/infermeta/unary.h
Outdated
-void HistogramInferMeta(
-    const MetaTensor& input, int64_t bins, int min, int max, MetaTensor* out);
+void HistogramInferMeta(const MetaTensor& input,
+                        const MetaTensor& weight,
+                        int64_t bins,
+                        int min,
+                        int max,
+                        bool density,
+                        MetaTensor* out);
With the new tensor parameter this becomes a binary-type op, so the declaration needs to move to a different file.
python/paddle/tensor/linalg.py
Outdated
    min: float = 0,
    max: float = 0,
Are these two float or int in the end? Why are the docstring and the type hints inconsistent?
The implementation needs to be moved as well, right?
Please update the corresponding Chinese documentation.
LGTM
PR Category
User Experience
PR Types
New features
Description
Split PR for Hackathon task No. 25.