[Hackathon 7th No.19] Add the load_state_dict_from_url API to Paddle -part #68594
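Your PR was submitted successfully. Thank you for contributing to this open-source project!
@luotao1 All CI checks have passed; only the approval is still pending.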
python/paddle/hapi/hub.py
Outdated
>>> import paddle
>>> paddle.hub.hapi.load_state_dict_from_url('https://paddle-hapi.bj.bcebos.com/models/resnet18.pdparams', "/paddle/test_zty")#下载模型文件并加载
>>> paddle.hapi.hub.load_state_dict_from_url(url='https://127.0.0.1:9100/download/resnet18.zip', model_dir="/paddle/test_zty")#下载ZIP模型文件,解压并加载
- Do not include Chinese text.
- In model_dir="/paddle/test_zty", test_zty could be replaced with a more meaningful directory name.
Use paddle.hub.load_state_dict_from_url instead of paddle.hub.hapi.load_state_dict_from_url or paddle.hapi.hub.load_state_dict_from_url in the documentation, to stay consistent with the RFC.
python/paddle/hapi/hub.py
Outdated
check_hash (bool, optional) – If True, the filename part of the URL should follow the naming convention filename-<sha256>.ext where <sha256> is the first eight or more digits of the SHA256 hash of the contents of the file. The hash is used to ensure unique names and to verify the contents of the file. Default: False
file_name (str, optional) – name for the downloaded file. Filename from url will be used if not set.
map_location (optional) - A function or dictionary that specifies how to remap storage locations.
weights_only (bool, optional) - If True, only the weights will be loaded, not the complex serialized objects. Recommended for untrusted sources
Every parameter needs to be tested. check_hash=True, map_location, and weights_only=True are not covered.
Judging from the unit tests further down, only url, file_name, and model_dir are tested.
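- Testing weights_only and map_location would require changing paddle.load: for these two parameters to be usable, they have to be accepted and handled inside paddle.load (in PyTorch they are implemented in torch.load). Please let me know whether I should modify paddle.load.
- As for check_hash, I took a look and it should be possible to add a test for it.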
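
For reference, the check_hash naming convention described in the docstring (filename-<sha256>.ext, with at least eight hex digits) could be verified roughly as follows. This is a minimal sketch under stated assumptions, not the code in this PR: the helper name file_matches_name_hash and the regular expression are hypothetical, and only the Python standard library is used.

```python
import hashlib
import os
import re

# Hypothetical pattern: "-<hex>" with at least 8 hex digits right before the extension.
HASH_SUFFIX = re.compile(r"-([a-f0-9]{8,})\.[^./]+$")


def file_matches_name_hash(path: str) -> bool:
    """Return True if the file's SHA256 digest starts with the prefix in its name."""
    match = HASH_SUFFIX.search(os.path.basename(path))
    if match is None:
        raise ValueError(f"no hash prefix found in file name: {path}")
    expected_prefix = match.group(1)

    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large weight files are not held in memory at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest().startswith(expected_prefix)
```

A unit test for check_hash=True could then cover both a file whose name carries the correct prefix and one whose name carries a wrong prefix.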
They can be added through the configs parameter of paddle.load.
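> They can be added through the configs parameter of paddle.load.

@luotao1
I went through paddle.load today. The API was developed to align with torch.load, but as the screenshot above shows, the developers at the time chose not to keep the weights_only and map_location parameters. Both parameters have to be implemented entirely inside the load function (and that implementation calls other APIs and functions). The earlier developers also indicated that the configs parameter may be deprecated in the future, so I would like to confirm whether you are sure the parameters should be added there.
- weights_only: if this parameter is to be added and handled, the most important step is adding a _weights_only_unpickler.py file. That file calls several other torch APIs, and I have not yet checked whether Paddle implements all of them. If it does not, new APIs would have to be added as well, so please evaluate whether this parameter should be added. Reference for _weights_only_unpickler.py: https://github.com/pytorch/pytorch/blob/main/torch/_weights_only_unpickler.py
- map_location: the key piece is a new default_restore_location function, which needs to call various deserializer functions. I have not yet investigated whether Paddle provides corresponding functions, so please evaluate whether this parameter should be added as well. Reference for default_restore_location: https://pytorch.org/docs/2.5/_modules/torch/serialization.html#load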
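
To make the weights_only cost discussion more concrete, the general technique behind such a flag is a restricted unpickler that only resolves an allowlist of globals. The sketch below uses only the standard-library pickle module; it is neither PyTorch's _weights_only_unpickler.py nor anything that exists in paddle.load, and the allowlist contents and the names RestrictedUnpickler and restricted_loads are assumptions made for illustration.

```python
import io
import pickle
from collections import OrderedDict

# Assumption for this sketch: only these globals may be resolved while unpickling.
_ALLOWED_GLOBALS = {
    ("collections", "OrderedDict"): OrderedDict,
}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses any global not present in the allowlist."""

    def find_class(self, module, name):
        try:
            return _ALLOWED_GLOBALS[(module, name)]
        except KeyError:
            raise pickle.UnpicklingError(
                f"global '{module}.{name}' is not allowed"
            ) from None


def restricted_loads(data: bytes):
    """Deserialize bytes while rejecting globals outside the allowlist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

A real implementation would also need allowlist entries for the tensor-rebuilding functions used by the framework's own serialization format, which is where most of the porting effort mentioned above would go.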
Suggest evaluating the cost of adding support for weights_only and map_location in paddle.load, so that we can make a better decision on whether to add them.
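> The workload is fairly large and time-consuming,
> so these two parameters should be removed from load_state_dict_from_url, rather than having the developer of the load_state_dict_from_url API extend paddle.load.

If workload is the concern, we can add a 🌟 to the task, since other tasks have also turned out to exceed their estimated workload. What we need now is an evaluation of the technical approach. If you find that the load API was left incomplete when it was originally developed, it needs to be updated.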
OK, I will evaluate it first.
OK, remove the weights_only parameter.
Done
class TestLoadStateDictFromUrl(unittest.TestCase):
    def setUp(self):
        self.model = resnet18(pretrained=False)
        self.weight_path = '/paddle/test_zty/test/resnet18.pdparams'
/paddle/test_zty/test/resnet18.pdparams
test_zty could be replaced with a more meaningful directory name.
Sorry to inform you that e5b22f5's CIs passed more than 7 days ago. To prevent PR conflicts, you need to re-run all CIs manually.
LGTM
LGTM
The corresponding Chinese documentation can be submitted now; you can use PRs that have already been merged as a reference.
python/paddle/hapi/hub.py
Outdated
model_dir=None,
check_hash=False,
file_name=None,
map_location=None,
Add type hints.
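
One possible shape for the type-hinted signature is sketched below. The parameter names and defaults come from the diff and docstring in this PR; the concrete annotations, in particular the one for map_location and the return type, are assumptions for illustration and may differ from what was finally merged. The body is a stub so that the snippet stands on its own.

```python
from typing import Any, Callable, Dict, Optional, Union


def load_state_dict_from_url(
    url: str,
    model_dir: Optional[str] = None,
    check_hash: bool = False,
    file_name: Optional[str] = None,
    # Assumed annotation: the docstring only says "a function or dictionary".
    map_location: Union[Callable[..., Any], Dict[str, str], None] = None,
) -> Any:
    """Signature-only stub; the download and load logic is intentionally omitted."""
    raise NotImplementedError
```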
Done
python/paddle/hapi/hub.py
Outdated
Args:
    url (str) – URL of the object to download
    model_dir (str, optional) – directory in which to save the object
    check_hash (bool, optional) – If True, the filename part of the URL should follow the naming convention filename-<sha256>.ext where <sha256> is the first eight or more digits of the SHA256 hash of the contents of the file. The hash is used to ensure unique names and to verify the contents of the file. Default: False
    file_name (str, optional) – name for the downloaded file. Filename from url will be used if not set.
    map_location (optional) - A function or dictionary that specifies how to remap storage locations.
Returns:
    Object, an instance of an object that can be used in a paddle
Examples:
    .. code-block:: python

        >>> import paddle
        >>> paddle.hub.load_state_dict_from_url(url='https://paddle-hapi.bj.bcebos.com/models/resnet18.pdparams', model_dir="./paddle/test_load_from_url")
        >>> paddle.hub.load_state_dict_from_url(url='https://x2paddle.bj.bcebos.com/resnet18.zip', model_dir="./paddle/test_file_is_zip")
Args:
    url (str) – URL of the object to download
    model_dir (str, optional) – directory in which to save the object
    check_hash (bool, optional) – If True, the filename part of the URL should follow the naming convention filename-<sha256>.ext where <sha256> is the first eight or more digits of the SHA256 hash of the contents of the file. The hash is used to ensure unique names and to verify the contents of the file. Default: False
    file_name (str, optional) – name for the downloaded file. Filename from url will be used if not set.
    map_location (optional) - A function or dictionary that specifies how to remap storage locations.
Returns:
    Object, an instance of an object that can be used in a paddle
Examples:
    .. code-block:: python
        >>> import paddle
        >>> paddle.hub.load_state_dict_from_url(url='https://paddle-hapi.bj.bcebos.com/models/resnet18.pdparams', model_dir="./paddle/test_load_from_url")
        >>> paddle.hub.load_state_dict_from_url(url='https://x2paddle.bj.bcebos.com/resnet18.zip', model_dir="./paddle/test_file_is_zip")

Args:
    url (str): URL of the object to download
    model_dir (str, optional): directory in which to save the object
    check_hash (bool, optional): If True, the filename part of the URL should follow the naming convention filename-<sha256>.ext where <sha256> is the first eight or more digits of the SHA256 hash of the contents of the file. The hash is used to ensure unique names and to verify the contents of the file. Default: False
    file_name (str, optional): name for the downloaded file. Filename from url will be used if not set.
    map_location (optional): A function or dictionary that specifies how to remap storage locations.
Returns:
    Object, an instance of an object that can be used in a paddle
Examples:
    .. code-block:: python
        >>> import paddle
        >>> paddle.hub.load_state_dict_from_url(url='https://paddle-hapi.bj.bcebos.com/models/resnet18.pdparams', model_dir="./paddle/test_load_from_url")
        >>> paddle.hub.load_state_dict_from_url(url='https://x2paddle.bj.bcebos.com/resnet18.zip', model_dir="./paddle/test_file_is_zip")
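
The second example above points at a ZIP archive, which is downloaded, extracted, and then loaded. A minimal sketch of the extraction step, assuming a trusted archive with at least one file member, might look like this. The helper name extract_if_zip is hypothetical and is not taken from the PR; only the standard-library zipfile module is used.

```python
import os
import zipfile


def extract_if_zip(path: str, target_dir: str) -> str:
    """Extract `path` into `target_dir` if it is a ZIP archive.

    Returns the path of the first extracted file, or `path` unchanged
    when the file is not a ZIP archive.
    """
    if not zipfile.is_zipfile(path):
        return path
    with zipfile.ZipFile(path) as archive:
        # Skip directory entries; this sketch expects at least one file member.
        members = [m for m in archive.namelist() if not m.endswith("/")]
        if not members:
            raise ValueError(f"archive contains no files: {path}")
        archive.extractall(target_dir)
    return os.path.join(target_dir, members[0])
```

The returned path could then be handed to the regular loading routine, while files that are not ZIP archives pass through untouched.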
Done
@sunzhongkai588 Could you check whether there are any remaining issues?
LGTM for docs
PR Category
User Experience
PR Types
Improvements
Description
[Hackathon 7th No.19] Add the load_state_dict_from_url API to Paddle v1
RFC: