refine dist.to_static #60175
Your PR was submitted successfully. Thank you for contributing to the open-source project!
LGTM
self.run_dynamic()
self.run_dy2static()
self.run_llama(to_static=0)
self.run_llama(to_static=0)
Are both to_static arguments here set to 0?
fixed, thx
):
"""
Constructs a ``paddle.Tensor`` with distributed attributes from ``data``,
which can scalar, tuple, list, numpy.ndarray, paddle.Tensor.
Creates a distributed tensor with distributed attributes from the input data, which can be a scalar, tuple, list, numpy.ndarray, or paddle.Tensor.
thx
For more details of the usage, please refer to the sample code in
``paddle.distributed.to_static``.
If the ``data`` is already a Tensor, transform it to a Distributed Tensor.
If the input data is a paddle.Tensor, it will be transformed into a distributed tensor.
dtype(str|np.dtype, optional): The desired data type of returned tensor. Can be 'bool' , 'float16' ,
'float32' , 'float64' , 'int8' , 'int16' , 'int32' , 'int64' , 'uint8',
'complex64' , 'complex128'. Default: None, infers dtype from ``data``
except for python float number which gets dtype from ``get_default_type`` .
I don't understand this sentence.
refined, thx
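For readers who find the quoted dtype sentence hard to parse, the rule it appears to describe can be sketched in plain Python. This is not Paddle code: `infer_dtype` and `DEFAULT_FLOAT_DTYPE` are hypothetical names, and the concrete results for Python ints, bools, and complex numbers, as well as the 'float32' default, are assumptions made for illustration.

```python
# Hypothetical sketch of the dtype rule in the docstring; not a Paddle API.
DEFAULT_FLOAT_DTYPE = "float32"  # assumed stand-in for the framework default


def infer_dtype(data, dtype=None, default_float=DEFAULT_FLOAT_DTYPE):
    """Return the dtype name a scalar would get under the described rule."""
    # An explicitly requested dtype always wins.
    if dtype is not None:
        return dtype
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(data, bool):
        return "bool"
    if isinstance(data, int):
        return "int64"  # assumption: Python ints map to int64
    # The special case from the docstring: Python floats do not carry a
    # dtype of their own, so they take the configured default float dtype.
    if isinstance(data, float):
        return default_float
    if isinstance(data, complex):
        return "complex64"  # assumption for illustration
    raise TypeError(f"unsupported scalar type: {type(data).__name__}")
```

Under these assumptions, `infer_dtype(1.5)` follows the default float dtype, while `infer_dtype(1.5, dtype="float64")` returns the explicit request.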
``pipeline`` is used to configure the pipeline parallelism strategy.
def input_fn(inputs, process_mesh) -> list(paddle.Tensor)

In general, the type of `input_fn` return value is paddle.Tensor with DistTensor.
What exactly does "paddle.Tensor with DistTensor" mean? Earlier in the docs the same thing is also described as:
- a paddle.Tensor with distributed attributes
- a distributed paddle.Tensor with given distributed attributes
- a Distributed Tensor.
However, I could not find paddle.DistTensor on the Python side. Is it perhaps referring to paddle.distributed.shard_tensor?
By default we do not shard or convert the output.
Returns:
Layer: A layer that contains parameters/buffers
that are all `paddle.Tensor` with DistTensor
Same as above. What exactly is "paddle.Tensor with DistTensor"?
mode (str|None, optional): Can be 'train' , 'eval' , 'predict' or None.
'train' : Return the serial startup program for training.
'eval' : Return the serial startup program for evaluation.
'predict' : Return the serial startup program for prediction.
What is the difference between eval and predict?
eval outputs the loss at the end, while predict outputs the logits.
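The eval/predict distinction in this reply can be illustrated with a small plain-Python toy. It is only a sketch and does not use any Paddle API: `forward`, `mse_loss`, and `run_step` are hypothetical names, and the linear "model" and mean-squared-error loss are assumptions chosen to keep the example short.

```python
# Toy illustration only; none of these names are Paddle APIs.
def forward(inputs, weight):
    """Stand-in for a model forward pass: one logit per input value."""
    return [weight * value for value in inputs]


def mse_loss(logits, labels):
    """Mean squared error between logits and labels."""
    return sum((p - t) ** 2 for p, t in zip(logits, labels)) / len(labels)


def run_step(mode, inputs, labels=None, weight=2.0):
    """Run one step and return what the given mode is expected to output."""
    logits = forward(inputs, weight)
    if mode == "predict":
        # predict stops at the model output and needs no labels.
        return logits
    if mode == "eval":
        # eval goes one step further and reduces logits and labels to a loss.
        if labels is None:
            raise ValueError("eval mode requires labels")
        return mse_loss(logits, labels)
    raise ValueError(f"unsupported mode: {mode!r}")
```

In this toy, `run_step("predict", [1.0, 2.0])` returns a list of logits, whereas `run_step("eval", [1.0, 2.0], labels=[2.0, 3.0])` returns a single loss value.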
LGTM
… dev/refine_to_static
LGTM
PR types
Others
PR changes
APIs
Description
Refine dist.to_static and support _ShardOptimizer as an input parameter.
Pcard-76459