[Paddle TensorRT] Paddle-TensorRT support fp16 and dynamic graph APIs into static graph support predictor #69597
Conversation
Your PR was submitted successfully. Thank you for contributing to the open source project!
python/paddle/tensorrt/converter.py
Outdated
@@ -93,6 +102,16 @@ def __init__(self, paddle_program, scope):
            # weights = trt.Weights(weight_array)
            param_dict.update({name: weight_array})
        self.param_dict = param_dict
        global force_fp32_ops
Avoid introducing a global object unless it's really necessary; this is a bit odd.
Done, changed it to a global singleton.
python/paddle/tensorrt/export.py
Outdated
@@ -150,6 +151,8 @@ def __init__(
        min_subgraph_size: int | None = 3,
        save_model_dir: str | None = None,
        disable_ops: str | list | None = None,
        tensorrt_precision_mode: str | None = "FP32",
The names could drop the tensorrt_ prefix.
done
python/paddle/tensorrt/export.py
Outdated
@@ -150,6 +151,8 @@ def __init__(
        min_subgraph_size: int | None = 3,
        save_model_dir: str | None = None,
        disable_ops: str | list | None = None,
        tensorrt_precision_mode: str | None = "FP32",
        tensorrt_ops_run_float: set | None = None,
Same as above.
This one keeps the naming in sync with the old IR.
python/paddle/tensorrt/export.py
Outdated
        - "INT32": 32-bit integer precision.
        - "BFP16": 16-bit Brain Floating Point precision. Only supported in TensorRT versions greater than 9.0.
        - "INT64": 64-bit integer precision. Only supported in TensorRT versions greater than 10.0.
    tensorrt_ops_run_float (set, optional):
disable_ops above is typed str | list; it would be better to keep this consistent.
Changed it to the str | list type.
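The str | list | None convention discussed above can be captured by a small normalization helper. This is a minimal sketch; the helper name `normalize_ops` is hypothetical and not part of the PR:

```python
def normalize_ops(ops):
    """Normalize a str | list | None ops argument into a list of op names."""
    if ops is None:
        return []
    if isinstance(ops, str):
        return [ops]
    if isinstance(ops, list):
        return list(ops)
    raise TypeError("ops should be a string, a list of strings, or None.")


print(normalize_ops("pd_op.matmul"))                      # ['pd_op.matmul']
print(normalize_ops(["pd_op.softmax", "pd_op.layer_norm"]))
print(normalize_ops(None))                                # []
```

Centralizing the normalization keeps every str | list parameter (disable_ops, ops_run_float) behaving identically.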
python/paddle/tensorrt/converter.py
Outdated
class PrecisionMode(Enum):
    FP32 = "FP32"
    FP16 = "FP16"
    BF16 = "BF16"
    INT8 = "INT8"

    @staticmethod
    def from_string(mode_str):
        mode_map = {
            "FP32": PrecisionMode.FP32,
            "FP16": PrecisionMode.FP16,
            "BF16": PrecisionMode.BF16,
            "INT8": PrecisionMode.INT8,
        }
        mode_upper = mode_str.upper()
        return mode_map.get(mode_upper, PrecisionMode.FP32)


class TensorRTConfigManager:
    _instance = None

    def __new__(cls, *args, **kwargs):
        if not cls._instance:
            cls._instance = super().__new__(cls, *args, **kwargs)
        return cls._instance

    def _init(self):
        self.force_fp32_ops = []

    def set_force_fp32_ops(self, ops):
        if ops is None:
            self.force_fp32_ops = []
        elif isinstance(ops, str):
            self.force_fp32_ops = [ops]
        elif isinstance(ops, list):
            self.force_fp32_ops = ops
        else:
            raise ValueError("Ops should be a string, list, or None.")

    def get_force_fp32_ops(self):
        return self.force_fp32_ops


# In TensorRT FP16 inference, this function sets the precision of specific
# operators to FP32, ensuring numerical accuracy for these operations.
def support_fp32_mix_precision(op_type, layer):
    trt_manager = TensorRTConfigManager()
    force_fp32_ops = trt_manager.get_force_fp32_ops()
    if op_type in force_fp32_ops:
        layer.reset_precision()
        layer.precision = trt.DataType.FLOAT
Put all of these into util.py; otherwise it's easy to end up with circular imports.
done
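For reference, the singleton pattern the author switched to can be sketched in isolation. This is a simplified, standalone version: state is initialized inside `__new__` (rather than via a separate `_init` that is never invoked), so it is guaranteed to run exactly once:

```python
class TensorRTConfigManager:
    """Process-wide singleton holding the op types forced to run in FP32."""

    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.force_fp32_ops = []  # initialize state exactly once
        return cls._instance

    def set_force_fp32_ops(self, ops):
        # Accept str | list | None, mirroring the PR's normalization logic.
        if ops is None:
            self.force_fp32_ops = []
        elif isinstance(ops, str):
            self.force_fp32_ops = [ops]
        elif isinstance(ops, list):
            self.force_fp32_ops = ops
        else:
            raise ValueError("Ops should be a string, list, or None.")


a = TensorRTConfigManager()
b = TensorRTConfigManager()
a.set_force_fp32_ops("pd_op.softmax")
print(a is b)            # True: every construction returns the same instance
print(b.force_fp32_ops)  # ['pd_op.softmax']
```

Note the sketch drops the `*args, **kwargs` forwarding from the reviewed diff, since `object.__new__` can raise a TypeError on Python 3 when extra arguments are supplied.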
python/paddle/tensorrt/export.py
Outdated
    precision_mode (str, optional):
        Specifies the precision mode for TensorRT optimization. The options are:
        - "FP32": 32-bit floating point precision (default).
        - "FP16": 16-bit floating point precision.
        - "INT8": 8-bit integer precision.
        - "BFP16": 16-bit Brain Floating Point precision. Only supported in TensorRT versions greater than 9.0.
Why pass a string here? PrecisionMode already exists, so users could just pass the enum directly.
done
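The reviewer's point about passing the enum rather than a string can be illustrated with a standalone sketch. The helper `select_precision` is hypothetical, but it shows the key benefit: an invalid value fails loudly at the call site instead of silently falling back to FP32, which is what string parsing with a default did:

```python
from enum import Enum


class PrecisionMode(Enum):
    FP32 = "FP32"
    FP16 = "FP16"
    BF16 = "BF16"
    INT8 = "INT8"


def select_precision(mode):
    # Accepting the enum directly removes the need for from_string parsing;
    # a typo'd string is a TypeError here, not a silent FP32 fallback.
    if not isinstance(mode, PrecisionMode):
        raise TypeError(
            f"precision_mode must be a PrecisionMode, got {type(mode).__name__}"
        )
    return mode


print(select_precision(PrecisionMode.FP16).value)  # FP16
```

Call sites then read `config.precision_mode = PrecisionMode.FP16`, and misuse is caught by type checkers as well as at runtime.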
python/paddle/tensorrt/export.py
Outdated
     program_with_trt = convert_to_trt(main_program, config, scope)
     return program_with_trt, scope


 # Obtain a program with tensorrt_op by directly loading the model.
-def convert_loaded_model(model_dir, config):
+def convert(model_dir, config):
Also add this interface name to __init__.py so users can call it directly as paddle.tensorrt.convert; it would also be best to try paddle.tensorrt.convert in the unit test to verify it works.
Tested it this way in test_export.py; it works.
python/paddle/tensorrt/export.py
Outdated
@@ -150,6 +151,8 @@ def __init__(
        min_subgraph_size: int | None = 3,
        save_model_dir: str | None = None,
        disable_ops: str | list | None = None,
        precision_mode: str | None = "FP32",
Why is this still str?
Fixed.
python/paddle/tensorrt/export.py
Outdated
    disable_ops : (str|list, optional):
        A string representing the names of operations that should not be entering by TensorRT (default is None).
    precision_mode (str, optional):
Same as above; mind the details.
Fixed.
python/paddle/tensorrt/converter.py
Outdated
precision_mode = PrecisionMode.from_string(
    self.trt_config.precision_mode
)
Since PrecisionMode is now passed in directly, is from_string still necessary here?
done
python/paddle/tensorrt/util.py
Outdated
def from_string(mode_str):
    mode_map = {
        "FP32": PrecisionMode.FP32,
        "FP16": PrecisionMode.FP16,
        "BF16": PrecisionMode.BF16,
        "INT8": PrecisionMode.INT8,
    }
    mode_upper = mode_str.upper()
    if mode_upper not in mode_map:
        _logger.warning(
            f"Unsupported precision mode: '{mode_str}'. Defaulting to FP32."
        )
    return mode_map.get(mode_upper, PrecisionMode.FP32)
Is this still necessary?
done
PR Category
Inference
PR Types
New features
Description
card-71500
本pr支持了pir-trt float16精度推理,并支持了将某些op强制转换成float32精度推理,本pr支持了动转静场景下可以使用predictor.run,之前使用executor.run,并修改了接口,动转静场景下使用paddle.tensorrt.export.convert,直接加载json或者pdmodel模型使用paddle.tensorrt.export.convert