[Paddle TensorRT] Support inference using predictor.run #67755
Conversation
Your PR has been submitted successfully. Thank you for contributing to this open-source project!
python/paddle/tensorrt/util.py
Outdated
@@ -110,3 +110,22 @@ def warmup_shape_infer(program, min_shape_feed, max_shape_feed):
        executor.run(
            program, feed=max_shape_feed, fetch_list=[output_var]
        )


def warmup_shape_infer_v2(
Use a single unified warmup_shape_infer; don't create v1/v2 variants.

Changed to a unified warmup_shape_infer.
)


class TestConverterResNet50(unittest.TestCase):
What you're testing here isn't ResNet50, is it? Just rename this file to test_predict_model.

Renamed to test_paddle_to_tensorrt_conversion_cumsum.
python/paddle/tensorrt/predictor.py
Outdated
if self.input_data_type and not self.input_range:
    if 'int' in self.input_data_type:
        self.input_min_data = np.random.randint(
            1, 10, size=self.min_input_shape
        )
        self.input_max_data = np.random.randint(
            1, 10, size=self.max_input_shape
        )
    else:
        low, high = self.input_range
        self.input_min_data = np.random.uniform(
            low, high, size=self.input_min_data
        ).astype(self.input_data_type)
        self.input_max_data = np.random.uniform(
            low, high, size=self.input_max_data
        ).astype(self.input_data_type)

elif self.input_data_type and self.input_range:
    low, high = self.input_range
    self.input_min_data = np.random.uniform(
        low, high, size=self.min_input_shape
    ).astype(self.input_data_type)
    self.input_max_data = np.random.uniform(
        low, high, size=self.max_input_shape
    ).astype(self.input_data_type)
This code is a bit convoluted. You could set defaults for input_data_type and input_range up front; then the random-data generation wouldn't need to keep checking whether input_data_type and input_range exist.

Fixed.
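The reviewer's suggestion can be sketched as follows. This is a hypothetical helper (make_warmup_data is not part of the PR), shown only to illustrate resolving the defaults once before generating data:

```python
import numpy as np

def make_warmup_data(min_shape, max_shape, dtype=None, input_range=None):
    # Resolve defaults up front so the generation code below needs no
    # branching on whether the optional fields were provided.
    dtype = dtype or 'float32'
    low, high = input_range if input_range else (1, 10)
    if 'int' in dtype:
        gen = lambda shape: np.random.randint(low, high, size=shape).astype(dtype)
    else:
        gen = lambda shape: np.random.uniform(low, high, size=shape).astype(dtype)
    return gen(min_shape), gen(max_shape)
```

With the defaults settled first, both the min-shape and max-shape tensors come from one code path.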
python/paddle/tensorrt/predictor.py
Outdated
else:
    self.input_min_data = np.ones(self.min_input_shape).astype(
        'float32'
    )
    self.input_max_data = np.ones(self.max_input_shape).astype(
        'float32'
    )
Use random number generation consistently; this code doesn't seem to serve any purpose.

Fixed.
python/paddle/tensorrt/predictor.py
Outdated
@@ -0,0 +1,275 @@
# Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved
Rename this file to export.py.

Renamed to export.py.
python/paddle/tensorrt/predictor.py
Outdated
input_min_data=None,
input_max_data=None,
input_optim_data=None,
Now that there is an input argument, are these still needed?

Deleted.
# Step1.1: get original results(for tests only)
pred_res = predict_program(
    program,
    {feed_name[0]: input_data_min_shape},
    fetch_var_list=output_var,
)

# # Step2: run warmup for collecting shape
warmup_shape_infer_v2(
    program,
    min_shape_feed={feed_name[0]: input_data_min_shape},
    max_shape_feed={feed_name[0]: input_data_min_shape},
    fetch_var_list=output_var,
)

# # Step3: run pir pass(including some fusion pass and trt_op_marker_pass)
# program_with_pir = run_pir_pass(program, partition_mode=False)

program_with_pir = run_pir_pass(program, partition_mode=True)

output_var = program_with_pir.list_vars()[-2]

trt_output_var = []

for op in program_with_pir.global_block().ops:
    if op.name() == "pd_op.fetch":
        for operand in op.operands():
            source = operand.source()
            trt_output_var.append(source)

# Step5: run TRTConverter(would lower group_op into tensorrt_engine_op)
converter = PaddleToTensorRTConverter(program_with_pir, scope)
converter.convert_program_to_trt()

# #executor.run
# output_converted=predict_program(
#     program_with_pir,
#     {feed_target_names[0]: input_data_min_shape},
#     trt_output_var
# )
Replace this logic with the newly wrapped interfaces.

None of these have been changed yet.

All fixed; the code above was committed while first getting things to run.
test/tensorrt/get_program.py
Outdated
def get_program(model_dir, prefix, use_pir=False):
    """
    Load a PaddlePaddle inference model with optional PIR API support.

    Args:
        model_dir (str): The directory where the model and parameters are stored.
        prefix (str): The prefix of the model files without file extension.
        use_pir (bool, optional): Flag to determine if PIR API should be used. Default is False.

    Returns:
        tuple: A tuple containing the loaded program, scope, feed target names, fetch targets, and model filename.
    """
    scope = paddle.static.global_scope()
    place = paddle.CUDAPlace(0)
    exe = static.Executor(place)

    # Check if we should use PIR API
    if use_pir:
        # Use PIR API context manager if required
        model_filename = os.path.join(model_dir, prefix + ".json")
        params_filename = os.path.join(model_dir, prefix + ".pdiparams")
    else:
        model_filename = os.path.join(model_dir, prefix + ".pdmodel")
        params_filename = os.path.join(model_dir, prefix + ".pdiparams")

    with paddle.pir_utils.IrGuard():
        # Load the model
        [program, feed_target_names, fetch_targets] = (
            paddle.static.io.load_inference_model(
                model_dir,
                executor=exe,
                model_filename=model_filename,
                params_filename=params_filename,
            )
        )
    return program, scope, feed_target_names, fetch_targets
Similar functionality already exists above; can this function be removed?

This function is needed for loading.

Can't you use the earlier get_trt_program?

Oh, right, I forgot.

get_program has been deleted; it was code committed during the bring-up stage.
python/paddle/tensorrt/predictor.py
Outdated
        op.set_bool_attr("__l_trt__", False)


def converter_trt_program(program, trt_config, scope):
Rename it to convert_to_trt.

Renamed to convert_to_trt.
python/paddle/tensorrt/converter.py
Outdated
if not source.initialized():
    _logger.warning(f"Skipping uninitialized source: {source}")
    continue
else:
With the continue in the if above, the else here is unnecessary.

done
python/paddle/tensorrt/export.py
Outdated
def forbid_op_lower_trt(self, program, disabled_ops):
    if isinstance(disabled_ops, str):
        disabled_ops = [disabled_ops]
    for op in program.global_block().ops:
        if op.name() in disabled_ops:
            op.set_bool_attr("__l_trt__", False)
This function isn't needed here, is it? util.py already has similar functionality; can't you just call that from converter_to_trt?

Kept it in util.py.
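The helper's behavior can be demonstrated with stand-in op objects. FakeOp below is a test double, not a Paddle class; it only assumes ops expose name() and set_bool_attr() as the snippet above does:

```python
class FakeOp:
    """Minimal stand-in for a PIR op, for illustration only."""
    def __init__(self, name):
        self._name = name
        self.attrs = {}
    def name(self):
        return self._name
    def set_bool_attr(self, key, value):
        self.attrs[key] = value

def forbid_op_lower_trt(ops, disabled_ops):
    # Accept a single op name or a list of names, then mark matches
    # so they are not lowered into the TensorRT engine.
    if isinstance(disabled_ops, str):
        disabled_ops = [disabled_ops]
    for op in ops:
        if op.name() in disabled_ops:
            op.set_bool_attr("__l_trt__", False)

ops = [FakeOp("pd_op.matmul"), FakeOp("pd_op.relu")]
forbid_op_lower_trt(ops, "pd_op.relu")
```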
python/paddle/tensorrt/export.py
Outdated
# return an optimized program with pd_op.tensorrt_engine operations.
def converter_to_trt(program, trt_config, scope):
Rename it; how about convert_to_trt? converter is a noun, after all.

done
python/paddle/tensorrt/util.py
Outdated
if fetch_var_list is None:
    for _ in range(1):
        output_original = executor.run(
            program, feed=min_shape_feed, fetch_list=[output_var]
        )

    # Run the program with input_data_max_shape (fake max_shape input)
    for _ in range(1):
        executor.run(
            program, feed=max_shape_feed, fetch_list=[output_var]
        )
else:
    for _ in range(1):
        output_original = executor.run(
            program, feed=min_shape_feed, fetch_list=fetch_var_list
        )

    # Run the program with input_data_max_shape (fake max_shape input)
    for _ in range(1):
        executor.run(
            program, feed=max_shape_feed, fetch_list=fetch_var_list
        )
This code can be simplified, right? Just put output_var into fetch_var_list; there's no need for so many executor.run calls.

done
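A minimal sketch of the simplification the reviewer suggests, with FakeExecutor standing in for paddle.static.Executor so the control flow is self-contained (both class and function here are illustrative, not the PR's actual code):

```python
class FakeExecutor:
    """Records run calls; stands in for paddle.static.Executor."""
    def __init__(self):
        self.calls = []
    def run(self, program, feed, fetch_list):
        self.calls.append((feed, tuple(fetch_list)))
        return [f"out:{name}" for name in fetch_list]

def warmup_shape_infer(executor, program, min_shape_feed, max_shape_feed,
                       output_var, fetch_var_list=None):
    # Choose the fetch list once; both branches collapse into one path,
    # and each feed needs only a single run call.
    fetch_list = fetch_var_list if fetch_var_list else [output_var]
    output_original = executor.run(program, feed=min_shape_feed,
                                   fetch_list=fetch_list)
    executor.run(program, feed=max_shape_feed, fetch_list=fetch_list)
    return output_original

exe = FakeExecutor()
res = warmup_shape_infer(exe, None, {"x": 1}, {"x": 2}, "y")
```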
model_dir = self.save_path
# Obtain tensorrt_engine_op by passing the model path and trt_config.(converted_program)
program_with_trt = export_loaded_model(model_dir, trt_config)
Please add another unit test for the export interface; the network doesn't need to be complex. Also, rename this unit test to test_export.py.

done
python/paddle/tensorrt/export.py
Outdated
else:
    for param_or_buffer in concrete_program.parameters:
        if param_or_buffer.type == core.VarDesc.VarType.VOCAB:
            scr_tensor = param_or_buffer.value().get_map_tensor()
            tgt_var = scope.var(param_or_buffer.name)
            tgt_var.set_vocab(scr_tensor)
        else:
            # share to scope
            param_or_buffer_tensor = scope.var(
                param_or_buffer.name
            ).get_tensor()
            src_tensor = (
                state_var_dict[param_or_buffer.name]
                .value()
                .get_tensor()
            )
            param_or_buffer_tensor._share_data_with(src_tensor)
        # record var info
        if param_or_buffer.name not in extra_var_info:
            extra_info_dict = {}
            if param_or_buffer.name in state_names_dict:
                extra_info_dict['structured_name'] = (
                    state_names_dict[param_or_buffer.name]
                )
            extra_info_dict['stop_gradient'] = (
                param_or_buffer.stop_gradient
            )
            if isinstance(param_or_buffer, EagerParamBase):
                extra_info_dict['trainable'] = (
                    param_or_buffer.trainable
                )
            extra_var_info[param_or_buffer.name] = extra_info_dict
This code is unreachable, isn't it? Our interface runs only in PIR mode.

Indeed; deleted.
python/paddle/tensorrt/export.py
Outdated
# Obtain a program with tensorrt_op for dynamic-to-static scenarios.
def export(funtion=None, input_spec=None, config=None, **kwargs):
Suggested change:
- def export(funtion=None, input_spec=None, config=None, **kwargs):
+ def export(function=None, input_spec=None, config=None, **kwargs):

Fixed.
python/paddle/tensorrt/export.py
Outdated
    extra_var_info[param_or_buffer.name] = extra_info_dict

with paddle.pir_utils.IrGuard():
    main_program = static_net.forward.main_program
If functions are supported, then this is problematic: you should use the previously constructed concrete_program.main_program rather than static_net.forward.main_program.

Fixed.
python/paddle/tensorrt/export.py
Outdated
combine_vars = {}
combine_program = []
Neither of these appears to be used.

Deleted.
python/paddle/tensorrt/export.py
Outdated
optim_input_shape : tuple, optional
    The shape of the optimal input tensor (default is None).
input_data_type : str, optional
    The data type for the input tensors, such as 'float32' or 'int64' (default is None).
Are only float32 and int64 currently supported? What about int32 and bool? From the code below, the default is float32, so documenting the default as None here is imprecise.

Both int and float types are supported; bool is not. I'll update it.
place = paddle.CUDAPlace(0)
exe = paddle.static.Executor(place)

paddle.static.save_inference_model(
The unified static-graph save interface is now paddle.jit.save; is there any problem with using it?

save_inference_model can save a program directly, whereas paddle.jit.save takes a Layer as its argument.
test/tensorrt/test_convert.py
Outdated
    return out


class TestConvert_loaded_model(unittest.TestCase):
Make the class name follow the convention: use CamelCase.

done
test/tensorrt/test_convert.py
Outdated
@@ -0,0 +1,173 @@
# Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.
How about naming the file test_export after all? The interfaces we test are all defined in export.py, and it also distinguishes this test from the other unit tests.

done
trt_output_var = []

for op in program_with_pir.global_block().ops:
    if op.name() == "pd_op.fetch":
        for operand in op.operands():
            source = operand.source()
            trt_output_var.append(source)
This code has a logic problem; it must be moved after converter.convert_program_to_trt().

done
python/paddle/tensorrt/export.py
Outdated
inputs: list,
min_subgraph_size: int | None = 3,
save_model_dir: str | None = None,
disable_ops: str | None = None,
disable_ops should be a list, shouldn't it?

It can be a str or a list: a single str is converted to a list, and a list is kept as-is.
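The str-or-list behavior described in the reply can be sketched as a small normalization helper (the name normalize_disable_ops is hypothetical, not from the PR):

```python
def normalize_disable_ops(disable_ops):
    # None means nothing is disabled; a bare op name is wrapped in a
    # list; an existing sequence of names is kept as a list.
    if disable_ops is None:
        return []
    if isinstance(disable_ops, str):
        return [disable_ops]
    return list(disable_ops)
```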
python/paddle/tensorrt/export.py
Outdated
disable_ops : str, optional
    A string representing the names of operations that should not be entering by TensorRT (default is None).
Update the docstring as well.

done
PR Category
Inference
PR Types
Others
Description
card-71500
This PR implements:
1. The Paddle TensorRT user-facing API, consisting of four interfaces: paddle.tensorrt.Input, paddle.tensorrt.TensorRTConfig, paddle.tensorrt.export, and paddle.tensorrt.export_loaded_model.
2. Running Paddle-TensorRT under the new IR using the predictor. This fixed the data_layout issue of tensorrt_op, and fixed a bug where paddle.static.save_inference_model failed when calling get_pir_parameters because vars in the program lack an is_parameter attribute.
3. A new local_var_names method on Scope in pybind, which lists the names of the vars in a scope for easier debugging.
4. A fix for the dynamic-to-static interface, where start_program could not be obtained, causing the executor run inside warmup_shape_infer to report that a var could not be found in the scope.
This PR provides four user interfaces:
1. paddle.tensorrt.Input: describes a user input.
Parameters:
min_input_shape: the minimum input shape
max_input_shape: the maximum input shape
optim_input_shape: the optimal input shape
input_data_type: the input data type, such as float32 or int32
input_range: the input value range, such as (0, 1)
2. paddle.tensorrt.TensorRTConfig:
Parameters:
inputs: a list of inputs, to accommodate models with multiple inputs
min_subgraph_size: the minimum number of Paddle ops a subgraph must contain to be lowered to a TensorRT op
save_model_dir: the path for saving the program containing TensorRT ops as JSON
disable_ops: the names of ops forbidden from being lowered to TRT
3. paddle.tensorrt.export: for dynamic-to-static scenarios. This interface converts a dynamic-graph model to a static graph and returns an optimized program containing TensorRT ops.
4. paddle.tensorrt.export_loaded_model: for directly loading a saved model (pdmodel or JSON) and obtaining an optimized program containing TensorRT ops.
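To illustrate the intended call pattern, here is a hedged, self-contained sketch of the two configuration objects as plain dataclasses. These are simplified stand-ins written for illustration only, not the actual paddle.tensorrt implementations:

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class Input:
    # Shape bounds drive TensorRT's dynamic-shape optimization profile.
    min_input_shape: tuple
    max_input_shape: tuple
    optim_input_shape: tuple | None = None
    input_data_type: str = 'float32'       # e.g. 'float32', 'int32'
    input_range: tuple | None = None       # e.g. (0, 1)

@dataclass
class TensorRTConfig:
    inputs: list                           # one Input per model input
    min_subgraph_size: int = 3             # smallest subgraph lowered to TRT
    save_model_dir: str | None = None      # where to save the JSON program
    disable_ops: list = field(default_factory=list)  # ops kept out of TRT

# Example: a single image input with batch size varying from 1 to 8.
cfg = TensorRTConfig(
    inputs=[Input(min_input_shape=(1, 3, 224, 224),
                  max_input_shape=(8, 3, 224, 224))],
)
```

A config like this would then be passed to export (dynamic-to-static) or export_loaded_model (saved pdmodel/JSON), per the descriptions above.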