[Inference] Support TensorRT execution in PIR #64995
Conversation
Your PR was submitted successfully. Thank you for contributing to this open-source project!
#ifdef PADDLE_WITH_TENSORRT
#include "paddle/fluid/framework/new_executor/instruction/instruction_base.h"
#include "paddle/fluid/framework/new_executor/pir_adaptor/pir_adaptor_util.h"
#include "paddle/fluid/inference/tensorrt/engine.h"
I suggest extracting this engine class.
The TensorRTEngine class in engine.h is already standalone; I don't understand what "extracting this class" refers to.
max_input_shape_str));
}

static phi::DataType TRT2FluidDataType(nvinfer1::DataType type) {
TRT2PaddleDataType
Thanks, I'll fix it.
auto op_attributes = op->attributes();
trt_engine_ = static_cast<TensorRTEngine *>(
    op_attributes.at("engine").dyn_cast<pir::PointerAttribute>().data());
max_batch_size_ =
The max_batch_size_ pattern can be dropped.
Thanks, I'll fix it.
@@ -24,7 +29,8 @@ set(standalone_executor_deps
  garbage_collector
  executor_gc_helper
  device_event_base
- framework_proto)
+ framework_proto
Does this have to depend on framework?
Here tensorrt_engine_instruction is compiled together with the other instructions into the standalone_executor target, which does depend on framework_proto; tensorrt_engine_instruction on its own does not.
@@ -24,7 +29,8 @@ set(standalone_executor_deps
  garbage_collector
  executor_gc_helper
  device_event_base
- framework_proto)
+ framework_proto
+ pir_transforms)
There seems to be a circular dependency here: constant_folding_pass in pir_transforms depends on standalone_executor.
I'll revert this for now, but the underlying issue should be noted: the circular dependency already exists, since standalone_executor also uses pd_op_to_kernel_pass. The problem predates the TensorRT adaptation; I added this because it occasionally surfaced during development. If the circular dependency in the code itself isn't resolved, undefined-symbol errors may still show up later.
#ifdef PADDLE_WITH_TENSORRT
} else if (op.dialect()->name() == "trt_op") {
  CREATE_INSTR(TensorRTEngineInstruction);
#endif
Why is this trt_op and not trt_kernel? All the other branches check xxx_kernel; suddenly having an xxx_op here looks odd.
This is intentional. This code runs after lowering to kernels, but for the TensorRT adaptation the IR representation is essentially unchanged before and after lowering, so to avoid duplicated code no separate TensorRT kernel was developed.
void PrintType(pir::Type type,
               std::ostream& os) const override;  // prints type information
void PrintAttribute(pir::Attribute type, std::ostream& os)
    const override;  // prints attribute information

pir::OpPrintFn PrintOperation(
    pir::Operation* op) const override;  // prints operation information
Could the Chinese comments be changed to English?
Thanks, I'll fix it.
const char* TensorRTEngineOp::attributes_name[8] = {"engine",
                                                    "max_batch_size",
                                                    "workspace_size",
                                                    "allow_build_at_runtime",
                                                    "input_names",
                                                    "output_names",
                                                    "origin_output_rank",
                                                    "origin_outputs_dtype"};
We should discuss further which attributes TensorRTEngineOp needs under PIR.
Yes, only the necessary attributes have been added for now; this is an iterative process, and the attribute set will be refined in follow-up code.
    : public pir::Op<TensorRTEngineOp, paddle::dialect::OpYamlInfoInterface> {
 public:
  using Op::Op;
  static const char *name() { return "trt_op.tensorrt_engine_op"; }
I keep feeling that a separate trt_op dialect isn't needed; wouldn't tensorrt_engine_op fit better under pd_op?
The ops under pd_op share common traits, e.g. op info extracted from YAML, Interfaces, and so on. tensorrt_engine_op does not share those traits, so keeping it separate is cleaner.
The trt_op dialect has now been removed to avoid over-engineering; it can be added back later if a need arises.
if (paddle::dialect::IsTensorRTOp(op_item)) {
  HandleForTensorRTOp(ctx,
                      op_item,
                      kernel_key,
                      place,
                      map_op_pair,
                      map_value_pair,
                      new_block);
  continue;
}
I suggest wrapping all TRT-related code under PIR with the PADDLE_WITH_TENSORRT macro.
Thanks, I'll fix it.
  if (shape[d] < min_shape[d]) min_shape[d] = shape[d];
  if (shape[d] > max_shape[d]) max_shape[d] = shape[d];
}
opt_shape[d] = ShapeMaxFreq(counter);
Nice design, haha.
paddle::framework::ShapeMode shape_mode) -> py::list {
  py::list res;
  paddle::framework::CollectShapeManager::Instance()
      .StatisticShapeRangeInfo();
I see that a local object controls whether shapes are ready and hence whether to collect them. Maybe StatisticShapeRangeInfo could also be exposed as a separate method, so that after the first round of shape collection, collection can continue afterwards.
    : public pir::Op<TensorRTEngineOp, paddle::dialect::OpYamlInfoInterface> {
 public:
  using Op::Op;
  static const char *name() { return "pd_op.tensorrt_engine_op"; }
Under the old IR this op was named tensorrt_engine; I suggest keeping that name here, without the _op suffix?
Thanks, I'll fix it.
LGTM
LGTM
LGTM
LGTM
This reverts commit 101bf6e.
* adapt tensorrt
* fix compile bugs
* delete thirdparty
* add unittest
* fix py3 compile
* fix kunlun200
* fix windows inference
* fix windows bug
* polish code
* polish code
* polish code
* support build trt_op in python
* rename construction params
* fix bug
* fix compile bugs
* support collect shape
* support re-collect shape
* rename tensorrt op
* polish code
* add debug attr
* delete member in tensorrt engine instruction
* remove mutable_data
* fix compile
PR Category
Inference
PR Types
New features
Description
Pcard-71500
This PR mainly does the following:
1. Develop TensorRTEngineOp, implementing the core representation and registration mechanisms for TensorRT under PIR
2. Implement the core execution mechanism for TensorRT under PIR (the lower-to-kernel transform and TRTEngineInstruction)
3. Redevelop and optimize the basic third-party TensorRT wrapper components
4. Support building PIR graph structures with TensorRT from the Python side
5. Develop the CollectShape module: store shape information at runtime, compute statistics over shape ranges, and look up the shape range for a given Program Value
6. Develop basic unit tests for TensorRT under PIR
TODO:
Support saving the trt engine