cinn(backends): fuse last dim reduce loop and set bound of reduce thread #61576
Conversation
}
return;
}
Try doing it directly like this:
if (context_->iter_space_info.sp_space.size() < loops.size() - 1) {
loops = sch->GetLoops(block_id);
std::vector<ir::Expr> rb_loops(
loops.begin() + context_->iter_space_info.sp_space.size(),
loops.end());
sch->Fuse(rb_loops);
}
if (context_->iter_space_info.sp_space.size() > 1) {
loops = sch->GetLoops(block_id);
std::vector<ir::Expr> sp_loops(
loops.begin(),
loops.begin() + context_->iter_space_info.sp_space.size());
sch->Fuse(sp_loops);
}
@@ -32,11 +32,14 @@ void TileTactic::Init(ScheduleContext* context) {
}
};
auto GetTreeReduceSize = [&](const ir::Expr& total_rb_extent) -> int64_t {
int64_t nums_thread_per_block = 1024;
Avoid magic numbers. Where does this 1024 come from?
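One possible way to address this comment, as a rough sketch only: give the limit a name and clamp against it. The names `kMaxThreadsPerBlock` and `ClampThreadsPerBlock` are hypothetical and do not appear in the PR. The value 1024 is assumed here to be the per-block thread limit of current CUDA GPUs; real code would more likely query the limit from the target device than hard-code it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical constant: assumed per-block thread limit (1024 on current CUDA GPUs).
constexpr int64_t kMaxThreadsPerBlock = 1024;

// Hypothetical helper: clamps a candidate thread count to the named limit.
inline int64_t ClampThreadsPerBlock(int64_t nums_thread_per_block) {
  assert(nums_thread_per_block > 0);
  return std::min(nums_thread_per_block, kMaxThreadsPerBlock);
}
```

With a named constant, the expression `nums_thread_per_block > 1024 ? 1024 : nums_thread_per_block` reduces to a single `std::min` call, and the origin of the number is documented in one place.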
}
return context_->bucket_info.rb_lower_bound;
return nums_thread_per_block > 1024 ? 1024 : nums_thread_per_block;
Avoid magic numbers.
@@ -67,15 +64,14 @@ void AlignIterSpaceTactic::Apply(ir::IRSchedule* sch,
if (context_->iter_space_info.sp_space.size() < loops.size() - 1) {
loops = sch->GetLoops(block_id);
std::vector<ir::Expr> rb_loops(
loops.begin() + context_->iter_space_info.sp_space.size(),
loops.end());
loops.end() - context_->iter_space_info.rb_space.size(), loops.end());
Add a comment for rb_loops.
Since the comment says these are reduce_loops, use more local variables to make the semantics explicit:
const auto& rb_loops = [&]{
const auto reduce_loops_begin = loops.end() - context_->iter_space_info.rb_space.size();
const auto reduce_loops_end = loops.end();
return std::vector<ir::Expr>{reduce_loops_begin, reduce_loops_end};
}();
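To try the suggested pattern outside of CINN, here is a minimal standalone sketch. It assumes a `std::string` stand-in for `ir::Expr` and a plain `num_reduce` count in place of `context_->iter_space_info.rb_space.size()`; the names `Loop` and `ReduceLoops` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Assumption: loops are modelled by their names instead of ir::Expr.
using Loop = std::string;

// Hypothetical helper: returns the trailing `num_reduce` loops of `loops`.
std::vector<Loop> ReduceLoops(const std::vector<Loop>& loops,
                              std::size_t num_reduce) {
  assert(num_reduce <= loops.size());
  // Immediately invoked lambda: the named iterators document the range,
  // and the result can be bound to a const variable.
  const auto rb_loops = [&] {
    const auto reduce_loops_begin =
        loops.end() - static_cast<std::ptrdiff_t>(num_reduce);
    const auto reduce_loops_end = loops.end();
    // Parentheses select the iterator-range constructor unambiguously.
    return std::vector<Loop>(reduce_loops_begin, reduce_loops_end);
  }();
  return rb_loops;
}
```

For loops `{"i", "j", "k0", "k1"}` and `num_reduce == 2`, the helper returns the last two entries, `k0` and `k1`.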
The same applies below.
@@ -32,11 +33,17 @@ void TileTactic::Init(ScheduleContext* context) {
}
};
auto GetTreeReduceSize = [&](const ir::Expr& total_rb_extent) -> int64_t {
int64_t max_threads_per_sm =
In places like this, the usual convention is to declare the variable as const int64_t.
std::vector<ir::Expr> sp_loops(
loops.begin(),
loops.begin() + context_->iter_space_info.sp_space.size());
loops.end() - context_->iter_space_info.rb_space.size());
sch->Fuse(sp_loops);
Similar to the above. We can use lambda initialization to make the semantics explicit:
sch->Fuse([&]{
const auto spatial_loops_begin = loops.begin();
const auto spatial_loops_end = loops.end() - context_->iter_space_info.rb_space.size();
return std::vector<ir::Expr>{spatial_loops_begin, spatial_loops_end};
}());
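With this change, both the spatial and the reduce range are defined relative to the same point, `loops.end() - rb_space.size()`. The sketch below illustrates that relationship with a single named split point. It is an alternative framing, not the reviewer's lambda, and it assumes a `std::string` stand-in for `ir::Expr`; `Loop` and `SplitLoops` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Assumption: loops are modelled by their names instead of ir::Expr.
using Loop = std::string;

// Hypothetical helper: splits `loops` into (spatial, reduce) at
// loops.end() - num_reduce, so the two ranges never overlap or leave a gap.
std::pair<std::vector<Loop>, std::vector<Loop>> SplitLoops(
    const std::vector<Loop>& loops, std::size_t num_reduce) {
  assert(num_reduce <= loops.size());
  const auto split_point =
      loops.end() - static_cast<std::ptrdiff_t>(num_reduce);
  return {std::vector<Loop>(loops.begin(), split_point),
          std::vector<Loop>(split_point, loops.end())};
}
```

Because both halves share `split_point`, the spatial and reduce loops always add up to the full loop list, which does not automatically hold when one range is computed from `sp_space.size()` and the other from `rb_space.size()`.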
PR types
Others
PR changes
Others
Description
Pcard-72423