Implement ctc prefix beam search decode for TextRecognitionModel. #20524
Thank you for the contribution!
Please take a look at the comments below.
modules/dnn/src/model.cpp
Outdated
beam = std::move(newBeam);
}

CV_Assert(beam.size() > 0);
Consider using the empty() call instead of beam.size() > 0: !beam.empty()
Fixed
CV_Assert(beam.size() > 0);
for (int token : beam[0].first)
{
    decodeSeq += vocabulary.at(token - 1);
It makes sense to add a check to avoid out-of-range array access:
CV_Check(token, token > 0 && token <= vocabulary.size(), "")
Fixed
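The range check requested above can be sketched in plain C++ without OpenCV. This is only an illustration, not the code from the pull request: `tokensToString` is a hypothetical helper, and it assumes (as the `token - 1` indexing in the diff suggests) that token id 0 is the CTC blank and ids 1..N map to vocabulary entries.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not OpenCV API): maps 1-based token ids to
// vocabulary entries; id 0 is assumed to be the CTC blank.
std::string tokensToString(const std::vector<int>& tokens,
                           const std::vector<std::string>& vocabulary)
{
    std::string decoded;
    for (int token : tokens)
    {
        // Reject ids outside [1, vocabulary.size()] before indexing.
        if (token <= 0 || static_cast<std::size_t>(token) > vocabulary.size())
            throw std::out_of_range("token id outside vocabulary");
        decoded += vocabulary[static_cast<std::size_t>(token) - 1];
    }
    return decoded;
}
```

Casting `token` to an unsigned type only after checking that it is positive avoids a signed/unsigned comparison against `vocabulary.size()`.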
* only take top @p vocPrune tokens in each search step, @p vocPrune <= 0 stands for disable this prune.
*/
CV_WRAP
TextRecognitionModel& setDecodeOpts(int beam, int vocPrune = 0);
Perhaps it makes sense to name this setDecodeOptsCTCPrefixBeamSearch to avoid confusion in the future.
Fixed
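For readers unfamiliar with the vocPrune option described in the doc comment above, the pruning step might look roughly like the sketch below. `topTokens` is a hypothetical helper written for illustration in plain C++; it is not taken from the pull request, and only the "vocPrune <= 0 disables pruning" behaviour comes from the quoted documentation.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not OpenCV API): returns the ids of the vocPrune most
// probable tokens at one time step, or all ids when vocPrune <= 0.
std::vector<int> topTokens(const std::vector<float>& probs, int vocPrune)
{
    std::vector<int> ids(probs.size());
    std::iota(ids.begin(), ids.end(), 0);
    if (vocPrune <= 0 || static_cast<std::size_t>(vocPrune) >= ids.size())
        return ids;  // pruning disabled, or nothing to prune
    // Move the vocPrune most probable ids to the front, most probable first.
    std::partial_sort(ids.begin(), ids.begin() + vocPrune, ids.end(),
                      [&probs](int a, int b) { return probs[a] > probs[b]; });
    ids.resize(static_cast<std::size_t>(vocPrune));
    return ids;
}
```

Using `std::partial_sort` instead of a full sort keeps the cost per time step close to linear in the vocabulary size when vocPrune is small.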
* {
* 'CTC-greedy': greedy decoding for the output of CTC-based methods
* 'CTC-prefix-beam-search': Prefix beam search decoding for the output of CTC-based methods
* }
The documentation doesn't render well.
It makes sense to move the possible values before the @param statement.
Consider using the "list" mode (items marked with -) without {}.
Fixed. The format follows that of cv::dnn::readNet().
5668dbc to 269e1de
It seems to me that there is some problem in the CI system.
Well done 👍
("OpenCV CN" builders are optional. They may fail due to periodic network issues)
Thank you for taking the time to review this code ❤️
269e1de to c67bfb9
The algorithm is based on Hannun's paper: First-Pass Large Vocabulary Continuous Speech Recognition using Bi-Directional Recurrent DNNs
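As a rough illustration of the technique named above, here is a minimal CTC prefix beam search in plain C++. It is a sketch under stated assumptions, not the implementation from this pull request: `ctcPrefixBeamSearch` is a hypothetical function, token 0 is assumed to be the blank, probabilities are kept in linear space rather than log space, and no language model or vocabulary pruning is applied.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

using Prefix = std::vector<int>;
// first: probability of the prefix ending in blank; second: ending in non-blank.
using Score = std::pair<double, double>;
using Entry = std::pair<Prefix, Score>;

// Hypothetical helper (not OpenCV API). probs[t][c] is the softmax output for
// token c at time step t; token 0 is assumed to be the CTC blank.
Prefix ctcPrefixBeamSearch(const std::vector<std::vector<double>>& probs,
                           std::size_t beamSize)
{
    assert(beamSize > 0);
    std::vector<Entry> beam;
    beam.emplace_back(Prefix(), Score(1.0, 0.0));  // start from the empty prefix
    for (const std::vector<double>& step : probs)
    {
        std::map<Prefix, Score> next;
        for (const Entry& entry : beam)
        {
            const Prefix& prefix = entry.first;
            const double pBlank = entry.second.first;
            const double pNonBlank = entry.second.second;
            for (std::size_t c = 0; c < step.size(); ++c)
            {
                const double p = step[c];
                if (c == 0)
                {
                    // Blank keeps the prefix unchanged; it now ends in blank.
                    next[prefix].first += (pBlank + pNonBlank) * p;
                    continue;
                }
                Prefix extended = prefix;
                extended.push_back(static_cast<int>(c));
                if (!prefix.empty() && prefix.back() == static_cast<int>(c))
                {
                    // A repeated token extends the prefix only after a blank;
                    // otherwise it merges into the existing last token.
                    next[extended].second += pBlank * p;
                    next[prefix].second += pNonBlank * p;
                }
                else
                {
                    next[extended].second += (pBlank + pNonBlank) * p;
                }
            }
        }
        // Keep the beamSize most probable prefixes for the next step.
        beam.assign(next.begin(), next.end());
        std::sort(beam.begin(), beam.end(), [](const Entry& a, const Entry& b) {
            return a.second.first + a.second.second >
                   b.second.first + b.second.second;
        });
        if (beam.size() > beamSize)
            beam.resize(beamSize);
    }
    assert(!beam.empty());
    return beam.front().first;
}
```

With two time steps that each assign {blank: 0.4, token 1: 0.35, token 2: 0.25}, a beam of width 3 returns the prefix {1}, because the paths "1 blank", "blank 1" and "1 1" all collapse to it and together outweigh the all-blank path. A beam of width 1 keeps only the empty prefix after the first step and therefore returns it, which matches what greedy decoding would produce here.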
c67bfb9 to 955cf35
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.