CARVIEW |
Navigation Menu
-
-
Notifications
You must be signed in to change notification settings - Fork 26.1k
[WIP] Add helpers for BLAS and OpenMP libs #13297
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments.
It would be nice to add a test that check openmp actually take this into account.
You can use something like proposed in tommoral/loky#135 with opemmp.omp_get_num_threads
.
requires #13294 for the CI to pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is a first batch of comments:
The coverage report seems to indicate that we don't get the clib tests run on windows: Checking the azure pipelines output seems to indicate that the codecov upload step was skipped: https://dev.azure.com/scikit-learn/scikit-learn/_build/results?buildId=501 Let me try to enable it in: #13332. |
Yes I added coverage to appveyor but we dropped appveyor :/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
The failure on circle ci is caused by an old version of scikit-image and is unrelated to this PR.
|
Once https://github.com/tomMoral/loky/pull/135 is merged can we use most of this PR from loky then, or do we still need some cython files from this PR? |
We'll still need a helper to get the number of threads for openmp (similar to effective_n_jobs) since it's been decided to not include it in loky, and a test for that base on a prange. |
Besides, we'll need to vendor it in sklearn, because even if it's released, we use loky from joblib so we won't have it if we want to support several versions of joblib. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments
Thanks a lot @jeremiedbb for your PRs on openmp support, that's making my life easier :)
@@ -83,6 +83,11 @@ def configuration(parent_package='', top_path=None): | |||
include_dirs=[numpy.get_include()], | |||
libraries=libraries) | |||
|
|||
config.add_extension("_threaded_libs_helpers", | |||
sources=["_threaded_libs_helpers.pyx"], | |||
include_dirs=[numpy.get_include()], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: you shouhldn't need to include numpy lib right?
# Check that an OpenMP library is loaded | ||
limits = get_thread_limits() | ||
|
||
assert not all([lib is None for lib in [limits['openmp_llvm'], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use any(lib for lib...)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would create a generator.
In [1]: any(True for i in range(2))
Out[1]: <generator object <genexpr> at 0x7fb5eff82308>
In [2]: any([True for i in range(2)])
Out[2]: True
|
||
if clib == "openblas" and SKIP_OPENBLAS: | ||
raise SkipTest("Possible bug in getting maximum number of threads with" | ||
" OpenBLAS < 0.2.16 and OpenBLAS does not expose it's " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
its (without the ')
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the typo
return n_threads | ||
|
||
|
||
def check_num_threads(int n=100): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is only useful for the tests?
If that's the case then it should maybe be private, and that should be explained in the docstring so that developers don't think they might use it.
Also if I understand correctly, I would suggest the following docstring (since it's not obvious what default number of threads means here):
Run a short parallel section, and return the number of threads that where effectively used by openmp.
|
||
If None, does not do anything. | ||
|
||
subset : string or None, optional (default="all") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this redundant with the limits
parameter as a dict?
It's not clear to me why one would prefer one over the other.
Also the accepted keys of the dict for limits
should be documented
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not redundant as the dict
for limit permits to select specific implementation (for instance you can have two different openblas for numpy
and scipy
) while the subset
(rename user_api
in the loky version) permit to set globally, for all library implementing the BLAS or openmp protocol some limits.
------- | ||
version : string or None | ||
None means OpenBLAS is not loaded or version < 0.3.4, since OpenBLAS | ||
did not expose it's verion before that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
its
version
Thanks for the review @NicolasHug but most of the code here has changed. It's being developed in loky (https://github.com/tomMoral/loky/pull/135) and the plan is to vendor it in sklearn once it's done. Not sure if your comments apply to the current version of the code if you to take a look, go ahead :) |
OK, then maybe indicate this in the header to avoid unnecessary reviews |
yep, sorry about that |
Thanks for the review @NicolasHug , I took your comments into accounts in the |
For the record, the helpers to protect against over subscription between Cython prange and 3rd party threadpools is being better tested and improved in this repo: https://github.com/joblib/threadpoolctl/ Once ready, the plan is to contribute it to numpy and vendor a backport in scikit-learn in the mean time. |
If we are going to vendor threadpoolctl soon, do we still need this PR then? |
indeed |
Currently being developed in loky. Will be move here once done.
This PR adds a new submodule
_clibs
which provides helpers for the OpenMP and BLAS libs. It's main purpose is to avoid oversubscription in nested parallelism, e.g. BLAS calls inside an OpenMP prange.It's adapted from code of @tomMoral from Loky.
It also defines an
effective_n_jobs_openmp
which differs from joblib'seffective_n_jobs
whenn_jobs is None
in which case it's set toomp_get_max_threads
, as discussed irl with @ogrisel