Releases: cupy/cupy
v13.5.1
f450813
This is the release note of v13.5.1. This is a hot-fix release to address an issue related to the buffer protocol support for UMP added in v13.5.0 (#9223). See here for the complete list of solved issues and merged PRs.
Join the Matrix chat to talk with developers and users and ask quick questions!
Help us sustain the project by sponsoring CuPy!
Changes
Bug Fixes
- Fix buffer protocol to raise TypeError when it is not meant to be supported (#9222)
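Code that relies on the Python buffer protocol may want to handle this TypeError explicitly. The following is a minimal sketch, not CuPy's own code: it assumes the object either supports the buffer protocol or offers a get() method returning a host copy (as cupy.ndarray.get() does). The names to_host_buffer and FakeDeviceArray are hypothetical and exist only for illustration.

```python
def to_host_buffer(obj):
    """Return a memoryview over host memory for obj.

    Tries the buffer protocol first. If that raises TypeError, falls back
    to an explicit host copy via obj.get() (assumed to exist).
    """
    try:
        return memoryview(obj)
    except TypeError:
        return memoryview(obj.get())


class FakeDeviceArray:
    """Hypothetical stand-in for a device array without buffer support."""

    def __init__(self, payload: bytes):
        self._payload = payload

    def get(self) -> bytes:
        # A real device array would copy device memory to the host here.
        return self._payload


view = to_host_buffer(FakeDeviceArray(b"abc"))
print(bytes(view))
```

The explicit fallback keeps the device-to-host copy visible in the calling code instead of relying on implicit buffer export.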
Installation
Contributors
The CuPy Team would like to thank all those who contributed to this release!
v13.5.0
e30a0cc
Note
2025-07-11: We have marked this release as "yanked" on PyPI to prevent new installations due to unexpected regressions. The hot-fix release v13.5.1 is available.
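To check whether an environment still carries the yanked release, something like the following sketch could be used. It relies only on the standard library. The helper names and the list of distribution names are assumptions made for illustration; other CuPy distributions (for example ROCm builds) are not covered.

```python
from importlib import metadata

# Taken from the note above: v13.5.0 was yanked, v13.5.1 is the hot-fix.
YANKED_VERSIONS = {"13.5.0"}


def is_yanked(version: str) -> bool:
    """Return True if the given version string is a yanked release."""
    return version.strip() in YANKED_VERSIONS


def installed_cupy_version(dist_names=("cupy", "cupy-cuda12x", "cupy-cuda11x")):
    """Return the version of the first installed distribution, or None.

    The default names are an assumption; adjust them to your platform.
    """
    for name in dist_names:
        try:
            return metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
    return None


version = installed_cupy_version()
if version is None:
    print("CuPy is not installed")
elif is_yanked(version):
    print(f"CuPy {version} is yanked; upgrade to v13.5.1 or later")
else:
    print(f"CuPy {version} is not affected")
```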
This is the release note of v13.5.0. See here for the complete list of solved issues and merged PRs.
Highlights
- CuPy now supports NVIDIA CUDA 12.9 and AMD ROCm 6.4 platforms, and NumPy 2.3.
- Unified Memory Programming support for HMM/ATS-enabled systems (such as NVIDIA Grace Hopper Superchip) has been added. Refer to the documentation for the usage.
- Binary packages on PyPI (wheels) can now load NCCL packages installed via Pip (e.g., nvidia-nccl-cu12). In addition, Arm (aarch64) wheels are now built with NCCL support enabled.
Request for Comments
We are going to finalize the following RFC issues.
- Drop support for cuDNN in CuPy v14 (#8215)
- Update set of supported ROCm versions in CuPy v13/v14 (#8607)
- Remove cupyx.tools.install_library in CuPy v14 (#9204)
Changes
New Features
- Support system allocated memory (#9033)
Enhancements
- Fix rocThrust build for ROCm 6.3 (#9023)
- Allow discovering cuTENSOR using major version (#9037)
- Support FIPS enabled machines with MD5 hashing (#9055)
- Update cutensornet accelerator based on cuquantum-python 25.03 deprecation (#9058)
- Refactor hashing (#9059)
- Raise user warning in both {to,from}Dlpack & Update the Interoperability page (#9061)
- Allow build on ROCm 6.4 (#9100)
- Migrate to pyproject.toml (#9135)
- Support NCCL for aarch64 (#9141)
- Support loading NCCL from Pip packages (#9208)
- Support CUDA 12.9 and NCCL 2.26 (#9211)
- Fix cupyx.scipy.stats.zscore for SciPy 1.15 (#9024)
Performance Improvements
- Implement lazy load for cuquantum (#9104)
Bug Fixes
- JIT: Support empty return (#9001)
- API: Revert toDlpack() default to the old unversioned one (#9007)
- BUG: Hot fix for numpy 2 support in some fusion paths (#9012)
- Fix compilation error of cupy.inf in fusion2 (#9043)
- Support Cython 3.1 (#9132)
- Fix cupyx.scipy.linalg.expm (#9144)
Code Fixes
- Fix get_typename to emit thrust::complex (#9054)
Documentation
- Add an AI policy to prohibit misuse of the issue tracker (#9095)
- Update ROCm docs (#9108)
- Docs: Update build-time requirement of Cython (#9145)
- Fix WARNING: Inline emphasis start-string without end-string (#9168)
- Improve API reference list (#9189)
- Bump supported NumPy version to v2.3 (#9203)
Installation
- Limit Cython version to 3.0 or 3.1 (#9146)
- Bump NumPy version restriction (#9166)
- Make rebuild faster for development (#9196)
- Bump version to v13.5.0 (#9212)
Tests
- CI: Do not run full CI on CUDA 12.0/12.1/12.2 + Windows (#9000)
- CI: Pin setuptools version on Windows (#9039)
- Revert "CI: Pin setuptools version on Windows" (#9056)
- Mark xfails in some spline tests for SciPy 1.15 (#9060)
- Support SciPy 1.15 (#9063)
- Skip some dtype checks with NumPy 2.x (#9064)
- Skip tests for different behavior of integer overflow from NumPy 2 (#9072)
- Skip some cupyx.scipy.special tests for SciPy 1.15 (#9073)
- Skip some tests for numerical error from NumPy 2 (#9075)
- Do gc.collect() in MemoryHook test code to avoid free hook to happen (#9093)
- np.sum has numerical change in NumPy 2.3 (#9169)
- Fix cp.empty(None) to raise TypeError (#9174)
- CI: NumPy 2.3 (#9194)
- Add NumPy 2.3 + windows CI (#9197)
- Update pre-commit settings (#9202)
- Add cupy.win.cuda129 CI (#9214)
- Fix test trigger phrase for cupy.win.cuda129 CI (#9217)
Others
- Allow specifying no libraries when generating wheel metadata (#9080)
- Upgrade pre-commit hooks (#9156)
Contributors
The CuPy Team would like to thank all those who contributed to this release!
@asi1024 @Azusachan @EarlMilktea @ev-br @jakirkham @kmaehashi @leofang @MattTheCuber @rongou @seberg @yangcal
v14.0.0a1
1e8ade1
This is the release note of v14.0.0a1. See here for the complete list of solved issues and merged PRs.
Highlights
This is the first alpha release of the CuPy v14 series, containing:
- New type promotion rules and behaviors aligned with the NumPy 2 specification.
- 42 new NumPy/SciPy-compatible APIs, including cupy.concat, cupyx.scipy.interpolate.CubicSpline, cupyx.scipy.spatial.Delaunay, cupyx.scipy.ndimage.find_objects, and cupyx.scipy.special.lambertw. See the Comparison Table for the detailed coverage.
Binary packages are available for testing. Try installing now by:
$ pip install cupy-cuda12x --pre -U -f https://pip.cupy.dev/pre
Changes without compatibility
- CuPy v14's behavior will be aligned with NumPy v2.
  - Type promotion rules are now NEP50 compatible. See Changes to NumPy data type promotion.
  - int is now 64-bit (int64) on Windows. See Windows default integer.
  - APIs removed in NumPy v2 (see Changes to namespaces) were marked deprecated in CuPy v14. Although they are kept available in v14 for smooth migration, they are planned to be removed in the next major release (CuPy v15).
  - The behavior of the copy argument has been changed (#8545). See Adapting to changes in the copy keyword.
- Support for Python 3.9, NumPy 1.22 and 1.23, SciPy 1.7, 1.8, and 1.9 has been dropped. (#8491)
- cupy.random.choice may return different results from CuPy v13. (#8483)
- Building CuPy from source code now requires Cython 3.0. (#8457)
- cupyx.scipy.linalg.{tri,tril,triu} APIs were removed from CuPy to follow the latest SciPy specification. Use cupy.{tri,tril,triu} instead. (#8499)
- NumPy fallback mode (cupyx.fallback_mode) has been removed as discussed in #8497. (#8816)
- Legacy DLPack APIs (cupy.toDlpack and cupy.fromDlpack) are now marked deprecated. Use cupy.from_dlpack instead. See the documentation for the usage. (#8831)
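Two of the migrations listed above (triangular helpers and DLPack) can be sketched as follows. This is an illustrative sketch rather than official migration code. The helper get_array_module is hypothetical, and NumPy is used as a stand-in when CuPy or a GPU is not available, on the assumption that tril, triu, and from_dlpack behave alike in both libraries for this simple case.

```python
import numpy


def get_array_module():
    """Return cupy if it is importable and a device is usable, else numpy."""
    try:
        import cupy
        cupy.arange(1)  # forces device initialization; fails without a GPU
        return cupy
    except Exception:
        return numpy


xp = get_array_module()

a = xp.arange(9).reshape(3, 3)

# Previously cupyx.scipy.linalg.tril / triu; use the top-level functions.
lower = xp.tril(a)
upper = xp.triu(a)

# Previously cupy.fromDlpack(a.toDlpack()); from_dlpack takes any object
# that implements __dlpack__ directly, so no intermediate capsule is needed.
b = xp.from_dlpack(a)

print(lower)
print(upper)
print(b.shape)
```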
Changes
New Features
- Add KDTree to cupyx.scipy.spatial (#7671)
- Add neighbors option to RbfInterpolator (#7864)
- ENH: cupyx/signal: add sweep_poly (#7873)
- Add 2D Delaunay triangulation (#7985)
- Add cupyx.signal.pulse_compression from cuSignal's non SciPy-compat API (#8022)
- Add LinearNDInterpolator to cupyx.scipy.interpolate (#8035)
- Add cupyx.signal.convolve1d3o from cuSignal's non SciPy-compat API (#8037)
- Add cupyx.signal.{firfilter,firfilter_zi,firfilter2} (#8052)
- Add cupyx.signal.{pulse_doppler, cfar_alpha} (#8057)
- Add cupyx.signal.{complex_cepstrum,real_cepstrum,inverse_complex_cepstrum,minimum_phase} (#8062)
- Add cupyx.signal.mvdr (#8077)
- ENH: signal: add lanczos and kaiser_bessel_derived windows (#8081)
- Add cupyx.signal.ca_cfar (#8087)
- Add cupyx.signal.convolve1d2o (#8101)
- Add cupyx.signal.freq_shift (#8128)
- Add lambertw function (#8140)
- Add cupyx.signal.channelize_poly (#8141)
- Add cupyx.scipy.interpolate.CubicSpline (#8175)
- Add apply_over_axes API (#8177)
- Add cupy.put_along_axis API (#8199)
- Add CloughTocher2DInterpolator to cupyx.scipy.interpolate (#8208)
- Add NearestNDInterpolator to cupyx.scipy.interpolate (#8220)
- Add NdBSpline to cupyx.scipy.interpolate (#8223)
- ENH: cupyx/scipy/interpolate: add *UnivariateSpline for 1D smoothing splines (#8267)
- Add NdBSpline based interpolation methods to RGI (#8276)
- ENH: cupyx/interpolate: port interp1d from scipy (#8289)
- Add batched solve_triangular (#8329)
- Add Incomplete Elliptic Integrals to special (#8425)
- Support system allocated memory (#8442)
- Add CUDA graph debug function (#8502)
- Add sici and shichi to special for sine and cosine integrals (#8620)
- Update unique_xxx (nep52) (#8665)
- Add cupyx.scipy.ndimage.find_objects (#8916)
Enhancements
- Support for break and continue keywords in CuPy JIT (#8010)
- Make cupyx.signal.radartools private (#8047)
- Remove usages of numpy.float_ and numpy.complex_ (#8050)
- Support cusparseLt 0.6.1 (#8074)
- Add incontiguous support for cutensor functions (#8149)
- Add complex support for the digamma function (#8163)
- Fix expm(complex matrix) (#8206)
- Add CutensorMg support (#8212)
- Add cudaStreamCreateWithPriority (#8219)
- Add the nearest method for percentile/quantile estimation (#8224)
- Various Jitify improvements (#8235)
- Support fallback algorithm for spgemm (#8252)
- Bump to cuTENSOR 2.0.1 (#8282)
- Preload cuTENSORMg (#8283)
- Use weakref.finalize instead of __del__ for RandomState._generator destruction (#8315)
- Support ROCm 6 (#8319)
- cupyx: cleanup use of deprecated NumPy functionality (NumPy 2.0 compatibility) (#8320)
- Add wright_bessel function to special (#8324)
- MAINT: fft, linalg: add __all__ lists (#8333)
- Cuda 12.5 Tests (#8337)
- Add axes support in ndimage filters module (#8339)
- MAINT: interpolate: update RBF to scipy 1.13 (#8343)
- Make CuPy import under NumPy 2.0 (#8346)
- Lazy-preload NCCL (#8360)
- Fix map_coordinates recompilation condition (#8378)
- Disable jitify for cub & Bump CCCL (#8412)
- Use custom less instead of specializing thrust (#8446)
- Port to Cython 3.0 (#8457)
- Avoid using Jitify everywhere inside CuPy (#8467)
- Get rid of pkg_resources (#8480)
- Drop support for Python 3.9, NumPy 1.22 and 1.23, SciPy 1.7, 1.8 and 1.9 (#8491)
- Remove deprecated cupyx.scipy.linalg.{tri,tril,triu} (#8499)
- Use .toarray() instead of .A attribute (#8508)
- Support half option in scipy.signal.minimum_phase (#8510)
- Increase MAX_NDIM to 64 (#8511)
- Support CUDA 12.6 (#8513)
- Fallback to system headers for future CUDA 12.x versions (#8518)
- Extend runtime header search logic to conda (#8519)
- Support copy=None in cp.array/cp.asarray/cp.asanyarray (#8545)
- Fix dtype rule of cupy.scipy.stats.entropy for SciPy 1.14 (#8547)
- Support setuptools 74.0.0 or later (#8583)
- Add NCCL_ERROR_REMOTE_ERROR to the set of errors from NCCL (#8662)
- Replace numpy.ComplexWarning with cupy.exceptions.ComplexWarning (#8676)
- ENH: Implement dlpack v1 (#8683)
- Fix some NumPy 2.x CI failures (cont.) (#8695)
- Bump CUDA version in cuda11x-cuda-python CI (#8737)
- [ROCm 6.2.2] Conditionally define CUDA_SUCCESS only if it's not (#8793)
- Remove fallback mode (#8816)
- Raise user warning in both {to,from}Dlpack & Update the Interoperability page (#8831)
- Use a custom Min/Max instead of specializing CUB (#8846)
- Updating pylibraft pairwise_distance to cuvs (#8847)
- add axes support for additional functions in cupyx.scipy.ndimage (from SciPy 1.15.0) (#8858)
- Raise VisibleDeprecationWarning for wavelet functions (#8865)
- Support CUDA 12.8 + Blackwell GPUs (sm_100, sm_120) (#8899)
- Bump library installers for CUDA 12.8 (#8914)
- Use CCCL 2.8.x branch + Use CUPY_CACHE_KEY in hash keys (#8919)
- Use NVIDIA CCCL 2.8 latest w/CUDA 12.3 fix (#8924)
- Use C++17 in JIT compile (#8940)
- Restore CUB histogram and bincount (#8950)
- Broaden usage of C++17 (#8952)
- cupyx.scipy.distance: initialize output array with empty instead of zeros (#8971)
- cupyx.scipy.spatial.distance.cdist remove explicit zeroing of user-provided output array (#8988)
- Fix rocThrust build for ROCm 6.3 (#9022)
- Allow discovering cuTENSOR using major version (#9030)
- Update cutensornet accelerator based on cuquantum-python 25.03 deprecation (#9045)
- Support FIPS enabled machines with MD5 hashing (#9053)
- Refactor hashing (#9057)
Enhancements for NumPy & SciPy compatibility:
- Fix scp.signal.{medfilt,medfilt2d} to raise ValueError for complex64 inputs (#8059)
- Deprecate cupyx.scipy wavelet functions (#8061)
- Fix csrmatrix.__pow__ to raise ValueError for non-int other (#8063)
- Fix cupyx.scipy.special.betainc for invalid inputs (#8065)
- scipy.special.{btdtr,btdtri} are deprecated since SciPy 1.12 (#8066)
- Fix boxcox_llf for SciPy 1.12 changes (#8095)
- NEP50 (#8323)
- Resolve Ruff NPY errors - fix exception imports and asfarray usage in test code (#8455)
- Fix sparse.linalg function signatures following SciPy 1.14 (#8526)
- NumPy 2.0 compatibility: (partially) sync with NEP52 (#8531)
- Fix dtype rule of special functions for SciPy 1.14 (#8532)
- Fix cupy.histogram arg order to match NumPy (v1.24+) (#8559)
- Make cupy.linalg.solve compatible with numpy v2 (#8629)
- Silence FutureWarning emitted when rcond is missing (#8638)
- Fix some NumPy 2.x CI failures (#8690)
- Support kind arg. in sorting methods (#8708)
- Fix cupy.percentile for NumPy 2.x (#8726)
- Fix some NumPy 2.x CI failures (cupyx) (#8727)
- Skip some tests incompatible with NumPy 2.2 (#8817)
- Fix scipy.spmatrix.sign for complex dtype inputs (#8822)
- Fix return type of cupy.where for scalar arguments for NumPy 2.0 (#8835)
- Fix cupyx.scipy.special.logsumexp for NumPy 2.0 (#8836)
- Fix cupy.cov (#8839)
- Fix cupy.histogramdd for NumPy 2.x (#8873)
- Raise ValueError upon attempts to create 3-dim sparse array (#8877)
- Disable contiguous_check for COO/dense matmul test (#8878)...
v13.4.1
6e3c9b7
This is the release note of v13.4.1. This is a hot-fix release addressing several issues including DLPack compatibility with existing user code. See here for the complete list of solved issues and merged PRs.
Changes
Bug Fixes
- Revert toDlpack() default to the old unversioned one (#9011)
- Hot fix for numpy 2 support in some fusion paths (#9016)
- Fix compilation error of cupy.inf in fusion2 (#9044)
Tests
- CI: Pin setuptools version on Windows (#9047)
Others
- Bump version to v13.4.1 (#9051)
Contributors
The CuPy Team would like to thank all those who contributed to this release!
v13.4.0
fca48bc
This is the release note of v13.4.0. See here for the complete list of solved issues and merged PRs.
Highlights
NVIDIA CUDA 12.8 Support
CuPy now supports CUDA 12.8 and the latest NVIDIA Blackwell architecture.
AMD ROCm 6.x Support
CuPy can now be built with AMD ROCm 6.x.
Python 3.13 Support
Binary packages for Python 3.13 are now available.
Changes without compatibility
Cython 3.0 as build requirement (#8959)
To provide support for Python 3.13, the CuPy codebase has been updated for Cython 3. Building CuPy from source now requires Cython 3.0 or later instead of Cython 0.29.x.
Changes
New Features
- Add cupyx.signal.mvdr (#8872)
Enhancements
- Support ROCm 6 (#8608)
- Support setuptools 74.0.0 or later (#8649)
- Use custom less instead of specializing thrust (#8653)
- Add NCCL_ERROR_REMOTE_ERROR to the set of errors from NCCL (#8667)
- Replace numpy.ComplexWarning with cupy.exceptions.ComplexWarning (#8678)
- Use weakref.finalize instead of __del__ for RandomState._generator destruction (#8680)
- Implement dlpack v1 (#8722)
- Fix some NumPy 2.x CI failures (cont.) (#8725)
- Bump CUDA version in cuda11x-cuda-python CI (#8743)
- ROCm 6.2.2: Conditionally define CUDA_SUCCESS only if it's not (#8799)
- Raise VisibleDeprecationWarning for wavelet functions (#8868)
- Use a custom Min/Max instead of specializing CUB (#8875)
- Updating pylibraft pairwise_distance to cuvs (#8897)
- Support CUDA 12.8 + Blackwell GPUs (sm_100, sm_120) (#8915)
- Interpolate: update RBF to scipy 1.13 (#8939)
- Use C++17 in JIT compile (#8941)
- Bump library installers for CUDA 12.8 (#8943)
- Use CCCL 2.8.x branch + Use CUPY_CACHE_KEY in hash keys (#8946)
- Use NVIDIA CCCL 2.8 latest w/CUDA 12.3 fix (#8948)
- Broaden usage of C++17 (#8958)
- Port to Cython 3.0 (#8959)
- cupyx.scipy.distance: initialize output array with empty instead of zeros (#8981)
- cupyx.scipy.spatial.distance.cdist remove explicit zeroing of user-provided output array (#8990)
- Skip sparse.linalg.{cg, cgs, gmres} tests for scipy>=1.14 (#8551)
- cupyx.scipy.sparse tests for SciPy 1.14 (#8552)
- Fix some NumPy 2.x CI failures (cupyx) (#8738)
- Fix cupy.percentile for NumPy 2.x (#8752)
- Skip some tests incompatible with NumPy 2.2 (#8830)
- Disable contiguous_check for COO/dense matmul test (#8888)
- Raise ValueError upon attempts to create 3-dim sparse array (#8889)
- Skip a test for invalid scipy return value of invalid COO matmul (#8890)
- Fix fft.fht following bug fix in SciPy 1.15 (#8891)
- Support empty tuple indexing for sparse matrix (#8892)
- Deprecate cupyx.scipy.linalg.kron (#8902)
- Fix test for special.sph_harm to ignore DeprecationWarning (#8906)
Bug Fixes
- Add nccl.broadcast 64-bit support (#8566)
- Support building CuPy with setuptools 74 (#8577)
- Fix order 'K' with shape given for *_like array creation (#8605)
- hipPointerGetAttributes returns error when pointer is unregistered in ROCm 5.7 (#8609)
- Guard for ROCm 6.x (#8611)
- Fix HIP_VERSION unit (#8619)
- Switch to using platform.machine() instead of platform.processor() (#8656)
- Properly allocate in RNG when specified dtype is neither float32/float64 (#8658)
- Use platform.machine() instead of platform.processor() (#8673)
- Fix sosfilt state output shape when ndim < 2 (#8679)
- Fix undefined inf/nan constant in CuPy JIT (#8712)
- Fix bspline kernel to avoid out of bounds error (#8763)
- Fix race during SoftLink initialization (#8787)
- fix nanargmin and nanargmax's parameter order and pass optional parameters (#8791)
- Fix crashes of quantile and percentile (#8811)
- Fix handling of pinned memory (#8852)
- Use /bigobj on Windows build (#8967)
- Fix cupyx.scipy.spatial.distance's cdist for RAPIDS 24.12 compatibility (#8975)
Code Fixes
- Upgrade pre-commit hooks to silence warnings (#8666)
- Resolve import loop (#8714)
- Resolve uncaught type warning (#8798)
- Switch from .A attribute to .toarray() method (#8814)
- Fix typo in _cretate_frame_tree (#8944)
- Drop unneeded bytes copy of CUPY_CACHE_KEY (#8947)
Documentation
- Add docs about CUDA headers (#8595)
- Update fft.rst (#8617)
- Update documentation to use pre-commit (#8650)
- Add tips on Windows development in Contribution Guide (#8704)
- Add notice about cupy.array_api removal (#8751)
- Add CUDA 12.8 to docs (#8968)
- Update list of supported versions (#8991)
Installation
- Update conda-build CUDA detection logic for Setuptools 72.2.0 (#8652)
- Use relative path of header files to generate cache key (#8930)
- Fix minimum CUDA version check and update comments (#8938)
- Bump version to v13.4.0 (#8993)
Tests
- Relax test_firls atol (#8522)
- Skip test_homomorphic in scipy>=1.14 (#8523)
- Skip betaincinv test with SciPy 1.14.1 (#8553)
- Skip special tests for SciPy 1.14 dtype rule changes (#8554)
- Skip special.logsumexp test for empty input (#8555)
- Skip cupy.scipy.stats.entropy tests for SciPy 1.14 dtype rule change (#8556)
- Use setuptools==73.0.1 (#8569)
- Revert CI timeout bump (#8571)
- Support SciPy 1.13 and 1.14 (#8572)
- Missing backport for sparse_array.A removal (#8573)
- Skip test_log_expit SciPy 1.7 (#8576)
- Catch ValueError (#8625)
- Use testing.with_requires to skip broken tests (#8627)
- CI: Update micro versions of Python (#8635)
- Skip tests if scipy is not installed (#8637)
- Accept OverflowError in TestCopytoFromScalar for NumPy v2 (#8643)
- Skip more tests if scipy is not installed (#8645)
- Update precommit (#8663)
- Backport the changes introduced in #8690 (#8694)
- CI: Fix apt repository URL for Ubuntu 22.04 (#8715)
- Remove ndarray.ptp from fallback tests (#8744)
- Temporary skip for NumPy 2.0 tests (#8745)
- Relax tolerance of test_hilbert for NumPy 2.0 (#8746)
- Bump SciPy version to 1.14 in Windows CI (#8764)
- Add NumPy 2.x CI for Linux (#8768)
- CI: support "skip-ci" label (#8841)
- CI: Fix FlexCI compatibility (#8842)
- Add NumPy 2.2 to CI (#8855)
- Replace flake8 with ruff (#8859)
- Support Optuna 4 (#8863)
- Add testing.shaped_linspace (#8900)
- Disable contiguous_check for some signal.cont2discrete tests (#8901)
- Fix splines tests to remove unexpected skips (#8921)
- Minor updates for sm120 (#8922)
- Add CI for CUDA 12.8 (#8951)
- Increase host memory in Windows CI, free GPU memory in example code (#8969)
- Skip some signal tests for TypeError for inputs of np.longlong dtype (#8972)
- Add CI for Python 3.13 and mpi4py v4 (#8974)
- Pass locals dict to exec (#8985)
Others
- Add backport reminder (#8684)
- Fix script name of backport reminder (#8686)
- Update pre-commit hooks (#8910)
- Fix pull request project board workflows (#8929)
- Regenerate coverage matrix (#8960)
Contributors
The CuPy Team would like to thank all those who contributed to this release!
@99991 @andfoy @asi1024 @Azusachan @bernhardmgruber @Berrysoft @chainer-ci @cjnolet @dagardner-nv @EarlMilktea @eltociear @ev-br @grlee77 @HollowMan6 @jakirkham @jemiryguo @kmaehashi @leofang @littlewu2508 @mohitreddy1996 @mroeschke @seberg @takagi
v13.3.0
118ade4
This is the release note of v13.3.0. See here for the complete list of solved issues and merged PRs.
Highlights
Updated NVIDIA CCCL
The CCCL library bundled with CuPy has been updated to eliminate the Jitify preprocess phase. Users will no longer see the one-time performance warning ("Jitify is performing a one-time only warm-up to populate the persistent cache, this may take a few seconds and will be improved in a future release...") unless explicitly requesting the use of Jitify (e.g., cupy.RawModule(..., jitify=True)).
Enhanced NumPy 2.0 Compatibility
This release provides better interoperability with NumPy 2.0.
Support for CUDA 12.5 & 12.6
CuPy is now tested with CUDA 12.5 and 12.6.
RFC: Removing NumPy Fallback Mode in CuPy v14
The CuPy team is discussing the possibility of removing the NumPy fallback feature in CuPy v14. Feel free to join the discussion in #8497 if you have any comments or use cases that rely on this feature.
Changes
Enhancements
- Support CUDA 12.5 (#8423)
- Avoid using Jitify everywhere inside CuPy (#8473)
- Disable jitify for cub & Bump CCCL (#8487)
- Get rid of pkg_resources (#8496)
- Unregister cupyx.scipy.linalg.{tri,tril,triu} from uarray (reverted in #8516) (#8506)
- Use .toarray() instead of .A attribute (#8517)
- Extend runtime header search logic to conda (#8520)
- Support CUDA 12.6 (#8524)
- Fallback to system headers for future CUDA 12.x versions (#8529)
Bug Fixes
- Fix spline temp container size in make_interp_spline (#8390)
- MAINT: Avoid using np.compat.integer_types (#8413)
- Fix type dispatcher for arm64 (#8414)
- Fix ndarray.get() not honoring current stream when layout is not contiguous (#8418)
- Fix copyto for NumPy 2 compatibility (#8435)
- Update compiler.py to avoid the popup of the nvcc.exe console (#8438)
- Fix RandomState.seed() for NumPy 2 compatibility (#8439)
- Fix the size of temporary CUB output space to consider its alignment (#8447)
- Address KeyErrors from importlib_metadata (#8465)
- upfirdn: mode=None -> mode="constant" (#8495)
- Search header files from CTK wheel (#8504)
- Fix CUDA version condition to use headers from wheel (#8507)
- Do not unregister cupyx.scipy.linalg.{tri,tril,triu} from uarray (#8516)
- Fix ROCm 4.3 binary package build broken (#8534)
- Fix cudart header detection for conda (#8535)
Documentation
- eigsh doc correction _eigen.py (#8383)
- typo: coping -> copying (#8427)
- Add CUDA 12.5 to list of supported platform (#8428)
- Add comparison table for (cupyx.)scipy.sparse.*_matrix classes class methods (#8458)
Installation
- Patch the build system to better support conda-build (#8464)
Tests
- Bump NumPy/SciPy versions in cuda-example CI (#8420)
- Support SciPy 1.12 (#8422)
- Fix CUDA 11.2 CI failure on Linux (#8437)
- Decrease number of threads to avoid "system error: excessive memory usage is detected" (#8462)
- CI: skip CUDA 12.1/12.2/12.3/12.4 CI on "mini" trigger (#8469)
- Resolve Ruff NPY errors - fix exception imports and asfarray usage in test code (#8471)
- Skip some tests in aarch64 CI (#8490)
Contributors
The CuPy Team would like to thank all those who contributed to this release!
@andfoy @arkdong @asi1024 @bmerry @EarlMilktea @emcastillo @hmaarrfk @jakirkham @johnnynunez @kmaehashi @leofang @monzelr @seberg @swelborn @takagi @YanivDorGalron
v13.2.0
b127fb1
This is the release note of v13.2.0. See here for the complete list of solved issues and merged PRs.
Highlights
Support for NumPy 2.0 (#8357)
CuPy can now be imported under NumPy 2.0.
Lazily preloading NCCL (#8367)
CuPy now loads the NCCL shared library at the time of import cupy.cuda.nccl instead of import cupy. This improves NCCL compatibility in mixed-library environments.
Changes
Enhancements
- cupyx: cleanup use of deprecated NumPy functionality (NumPy 2.0 compatibility) (#8325)
- make CuPy import under NumPy 2.0 (#8357)
- Lazy-preload NCCL (#8367)
Bug Fixes
- Fix overflow indexing ndarray generated with as_strided (#8349)
- Fix CUB build error on win-64 (#8358)
- Re-enable NVTX range coloring for NVTX3. (#8361)
Documentation
Tests
- [v13] Use the latest NumPy v1 for head CI (#8355)
Others
Contributors
The CuPy Team would like to thank all those who contributed to this release!
@asi1024 @cclauss @ev-br @grlee77 @kmaehashi @leofang @macrocosme @romerojosh @takagi
v13.1.0
4c9821b
This is the release note of v13.1.0. See here for the complete list of solved issues and merged PRs.
Highlights
Support for CUDA 12.3 & 12.4 (#8286)
CuPy now supports CUDA 12.3 and 12.4. Binary packages are available for Linux (x86_64/aarch64) and Windows as cupy-cuda12x.
Fixed Regression on pre-Volta platforms (#8216)
This release fixes a regression in CuPy v13.0.0 in which some CuPy functions did not work on pre-Volta platforms (compute capability < 7.0) such as NVIDIA Tesla P100 or GeForce GTX 1080.
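To tell whether a device falls into the affected pre-Volta range, a check along these lines may help. It is a sketch under one assumption: the compute capability is given as a string of concatenated major and minor digits (for example "60" for 6.0), which is, to our understanding, the format returned by cupy.cuda.Device().compute_capability. The helper name is_pre_volta is hypothetical.

```python
def is_pre_volta(compute_capability: str) -> bool:
    """Return True if the compute capability string is below 7.0.

    Expects major and minor digits concatenated, e.g. "60" for 6.0.
    """
    return int(compute_capability) < 70


# Tesla P100 is 6.0 and GeForce GTX 1080 is 6.1; Volta starts at 7.0.
for cc in ("60", "61", "70", "90"):
    print(cc, is_pre_volta(cc))
```

With CuPy installed and a GPU present, the argument could be taken from cupy.cuda.Device().compute_capability.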
Changes
New Features
- Add cupyx.signal.{complex_cepstrum,real_cepstrum,inverse_complex_cepstrum,minimum_phase} (#8096)
- Add cupyx.signal.{firfilter,firfilter_zi,firfilter2} (#8107)
- Add cupyx.signal.freq_shift (#8131)
- Add cupyx.signal.channelize_poly (#8148)
- Add cupyx.signal.ca_cfar (#8167)
Enhancements
- Add incontiguous support for cutensor functions (#8168)
- Remove usages of numpy.float_ and numpy.complex_ (#8181)
- Fix expm(complex matrix) (#8214)
- Various Jitify improvements (#8237)
- Bump to cuTENSOR 2.0.1 (#8291)
NumPy-compatibility Improvements
- Fix scp.signal.{medfilt,medfilt2d} to raise ValueError for complex64 inputs (#8084)
- Fix boxcox_llf for SciPy 1.12 changes (#8132)
- Deprecate cupyx.scipy wavelet functions (#8139)
Bug Fixes
- Fix #7981, Update _nccl_comm.py (#8112)
- Fix Flags not to allow setters (#8138)
- Prevent angular brackets from appearing in Jitify's cache filename (#8160)
- Set -arch in the compiler options unconditionally (#8161)
- Allow cupy.show_config() without CUDA (#8192)
- Fix jitify warmup kernel (#8216)
- Fix: remove unnecessary include that causes deployment issue (#8217)
- Fix build system for Thrust detection (#8230)
- Fix: always switch to the submodule dir before checking git tag/commit (#8240)
- Fix overflow of index calculation in random generator API (#8246)
- Fix Generator API parallelism (#8247)
- Fix CUB min/max initial values (#8266)
- Fix jitify warmup kernel - Cont'd (#8270)
Documentation
- Update conda installation guide (#8135)
- Fix pdist docstring in order to specify that the returned matrix is condensed (#8187)
- Replace license notice in cupyx.scipy.signal._spectral (#8271)
- Update document for CUDA 12.3 and 12.4 (#8284)
Installation
- Do not search for static libs (#8143)
Tests
- Fix cupyx.scipy.special.betainc for invalid inputs (#8098)
- Revert CI timeout changes (#8137)
- Fix invalid vectorstength tests (#8145)
- Fix actions versions used in workflows to avoid node 16 deprecation warning (#8194)
- Add CI to test cupy.show_config() pass without CUDA installed (#8195)
- Add import test without CUDA Toolkit (#8231)
- BUG: cupyx/scipy/signal: fix mpmath test (#8262)
- Tentatively pin SciPy to v1.12 in CI (#8275)
- Add support for CUDA 12.3 & 12.4 (#8286)
Contributors
The CuPy Team would like to thank all those who contributed to this release!
@andfoy @asi1024 @emcastillo @ev-br @jemiryguo @kmaehashi @leofang @takagi
v13.0.0
1bdaf31
This is the release note of v13.0.0. See here for the complete list of solved issues and merged PRs.
This release note only covers changes made since the v13.0.0rc1 release. Check out our blog for highlights of the v13 release!
See the Upgrade Guide for the list of possible breaking changes in v13.
Changes
For all changes in v13, please refer to the release notes of the pre-releases (alpha1, beta1, rc1).
New Features
- Add cupyx.signal.pulse_compression from cuSignal's non SciPy-compat API (#8039)
- Add cupyx.signal.convolve1d3o from cuSignal's non SciPy-compat API (#8067)
- add cupyx.signal.{pulse_doppler, cfar_alpha} (#8069)
- Add cupyx.signal.convolve1d2o (#8113)
Enhancements
- Make cupyx.signal.radartools private (#8053)
- Fix csrmatrix.__pow__ to raise ValueError for non-int other (#8085)
Performance Improvements
- Speed up cupy environment duplicate detection (#8042)
Bug Fixes
- Fix lfilter_zi and sosfilt_zi when any IIR coefficient is zero (#8036)
- Fix argmax/argmin for large reduction axis (#8041)
- Fix cupyx.scipy.fft.{dst,dstn} in type 2/3 (#8082)
- Do not use from-import (#8114)
Code Fixes
Documentation
- Generate signature for ufunc documentation (#8044)
- Use modern dlpack interface in torch interoperability document (#8048)
Installation
Tests
- Bump stable branch to v13 (#8026)
- Remove some signal.vectorstrength xfail tests (#8083)
- Fix scipy.linalg not to raise DeprecationWarning for zero-size inputs (#8086)
- scipy.special.{btdtr,btdtri} are deprecated since SciPy (#8094)
- Refactor radartools tests (#8099)
- Fix slow test (#8117)
Contributors
The CuPy Team would like to thank all those who contributed to this release!
@andfoy @asi1024 @emcastillo @hauntsaninja @kmaehashi @takagi
v13.0.0rc1
4c06bf7
This is the release note of v13.0.0rc1. See here for the complete list of solved issues and merged PRs.
This is a release candidate of the CuPy v13 series. Please start testing your workload with this release to prepare for the final v13 release. To install: `pip install -U --pre cupy-cuda11x -f https://pip.cupy.dev/pre`. See the Upgrade Guide for the list of possible breaking changes in v13.
💬 Join the Matrix chat to talk with developers and users and ask quick questions!
💖 Help us sustain the project by sponsoring CuPy!
✨ Highlights
NVIDIA cuTENSOR 2.0
NVIDIA cuTENSOR is a performant and flexible library for accelerating tensor linear algebra. CuPy v13 supports cuTENSOR 2.0, the latest major release of the library, which achieves higher performance than the cuTENSOR 1.x series.
NVIDIA RAPIDS cuSignal Integration
cuSignal is a library developed by the NVIDIA RAPIDS project that provides GPU-accelerated implementations of signal processing algorithms using CuPy as a backend. cuSignal includes `scipy.signal`-compatible APIs, so we share the same goals. After a discussion with the cuSignal team, we agreed to merge cuSignal into CuPy to provide users with a better experience using a unified library for SciPy routines on GPU.
Currently, most of the functions provided in cuSignal have been merged into CuPy, and the remaining items are expected to be merged into CuPy v13 in due course.
We would like to acknowledge and thank @awthomp and everyone involved in the cuSignal development for creating a great library and agreeing to this transition.
Distributed NDArray (experimental) (#7881)
Added initial support for sharding `ndarray`s across multiple GPU devices connected to the same host.
```python
import numpy
from cupyx.distributed.array import distributed_array

shape = (16, 16)
cpu_array = numpy.random.rand(*shape)

# Set the chunk indexes for each device
# device 0 holds rows 0..8 and device 1 holds rows 8..16
mapping = {
    0: [(slice(8), slice(None, None))],
    1: [(slice(8, None), slice(None, None))],
}

# The array is allocated in devices 0 and 1
multi_gpu_array = distributed_array(cpu_array, mapping)
```
This work was done by @shino16 during the Preferred Networks 2023 summer internship.
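The mapping in the example above assigns each device a list of index tuples. As a minimal sketch, such a row-wise mapping could be generated for any number of devices as follows. `row_mapping` is a hypothetical helper and not part of CuPy; it only builds the dictionary and does not touch any GPU.

```python
def row_mapping(n_rows, n_devices):
    """Split rows [0, n_rows) into contiguous chunks, one per device.

    Hypothetical helper (not a CuPy API). It returns a dict shaped like
    the `mapping` argument in the release note example:
    {device_id: [(row_slice, column_slice)]}.
    """
    if n_devices <= 0:
        raise ValueError("n_devices must be positive")
    base, extra = divmod(n_rows, n_devices)
    mapping = {}
    start = 0
    for dev in range(n_devices):
        # The first `extra` devices take one additional row each.
        stop = start + base + (1 if dev < extra else 0)
        mapping[dev] = [(slice(start, stop), slice(None, None))]
        start = stop
    return mapping


mapping = row_mapping(16, 2)
for dev, chunks in mapping.items():
    print(dev, chunks)
```

For 16 rows and 2 devices this selects the same rows as the hand-written mapping, with explicit slice bounds. Whether an even split suits a given workload is a separate question that this sketch does not address.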
Support for Python 3.12
Binary packages are now available for Python 3.12.
🛠️ Changes without compatibility
CUDA Runtime API is now statically linked
CuPy is now shipped with CUDA Runtime statically linked. Due to this, `cupy.cuda.runtime.runtimeGetVersion()` always returns the version of CUDA Runtime that CuPy is built with, regardless of the version of CUDA Runtime installed locally. If you need to retrieve the version of CUDA Runtime shared library installed locally, use `cupy.cuda.get_local_runtime_version()` instead.
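The following sketch compares the two versions side by side. It assumes both functions return the usual CUDA integer encoding (`1000 * major + 10 * minor`, so `12020` means 12.2). `decode_cuda_version` and `report_versions` are hypothetical helper names introduced here for illustration.

```python
def decode_cuda_version(version):
    """Split a CUDA version integer into (major, minor).

    CUDA encodes versions as 1000 * major + 10 * minor.
    """
    return version // 1000, (version % 1000) // 10


def report_versions():
    """Print both versions if CuPy and a CUDA runtime are available."""
    try:
        import cupy
        # Version CuPy was built with (statically linked runtime).
        built = cupy.cuda.runtime.runtimeGetVersion()
        # Version of the CUDA Runtime shared library installed locally.
        local = cupy.cuda.get_local_runtime_version()
    except Exception as exc:  # CuPy missing or no local runtime found
        print("could not query CUDA versions:", exc)
        return
    print("CuPy built with CUDA runtime %d.%d" % decode_cuda_version(built))
    print("local CUDA runtime %d.%d" % decode_cuda_version(local))


report_versions()
```

The two numbers can legitimately differ, so code that previously used `runtimeGetVersion()` to detect the locally installed toolkit may need to switch to the second call.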
📝 Changes
New Features
- Port `lombscargle` from cuSignal to `cupyx.scipy.signal` (#7563)
- Port `periodogram`, `welch` and `csd` from cuSignal to `cupyx.signal` (#7564)
- Port `cusignal` windows module to `cupyx.scipy.signal` (#7568)
- Add `cupy.lib.stride_tricks.sliding_window_view` (#7575)
- `cupyx/scipy/signal`: add place poles (#7666)
- Add `check_{NOLA, COLA}` to `cupyx.scipy.signal` (#7675)
- Port `argrel{extrema, max, min}` to `cupyx.scipy.signal` from cusignal (#7694)
- Port `waveforms` from cusignal to `cupyx.scipy.signal` (#7696)
- Port `wavelets` module from cusignal to `cupyx.scipy.signal` (#7700)
- Add 2D signal b-splines to `cupyx.scipy.signal` (#7721)
- Port `firwin/firwin2` from cuSignal (#7722)
- port `upfirdn` from cuSignal (#7749)
- Support boolean COO sparse matrix (#7764)
- Port `gauss_spline` from cuSignal (#7837)
- Port `stft/istft` from CuSignal to `cupyx.scipy.signal` (#7838)
- Port `vectorstrength`, `coherence` and `spectrogram` from CuSignal to `cupyx.scipy.signal` (#7853)
- Port `decimate`, `resample` and `resample_poly` from cuSignal to `cupyx.scipy.signal` (#7855)
- Add `max_len_seq` to `cupyx.scipy.signal` (#7867)
(#7867) - Add distributed ndarray (#7942)
Enhancements
- Implement axis parameter on cupy.unique (#6886)
- Load cuTENSOR from wheel distribution (#7025)
- Soft link NVRTC for `cupy_backends.cuda.libs.nvrtc` (#7621)
- Add a property to get access to the nccl handle. (#7823)
- Remove `cusolver_enabled`, `cub_enabled`, `thrust_enabled` flags (#7840)
- Lazy import cuSOLVER (#7843)
- Lazy import cuSPARSE (#7847)
- Lazy import cuFFT (#7849)
- Static link to CUDA Runtime in CUB module (#7850)
- Bundle CCCL in CuPy (#7851)
- Lazy import cuRAND (#7856)
- Use NVRTC for compiling kernels calling `cupyx.jit.cub` APIs (#7869)
- Add optional argument `device_id=-1` to `get_current_stream` (#7885)
- Prohibit conversion from Variable to Python scalar in fusion (#7887)
- Add `__slots__` to `cupy.ndarray` (#7891)
- Lazy import cuBLAS (#7921)
- Allow Jitify to only cache CuPy-owned headers (#7934)
- Ensure D2H copies are stream ordered and by default blocking (#7938)
- Accelerate H2D copies when the source is on pinned memory (#7939)
- Add Linux CI for Python 3.12 (#7940)
- MNT: Suppress CUB compilation warnings (#7943)
- Static link CUDA Runtime (#7954)
- Add debug feature to preloading and softlink (#7977)
- Support cuTensor 2.0 (#7984)
- Bump supported NumPy & SciPy versions (#7992)
- Softlink CUDA Driver (#7994)
- Show local runtime version in `cupy.show_config()` (#7995)
- Avoid using `numpy.find_common_type` (#7651)
- ENH: Remove `NINF`, `PINF`, `Inf`,... usages (#7800)
- Fix `cupy.empty_like` parameter name to `prototype` (#7827)
- Make `stream` kwonly argument in `ndarray.__dlpack__` (#7829)
- Remove conversions of array with ndim > 0 to a scalar (#7886)
- `scipy.linalg.{tri/tril/triu}` are deprecated in SciPy 1.11.0 (#7889)
- Fix `signal.medfilt` complex error type for SciPy>=1.11 (#7890)
- Fix `cupyx.scipy.sparse._base` tests for SciPy 1.11 (#7905)
- Fix return type of division of csr_matrix and dense array for SciPy 1.11 (#7906)
- Fix `maxiter` in `TestLOBPCG` (#7908)
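Several entries above move library bindings (cuSOLVER, cuSPARSE, cuFFT, cuRAND, cuBLAS) to lazy import. The general idea can be sketched in pure Python as below. This is an illustration of the pattern only, not CuPy's actual mechanism; `LazyModule` is a hypothetical class and the standard-library `json` module stands in for a heavy dependency.

```python
import importlib


class LazyModule:
    """Proxy that imports the named module on first attribute access."""

    def __init__(self, name):
        self._name = name
        self._module = None

    def __getattr__(self, attr):
        # __getattr__ is only called when normal lookup fails, so the
        # proxy's own _name and _module attributes never reach here.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)


# `json` stands in for an expensive-to-load library.
lazy_json = LazyModule("json")
print(lazy_json._module is None)        # the proxy has not imported yet
print(lazy_json.dumps({"a": 1}))        # first use triggers the import
print(lazy_json._module is not None)
```

Deferring the import this way means the cost of loading a library is paid only by code paths that actually use it.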
Performance Improvements
- Optimize `spmatrix._set_many` (#7888)
Bug Fixes
- Fix csr2dense to avoid race conditions (#7724)
- Fix cuTENSOR contraction descriptor cache (#7814)
- Fix handling of scalars in cupy.r_ (#7815)
- Fix `cupy.r_` for scalar inputs (#7896)
- Fixed Improper Method Call: Replaced `NotImplementedError` with `NotImplemented` (#7900)
- Provide .stop() method for cupyx.distributed._Backend (#7952)
- Fix `NVRTCError` not calling `initialize()` (#7955)
- Import cupyx.lapack inside cupy.linalg.solve (#7966)
- Add lazy load for `cupyx.lapack` (#7993)
- Fix issues with the initial state when a SOS filter has no IIR part (#7998)
- Avoid using `pkg_resources` for cuTENSOR wheel discovery (#8012)
Code Fixes
- MNT: suppress compiler warning from `cupyx.cusolver` (#7714)
- Add type annotation in _creation.basic (#7739)
- Fix nvrtc initialize not inlined for CUDA Python (#7842)
- Fix coding style (#7844)
- Reorganize directory structure around CCCL (#7920)
- Remove deprecated ast expr in CuPy JIT (#7941)
- Reorganize third party code under `third_party` directory (#7956)
Documentation
- Add `-U` to pre-release installation command (#7803)
- Fix `get_window` docstring reference (#7835)
- Clarify sparse .transpose() return type in docstrings (#7868)
- DOC: cupyx/scipy: add missing names (#7898)
- Fix CUDA 12.2 for Windows notice (#7922)
- Bump CuPy version in install.rst (#8002)
- Update installation guide to note about cuTENSOR 2.0 support (#8003)
- Update wheels list in README (#8006)
Installation
- Avoid warning when uploading packages (#7792)
- Fix ROCm Dockerfile not working (#7797)
- Add cuSignal license (#7816)
- Improve symlink handling and preflight (#7945)
- Bump docker cuda version to 12 (#7973)
Tests
- Add timeout to Windows CI (#7775)
- Fix mypy not installed in pre-review test (#7832)
- Execution tests for typing tests passing rows in `typing_tests` (#7836)
- CI: Remove path length limitation on Windows CI image (#7857)
- Fix Windows CI failures (#7862)
- Skip test_pos_boolarray if numpy>=1.25 (#7893)
- Add NumPy 1.25/1.26 & SciPy 1.11 to CI (#7897)
- Skip some LOBPCG tests failing with SciPy 1.11 (#7924)
- Support Python 3.12, add Windows CI (#7947)
- Skip logspace test in NumPy 1.25 & 1.26 (#7946) (#7948)
- Fix Windows test scripts (#7957)
- Skip `test_parameterize_pytest_impl` test for pytest 7.4.3 (#7965)
- Fix `TestLOBPCG.test_maxit_None` CUDA 12.2 CI failure (#8000)
Others
- Fix publish workflow permission and output for review (#7788)
- Fix backport workflow (#7831)
- Avoid triggering Project Updates for updates from assignees (#7861)
- Bump version to v13.0.0rc1 (#8015)
👥 Contributors
The CuPy Team would like to thank all those who contributed to this release!
@anaruse @andfoy @asi1024 @emcastillo @ev-br @fazledyn-or @kerry-vorticity @kmaehashi @leofang @loganbvh @milesvant @mtsokol @mvnvidia @negin513 @shino16 @takagi