[SymmetricMemory] introduce a binding for cuMemset32Async (#138755) · pytorch/pytorch@ee42a99 · GitHub
Commit ee42a99
[SymmetricMemory] introduce a binding for cuMemset32Async (#138755)
## This Stack
This stack does the following things to support `xformers`-style, comm-aware Triton kernels:
- Exposes `signal_pad`s as tensors in Python
- Adds a binding for `cuMemset32Async`
Together, these aim to give users more flexibility to express custom signaling/synchronization patterns.
## This PR
Make `cuMemset32Async` available via `_SymmetricMemory.memset32`. We chose `cuMemset32Async` over `cudaMemsetAsync` because it allows for `uint32_t`-wise memset. This provides users with better flexibility.
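The difference between a byte-wise and a `uint32_t`-wise memset can be sketched on the host. The following is a minimal pure-Python model, not the binding itself: `memset_bytes`, `memset_u32`, and `read_u32` are hypothetical helpers written for illustration, and a `bytearray` stands in for device memory. It assumes the documented behavior of `cudaMemsetAsync` (every byte is set to the low 8 bits of the value) and of the 32-bit driver memset, which the CUDA driver API spells `cuMemsetD32Async` (N 32-bit words are set to the value).

```python
import struct


def memset_bytes(buf, value, count):
    """Model of a byte-wise memset: only the low 8 bits of `value` are used."""
    buf[:count] = bytes([value & 0xFF]) * count


def memset_u32(buf, value, count):
    """Model of a 32-bit-wise memset: writes `count` copies of a uint32."""
    buf[: 4 * count] = struct.pack("<I", value & 0xFFFFFFFF) * count


def read_u32(buf):
    """Reinterpret the buffer as little-endian uint32 words."""
    return [word for (word,) in struct.iter_unpack("<I", bytes(buf))]


# Four uint32 slots, standing in for a small signal pad (illustrative only).
pad = bytearray(16)

# Byte-wise fill with 1: every byte becomes 0x01, so each word is 0x01010101.
memset_bytes(pad, 1, 16)
bytewise = read_u32(pad)

# Word-wise fill with 1: every 32-bit word is exactly 1.
memset_u32(pad, 1, 4)
wordwise = read_u32(pad)
```

With a byte-wise memset, the only word values that can be written are those whose four bytes are identical, which is why setting a `uint32_t` flag to an arbitrary value such as 1 needs the word-wise variant.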
To enable this, we also added the following CUDA driver APIs to `c10::cuda::DriverAPI`:
- `cuDevicePrimaryCtxRetain` - for obtaining the primary context of a device in the form of `CUcontext`.
- `cuCtxGetCurrent`/`cuCtxSetCurrent` - for setting and restoring the context for CUDA driver APIs such as `cuMemset32Async`.
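The save, set, and restore pattern described in the list above can be sketched as a Python model. This is an illustration only and does not call the CUDA driver: `FakeDriver` and `primary_context` are hypothetical names, and the methods merely mimic the roles of `cuDevicePrimaryCtxRetain`, `cuCtxGetCurrent`, and `cuCtxSetCurrent`. Whether the PR structures its C++ code this way is an assumption.

```python
from contextlib import contextmanager


class FakeDriver:
    """Hypothetical stand-in for the driver's notion of a current context."""

    def __init__(self):
        self.current = None
        self._primary = {}

    def primary_ctx_retain(self, device):
        # Plays the role of cuDevicePrimaryCtxRetain: one context per device.
        return self._primary.setdefault(device, f"primary-ctx-{device}")

    def ctx_get_current(self):
        # Plays the role of cuCtxGetCurrent.
        return self.current

    def ctx_set_current(self, ctx):
        # Plays the role of cuCtxSetCurrent.
        self.current = ctx


@contextmanager
def primary_context(driver, device):
    """Make the device's primary context current, then restore the old one."""
    previous = driver.ctx_get_current()
    driver.ctx_set_current(driver.primary_ctx_retain(device))
    try:
        yield
    finally:
        # Restore even if the driver call inside the block raises.
        driver.ctx_set_current(previous)


driver = FakeDriver()
driver.ctx_set_current("caller-ctx")

seen = []
with primary_context(driver, 0):
    # A driver call such as the 32-bit memset would be issued here.
    seen.append(driver.ctx_get_current())
seen.append(driver.ctx_get_current())
```

Restoring the previous context in a `finally` clause keeps the caller's state intact on error paths; in C++ the same effect is usually obtained with an RAII guard.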
Pull Request resolved: #138755
Approved by: https://github.com/weifengpy, https://github.com/eqy, https://github.com/lw
1 parent 87059d4 commit ee42a99
6 files changed, +121 -13 lines changed
- c10/cuda
- caffe2
- test/distributed
- torch/csrc/distributed/c10d
Lines changed: 0 additions & 2 deletions
Lines changed: 1 addition & 0 deletions
Lines changed: 1 addition & 0 deletions
test/distributed/test_symmetric_memory.py
Lines changed: 29 additions & 0 deletions
torch/csrc/distributed/c10d/CUDASymmetricMemoryOps.cu
Lines changed: 73 additions & 10 deletions
torch/csrc/distributed/c10d/init.cpp
Lines changed: 17 additions & 1 deletion