Tags · pytorch/pytorch · GitHub
trunk/2955acaed6a0f93f1f0913df3f840912392bc2ff
Clean up some unused build env variables (#158599)

* The build-with-debug parameter isn't needed; it isn't even passed into Docker. A debug build is detected via the build environment name (see the sketch after this entry).
* AWS_DEFAULT_REGION is a leftover from ARC and isn't used anywhere in .ci/pytorch or .github.

Signed-off-by: Huy Do <huydhn@gmail.com>
Pull Request resolved: #158599
Approved by: https://github.com/cyyever, https://github.com/ZainRizvi
ghstack dependencies: #158598
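A hedged illustration of the first bullet: detecting a debug build from the build environment name alone. The substring check is an assumption, not the exact .ci/pytorch logic.

```python
import os

# Sketch (assumption): infer a debug build from BUILD_ENVIRONMENT instead
# of threading a dedicated build-with-debug parameter into Docker.
is_debug_build = "debug" in os.environ.get("BUILD_ENVIRONMENT", "")
print(f"debug build: {is_debug_build}")
```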
trunk/90b082e207bff79dd09d89cfef9be49de5c2ad83
enable_caching_generated_triton_templates=True by default (#158592)

There is some risk, but it's good to catch issues if there are any, and this is a single flag flip that's easy to revert.

Pull Request resolved: #158592
Approved by: https://github.com/eellison
trunk/64dabb2cf5c4112c7c169fb76dabe9ab905c8e7c
only fail regressions >10% on pr_time benchmarks (#158577)

We are moving to a new framework; maintaining the pr_time benchmark test right now is hard, and it breaks often.

1. Only fail PRs with regressions >10% (a minimal sketch of this gating rule follows below).
2. Post-monitor with the pr_time benchmarks dashboard (oncall) and update expected results frequently or on big changes (we are supposed to already be doing this: https://www.internalfb.com/unidash/dashboard/pt2_diff_time_metrics).
3. Set up detectors that warn on regressions and notify internally post-land: https://www.internalfb.com/monitoring/detector/1140915271179237

Pull Request resolved: #158577
Approved by: https://github.com/xmfan, https://github.com/janeyx99
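A minimal sketch of the gating rule in item 1, assuming hypothetical names (`check_regression` and the example numbers are illustrative, not the PR's actual helpers): fail only when a benchmark exceeds its expected value by more than 10%, otherwise just warn.

```python
# Hypothetical sketch of the ">10% to fail" policy for pr_time benchmarks:
# compare a measured value against its recorded expectation and raise only
# when the regression exceeds the threshold.
REGRESSION_THRESHOLD = 0.10  # 10%

def check_regression(benchmark: str, measured: float, expected: float) -> None:
    ratio = (measured - expected) / expected
    if ratio > REGRESSION_THRESHOLD:
        raise AssertionError(
            f"{benchmark}: {measured:.3g} vs expected {expected:.3g} "
            f"(+{ratio:.1%} > {REGRESSION_THRESHOLD:.0%})"
        )
    if ratio > 0:
        print(f"{benchmark}: +{ratio:.1%} regression, below the fail threshold")

check_regression("add_loop_eager", measured=3.2e9, expected=3.0e9)  # warns only
```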
trunk/22d82222c6e2a2ef4badc6b816d233a4cec924c3
GenAI Layer Benchmark (#158536)

This PR adds a GenAI layer benchmark. It compares PyTorch eager, PyTorch compiler, Liger, and quack. It covers all kernels supported by [quack](https://github.com/Dao-AILab/quack?tab=readme-ov-file#kernels-) (CrossEntropy Fwd/Bwd, Softmax Fwd/Bwd, RMSNorm Fwd/Bwd, LayerNorm Fwd) plus LayerNorm Bwd.

## Motivations

- Many OSS users have asked how to properly benchmark torch.compile-generated kernels. One common error is to compile a kernel/layer for one shape (e.g., batch size = 1) and benchmark it for another shape (e.g., batch size = 1024), which leads to bad performance. This provides a simple and clear example of a proper benchmark.
- We recently added a GenAI model benchmark (based on [vLLM](https://hud.pytorch.org/benchmark/llms?repoName=vllm-project%2Fvllm)), but it is usually hard to optimize models directly due to their complexity. Layer benchmarks are easier to reason about and optimize.

## Key Settings

- Avoid reusing a kernel specialized for one shape when benchmarking another shape:

```python
torch._dynamo.config.automatic_dynamic_shapes = False
# Needed since changing args to the function causes recompiles
torch._dynamo.config.recompile_limit = 1000000
```

- For forward passes, people may mark the batch size as dynamic to avoid runtime recompilation. We respect that setting in this kernel-level benchmark:

```python
torch._dynamo.mark_dynamic(x, 0)
```

GPU: H100 (devvm006.dkl0)

Results: [P1874246170](https://www.internalfb.com/phabricator/paste/view/P1874246170)

Note: for numerical accuracy, we use the default tolerances of torch.testing.assert_close (i.e., for `torch.bfloat16`, rtol `1.6e-2` and atol `1e-5`). This surfaces numerical issues for some backends and kernels.

Next steps: add roofline analysis, run in CI to catch regressions, cover more GenAI kernels, and include GenAI layers for common fusion patterns.

[Benchmark plots: CrossEntropyBackward, CrossEntropyForward, LayerNormBackward, LayerNormForward, RMSNormBackward, RMSNormForward, SoftmaxBackward, SoftmaxForward]

Pull Request resolved: #158536
Approved by: https://github.com/yf225, https://github.com/eellison
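A minimal sketch of the benchmarking pattern this entry advocates, under stated assumptions: the `rms_norm` layer, the shapes, and the CUDA-event `bench` helper are illustrative, not the PR's actual harness. The point is that each shape gets timed against its own compilation rather than reusing a kernel specialized for a different shape.

```python
import torch

# The PR's key settings: never silently reuse a kernel specialized for one
# shape when benchmarking another.
torch._dynamo.config.automatic_dynamic_shapes = False
torch._dynamo.config.recompile_limit = 1_000_000

def rms_norm(x: torch.Tensor, w: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # Illustrative layer; RMSNorm is among the kernels the PR covers.
    return x * torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + eps) * w

compiled = torch.compile(rms_norm)

def bench(fn, *args, iters: int = 100) -> float:
    # CUDA-event timing with warmup; returns milliseconds per call.
    for _ in range(10):
        fn(*args)
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(iters):
        fn(*args)
    end.record()
    torch.cuda.synchronize()
    return start.elapsed_time(end) / iters

for batch in (1, 1024):  # each shape is timed with its own compilation
    x = torch.randn(batch, 4096, device="cuda", dtype=torch.bfloat16)
    w = torch.ones(4096, device="cuda", dtype=torch.bfloat16)
    print(f"batch={batch}: {bench(compiled, x, w):.3f} ms")
```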
trunk/7cc5d03dfc0077bc670c39abd101c72a04b2737f
Document the rest of the specific optimizer module APIs (#158669)

Pull Request resolved: #158669
Approved by: https://github.com/albanD
ghstack dependencies: #158483
trunk/7cc1a9546c135f8e7635e0d38aa2bba797f8907d
[AOTI] fix extract file failure on Windows (#158702)

Changes:
1. Rename the zip index name and keep it out of path normalization.
2. Normalize the output path when extracting files (a hedged sketch of this idea follows below).

Files now extract successfully:

[Screenshot: files extracted successfully]

Pull Request resolved: #158702
Approved by: https://github.com/angelayi
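A minimal sketch of the idea under stated assumptions: `ZIP_INDEX_NAME` and `extract_member` are hypothetical names, not AOTI's actual code. Zip member names use forward slashes, so the archive-internal index name is matched verbatim, while only the on-disk destination path is normalized (which handles Windows `\` separators).

```python
import os
import zipfile

ZIP_INDEX_NAME = "archive_index.json"  # hypothetical archive-internal entry name

def extract_member(zf: zipfile.ZipFile, member: str, out_dir: str) -> str:
    # Zip member names always use '/', so the index entry is matched on the
    # raw name and never passed through OS path normalization.
    if member == ZIP_INDEX_NAME:
        raise ValueError("the index entry is metadata, not an extractable file")
    # Normalize only the on-disk destination so '/' becomes the host
    # separator ('\\' on Windows) and nested entries land in real subdirs.
    dest = os.path.normpath(os.path.join(out_dir, *member.split("/")))
    os.makedirs(os.path.dirname(dest) or ".", exist_ok=True)
    with zf.open(member) as src, open(dest, "wb") as dst:
        dst.write(src.read())
    return dest
```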
trunk/5cde34473c33ed7f8df07489783a2b86058ebb3f
Fix `MakeTensor::computeStorageSize()` (#158690)

For a tensor with a non-zero storage offset, the offset must be multiplied by the element size.

Add a regression test that creates a Tensor over an array of 6 elements with offset 3, which before the fix crashed with:

```
C++ exception with description "setStorage: sizes [3, 3], strides [0, 1], storage offset 3, and itemsize 4 requiring a storage size of 24 are out of bounds for storage of size 15
Exception raised from checkInBoundsForStorage at /Users/nshulga/git/pytorch/pytorch/aten/src/ATen/native/Resize.h:123 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>) + 56 (0x104a9cd44 in libc10.dylib)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) + 120 (0x104a9a05c in libc10.dylib)
frame #2: void at::native::checkInBoundsForStorage<long long>(c10::ArrayRef<long long>, c10::ArrayRef<long long>, long long, caffe2::TypeMeta const&, c10::Storage const&) + 656 (0x111dbd314 in libtorch_cpu.dylib)
frame #3: void at::native::setStrided<long long>(at::Tensor const&, c10::ArrayRef<long long>, c10::ArrayRef<long long>, long long) + 152 (0x111dcd22c in libtorch_cpu.dylib)
frame #4: at::native::as_strided_tensorimpl(at::Tensor const&, c10::ArrayRef<long long>, c10::ArrayRef<long long>, std::__1::optional<long long>) + 312 (0x111dccf98 in libtorch_cpu.dylib)
frame #5: c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, std::__1::optional<c10::SymInt>), &at::(anonymous namespace)::(anonymous namespace)::wrapper_CPU__as_strided(at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, std::__1::optional<c10::SymInt>)>, at::Tensor, c10::guts::typelist::typelist<at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, std::__1::optional<c10::SymInt>>>, at::Tensor (at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, std::__1::optional<c10::SymInt>)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, std::__1::optional<c10::SymInt>) + 104 (0x1129a1e94 in libtorch_cpu.dylib)
frame #6: at::_ops::as_strided::call(at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, std::__1::optional<c10::SymInt>) + 476 (0x112200ad0 in libtorch_cpu.dylib)
frame #7: at::Tensor::as_strided(c10::ArrayRef<long long>, c10::ArrayRef<long long>, std::__1::optional<long long>) const + 236 (0x1115db098 in libtorch_cpu.dylib)
frame #8: at::native::expand(at::Tensor const&, c10::ArrayRef<long long>, bool) + 348 (0x111dcc0d4 in libtorch_cpu.dylib)
frame #9: c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, bool), &torch::ADInplaceOrView::(anonymous namespace)::expand(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, bool)>, at::Tensor, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, bool>>, at::Tensor (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, bool)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, bool) + 116 (0x1157ac410 in libtorch_cpu.dylib)
frame #10: c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, bool), &torch::autograd::VariableType::(anonymous namespace)::expand(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, bool)>, at::Tensor, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, bool>>, at::Tensor (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, bool)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, bool) + 992 (0x114e8b010 in libtorch_cpu.dylib)
frame #11: at::_ops::expand::call(at::Tensor const&, c10::ArrayRef<c10::SymInt>, bool) + 316 (0x112743c90 in libtorch_cpu.dylib)
frame #12: at::expand_size(at::Tensor const&, c10::ArrayRef<long long>) + 164 (0x1047d82b4 in basic)
frame #13: BasicTest_TestForBlobResizeCPU_Test::TestBody() + 284 (0x1047d8048 in basic)
```

Pull Request resolved: #158690
Approved by: https://github.com/angelayi
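A hedged sketch of the arithmetic behind the error above: `required_storage_bytes` is an illustrative helper, and the "buggy" formula is a reconstruction of the pre-fix behavior (offset added in elements rather than bytes), not the actual C++ code.

```python
def required_storage_bytes(sizes, strides, storage_offset, itemsize):
    # Bytes setStorage requires: the byte offset of one past the last
    # reachable element, mirroring the arithmetic in checkInBoundsForStorage.
    if any(s == 0 for s in sizes):
        return 0  # empty tensors need no storage
    last = sum((size - 1) * stride for size, stride in zip(sizes, strides))
    return (storage_offset + last + 1) * itemsize

# The case from the error message: sizes [3, 3], strides [0, 1],
# storage offset 3, itemsize 4 (float32) -> 24 bytes required.
assert required_storage_bytes([3, 3], [0, 1], 3, 4) == 24

# Reconstruction of the pre-fix bug: the offset was added in elements,
# not bytes, so the computed storage size was only (2 + 1) * 4 + 3 = 15.
buggy = (sum((s - 1) * st for s, st in zip([3, 3], [0, 1])) + 1) * 4 + 3
assert buggy == 15
```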
trunk/5b40f6581eac8a2e92af8dd986df7c22ad4584ce
Revert "Add warning about removed sm50 and sm60 arches (#158301)" This reverts commit fb731fe. Reverted #158301 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](#158301 (comment)))
trunk/2c16eb9f3db0ba68520e5832d8bb6d3d875bdaeb
[dynamo] Support more basic output types for `nonstrict_trace` (#157969)

Fixes #157397 and improves the user-facing error message for the remaining unsupported cases.

Pull Request resolved: #157969
Approved by: https://github.com/zou3519
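For context, a hedged sketch of `nonstrict_trace` used inside a compiled region; the decorator import path and the dict return here are assumptions based on the feature's name, not the exact output types this PR enables.

```python
import torch
from torch._dynamo import nonstrict_trace  # assumption: decorator lives here

@nonstrict_trace
def lookup(x: torch.Tensor) -> dict:
    # Traced non-strictly rather than inlined by Dynamo; returning a plain
    # dict exercises one of the "basic output types".
    return {"out": x.sin()}

@torch.compile(fullgraph=True)
def f(x: torch.Tensor) -> torch.Tensor:
    return lookup(x)["out"] + 1

print(f(torch.randn(3)))
```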
trunk/fac0be7b9c80f20bbff1e813225dcbced7ff4d31
[async-TP] Turn asserts back into silent skips (#158572)

#149946 modified some checks that verify whether async-TP is "applicable" to a given collective operation in a graph. Before, the pattern-matching + replacement would simply be skipped; now these checks are asserts that fail and raise. This causes concrete issues in some graphs that use 2-dimensional device meshes (e.g., TP + CP) where only one dimension has symm-mem enabled. See #158569.

This PR turns these asserts back into harmless early exits. Note that this only needed to be done for reduce-scatters, as it was already the case for all-gathers.

Pull Request resolved: #158572
Approved by: https://github.com/danielvegamyhre, https://github.com/atalman
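A hedged sketch of the change's shape, with illustrative names (`symm_mem_enabled`, `try_fuse_reduce_scatter`) rather than the async-TP pass's real functions: the applicability check becomes a silent early exit instead of an assert.

```python
def symm_mem_enabled(mesh_dim: str) -> bool:
    # Illustrative stand-in: pretend only the "tp" mesh dimension has
    # symmetric memory enabled, as in the TP + CP scenario from #158569.
    return mesh_dim == "tp"

def try_fuse_reduce_scatter(mesh_dim: str) -> bool:
    """Return True if the async-TP replacement applies, False if skipped."""
    if not symm_mem_enabled(mesh_dim):
        # Before the fix this was an assert that raised on e.g. the "cp"
        # dimension; now the collective is left unchanged in the graph.
        return False
    # ... perform the pattern-match replacement here ...
    return True

assert try_fuse_reduce_scatter("tp") is True
assert try_fuse_reduce_scatter("cp") is False  # silently skipped, no raise
```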