Releases · NVIDIA-NeMo/NeMo · GitHub
Release timestamps as listed on the page: 07 Oct 19:58, 07 Oct 00:36, 30 Sep 17:10, 25 Sep 22:10, 03 Oct 14:36, 03 Aug 16:50, 25 Jul 18:12, 08 Jul 22:29, 09 Jul 00:37, 02 Jul 20:49
25.09-alpha.rc2
f1ba52e
Update lora.py Signed-off-by: Michał Marcinkiewicz <43240942+mmarcinkiewicz@users.noreply.github.com>
NVIDIA Neural Modules 2.5.0
ddcb2d6
Highlights
- Collections:
  - LLM
    - Nano v2 12B and 9B
  - Speech
    - New SpeechLM2 collection
    - Streaming Sortformer model
    - Deprecate Confidence Ensemble models
    - parakeet-tdt-0.6b-v3 and canary-1b-v2 models
    - Added chunk inference support with .transcribe() for canary based models
    - Enable prediction of timestamps with streaming ASR
    - Improve ASR models' invariance to padding/batch size
    - Qwen prompt format support, SALM generation fixes
    - High-level SALM model.generate API closely resembling HF models
    - SALM model initialization with time/memory optimization
    - SpeechLM2: fixed excessive padding, support on-the-fly resampling for SALM
- Automodel and Export-Deploy functionality is now available in separate repositories and is deprecated in NeMo2
Detailed Changelogs:
ASR
Changelog
- Modernize logger interface by @emmanuel-ferdman :: PR: #13783
- Higher-level API for SALM.generate by @pzelasko :: PR: #14034
- add/refactor docs for asr lm customization by @lilithgrigoryan :: PR: #14088
- Improve NEST GPU Utilization 1/N by @MahmoudAshraf97 :: PR: #14086
- Improve ASR models' invariance to padding/batch size by @pzelasko :: PR: #13827
- Clean up transducer decoding initialization by @artbataev :: PR: #14112
- Improve NEST GPU Utilization 2/N by @MahmoudAshraf97 :: PR: #14089
- GPU-accelerated Phrase-Boosting (GPU-PB) for AED decoding by @andrusenkoau :: PR: #14108
- Fix decoding with ngpu-lm when training (#13994) by @hoangtran9122 :: PR: #13995
- fix eval_beamsearch_ngram_ctc script by @lilithgrigoryan :: PR: #14238
- fix wrong typing for ctc-ws context graph by @andrusenkoau :: PR: #14262
- fix frame vad by @stevehuang52 :: PR: #14337
- Improve NEST GPU Utilization 3/N by @MahmoudAshraf97 :: PR: #14234
- remove confidence ensemble models by @lilithgrigoryan :: PR: #14343
- Fix ASR decoding issues with CUDA graphs in training by @artbataev :: PR: #14184
- Streaming Sortformer release PR01: uploading bugfixes, refactored variables and yaml file name changes by @tango4j :: PR: #14416
- Streaming Sortformer release PR02: unit tests for streaming models and modules by @tango4j :: PR: #14417
- GPU-accelerated Phrase-Boosting (GPU-PB) for CTC, RNN-T, and TDT decoding by @andrusenkoau :: PR: #14277
- Fix subsampling chunking test by @monica-sekoyan :: PR: #14452
- Canary2 with NFA by @monica-sekoyan :: PR: #14121
- Initial Chunking by @nune-tadevosyan :: PR: #14321
- Chunking fix by @nune-tadevosyan :: PR: #14482
- Tutorial and doc update by @nune-tadevosyan :: PR: #14484
- Streaming Sortformer release PR03: NeMo documentations and tutorial notebook by @tango4j :: PR: #14388
- Add wget_from_nemo by @nune-tadevosyan :: PR: #14623
- Downgrade "datasets" library version in ASR tutorial to ensure compatibility with HF Datasets used by @KunalDhawan :: PR: #14685
- Canary tutorial fix by @nune-tadevosyan :: PR: #14708
- Force activations and weights cast to FP32 Jasper Encoder Squeeze-Excite by @erastorgueva-nv :: PR: #14715
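The changelog entries above share a regular shape, "- title by @author :: PR: #number", so they can be tallied mechanically. The following is a minimal sketch under that assumption; the names Entry, ENTRY_RE, and parse_entry are hypothetical helpers introduced for illustration and are not part of NeMo or of any GitHub tooling.

```python
import re
from collections import Counter
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    """One changelog line split into its three parts (hypothetical helper type)."""
    title: str
    author: str
    pr: int


# Assumed entry shape: "- <title> by @<author> :: PR: #<number>".
# The greedy title group means a title that itself contains " by @"
# is still split at the last occurrence.
ENTRY_RE = re.compile(r"^- (?P<title>.+) by @(?P<author>\S+) :: PR: #(?P<pr>\d+)$")


def parse_entry(line: str) -> Optional[Entry]:
    """Return the parsed entry, or None when the line is not a changelog entry."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        return None
    return Entry(match["title"], match["author"], int(match["pr"]))


# A few lines copied from the ASR changelog, plus a section heading.
sample = [
    "Changelog",
    "- Modernize logger interface by @emmanuel-ferdman :: PR: #13783",
    "- Higher-level API for SALM.generate by @pzelasko :: PR: #14034",
    "- Improve ASR models' invariance to padding/batch size by @pzelasko :: PR: #13827",
]

entries = [e for e in map(parse_entry, sample) if e is not None]
authors = Counter(e.author for e in entries)  # number of entries per author
```

Lines that do not match, such as section headings or entries truncated by the page, come back as None and are skipped rather than guessed at.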
TTS
Changelog
NLP / NMT
Changelog
- add extra params for MegatronDataSampler by @dimapihtar :: PR: #13956
- Modernize logger interface by @emmanuel-ferdman :: PR: #13783
- remove dialogue collection by @dimapihtar :: PR: #14087
- remove QA collection by @dimapihtar :: PR: #14092
- remove text nlp collection by @dimapihtar :: PR: #14110
- remove nlp modules by @dimapihtar :: PR: #14127
- remove rag collection by @dimapihtar :: PR: #14157
- remove nmt collection by @dimapihtar :: PR: #14191
- Fix importerror in transformer_lm_model after nlp module removals by @chtruong814 :: PR: #14199
- fix QA comments NVBug by @huvunvidia :: PR: #14196
- Temporarily Remove Encoder PP Support by @yaoyu-33 :: PR: #14167
- remove mixins collections by @dimapihtar :: PR: #14281
- feat: print expert groups on megatron init by @clumsy :: PR: #13874
- [speechlm2] [lhotse] sharegpt data and testloader by @huckiyang :: PR: #14294
- Add notebook for LoRA on GPT-OSS-20B by @shashank3959 :: PR: #14439
- Sketch dist-ckpt content versioning by @mikolajblaz :: PR: #13839
- Change to enable full iteration CUDA graph for LLMs by @vasunvidia :: PR: #14077
Text Normalization / Inverse Text Normalization
Changelog
- Check lightning and core imports in install test by @chtruong814 :: PR: #14403
Export
Changelog
- ci: Set L2_NeMo_2_Export_Deploy_Query_In_Framework to be optional by @chtruong814 :: PR: #13946
- Remove old export doc by @oyilmaz-nvidia :: PR: #14292
- Llama4 Export: Remove outdated MLP weight transform by @suiyoubi :: PR: #14297
- Update mllama hf import/export for transformers 4.53 by @meatybobby :: PR: #14327
Bugfixes
Changelog
- Bugfix for Hyena to the get_t function which comes up when doing longer context inference by @jstjohn :: PR: #14256
- fix skipped cuHyena kernel while training by @farhadrgh :: PR: #14365
- Remove flaky Evo2 dataset performance test by @jstjohn :: PR: #14371
- Use module prefix in restore_modelopt_state by @jenchen13 :: PR: #14384
Uncategorized:
Changelog
- Version bump to 2.5.0rc0.dev0 by @github-actions[bot] :: PR: #13944
- [Llama4] Enable tp comm overlap for llama4 by @gdengk :: PR: #13940
- Fix for Squad Dataset Download by @rhmukundan :: PR: #13893
- add nmh HF conversion by @JRD971000 :: PR: #13941
- Speechlm2 SALM improvements by @pzelasko :: PR: #13829
- fix dataset issue by @dimapihtar :: PR: #13953
- Editing MMLU to pull from the correct repo by @ruchaa-apte :: PR: #13991
- move classes to module to use target feature (#14023) by @nithinraok :: PR: #14031
- Add Nemotron-H prompt format, fix cut-to-conversation custom attr propagation by @pzelasko :: PR: #13963
- Bump release_library template to v0.40.0 by @chtruong814 :: PR: #14046
- [automodel] add support for layer-freezing by @akoumpa :: PR: #14000
- [Qwen3] Recipe config bug fix by @gdengk :: PR: #14084
- Add TE import guard in qwen2vl vision module by @chtruong814 :: PR: #14091
- Update bitsandbytes dependency to v0.46.0 by @pramodk :: PR: #14050
- Update FSDP2 docstring by @BoxiangW :: PR: #14105
- Interface to enable fsdp-double-buffer without enabling NCCL-UB by @youngeunkwon0405 :: PR: #14076
- SpeechLM2 SALM: load ckpt faster, with less GPU memory by @pzelasko :: PR: #14113
- Add object_storage_cache_path to PreTrainingDataModule by @shunjiad :: PR: #14103
- Update changelog for r2.3.0 by @github-actions[bot] :: PR: #14160
- Fix FLUX test with correct env var by @suiyoubi :: PR: #14149
- add mmap_bin_files param by @dimapihtar :: PR: #14122
- Add option to suppress import checks in Dockerfile.speech by @artbataev :: PR: #14185
- Safely import optional python packages by @roclark :: PR: #13936
- Set flux test as optional by @chtruong814 :: PR: #14190
- Revert "Safely import optional python packages (#13936)" by @chtruong814 :: PR: #14197
- Fix "Safely import optional python packages (#13936)" by @chtruong814 :: PR: #14198
- Add fix for evo2 generate/inference by @jwilber :: PR: #14027
- Fixing file path suffix by @gautham-kollu :: PR: #14179
- Update AVLM finetune example for vanilla fine-tuning by @huvunvidia :: PR: #14232
- [finetune] Add dataset_kwargs to prepare packed sequence data by @jiajunly :: PR: #14169
- Allow exception in hf ckpt load attempt before fallback to standard l… by @trvachov :: PR: #14214
- Load master weights from checkpoint by @kunlunl :: PR: #14072
- Add deploy lora adapter portion by @ruchaa-apte :: PR: #14255
- fix speechlm lhotse loading nemo_tarred by @stevehuang52 :: PR: #14314
- Update changelog for r2.4.0 by @github-actions[bot] :: PR: #14334
- Flaky test timing out: @pytest.mark.pleasefixme by @pablo-garay :: PR: #14351
- Support dump perf recipe diff from base recipe by @guyueh1 :: PR: #14206
- Bugfix degenerate bases evo2 dataset by @jstjohn :: PR: #14359
- Hyena support for flash decode API by @jstjohn :: PR: #14315
- Fix Gemma2/3 & Llava (Next) & Llama4 conversion issue with latest transformers by @suiyoubi :: PR: #14367
- fix: reduce the excessive test time of test_msdd_diar_inference by @tango4j :: PR: #14366
- SpeechLM2: S2S->S2T data reader, excessive padding fixes by @pzelasko :: PR: #14124
- chore: Release 2.5.0rc0 by @ko3n1g :: PR: #14389
- Add pyxis flag for container writable. by @sudostock :: PR: #14395
- [MoE] Partial Cudagraph support for MoE by @gdengk :: PR: #14362
- Revert "[MoE] Partial Cudagraph support for MoE (#14362)" by @chtruong814 :: PR: #14402
- Update AVLM recipes for NeMo-CI runs by @huvunvidia :: PR: #14397
- Remove nemo1 multimodal and vision by @yaoyu-33 :: PR: #14095
- Fix LazyNeMoIterator supervision for multi-channel cuts by @anteju :: PR: #14409
- Bump Mcore to 7f7439f by @chtruong814 :: PR: #14373
- Use cuhyena rearrange when available. by @moradza :: PR: #14383
- Fix model training/eval state after PTL validation loop by @paul-gibbons :: PR: #14152
- Add deprecation notice to eval code by @athitten :: PR: #14316
- Streaming Sortformer release PR04: Adding functional tests for streaming sortformer by @tango4j :: PR: #14435
- QWEN2.5-VL 7B Performance Recipe by @tomlifu :: PR: #14401
- Discount FLOPs in dot-product att by @erhoo82 :: PR: #14424
- Bump to pytorch 25.06 and newer TE commit by @chtruong814 :: PR: #14423
- Enable precision aware optimizer for dsv3 by @guyueh1 :: PR: #14444
- Make VBoost activation conditional by @bdubauski :: PR: #14458
- cuHyena FFTConv support for Hyena Long Implicit (LI) Layer by @far...
NVIDIA Neural Modules 2.4.1
2919fed
Detailed Changelogs:
Uncategorized:
Changelog
- Update package_info.py by @ko3n1g :: PR: #14400
- Patch to address issue 14392 by @youngeunkwon0405 :: PR: #14398
- Cherry pick Fix callbacks in DSV3 script (14350) into r2.4.0 by @chtruong814 :: PR: #14370
- Cherry pick Change Llama Embedding Tutorial to use SFT by default (14231) into r2.4.0 by @chtruong814 :: PR: #14303
- Cherrypick calculate_per_token_loss requirement for context parallel (#14065) (#14282) into r2.4.0 by @chtruong814 :: PR: #14448
- Pin nvidia-lm-eval to 25.6.1 by @chtruong814 :: PR: #14470
NVIDIA Neural Modules 2.3.3
26be5f5
- This release addresses known security issues. For the latest NVIDIA Vulnerability Disclosure Information, visit https://www.nvidia.com/en-us/security/. For acknowledgement, please reach out to the NVIDIA PSIRT team at PSIRT@nvidia.com.
- Pin nvidia-lm-eval to 25.5
25.09-alpha.rc1
[Flux] Add cuda_graph_scope and cache images ids for full iteration c…
NVIDIA Neural Modules 2.5.0rc0
620d2ba
Prerelease: NVIDIA Neural Modules 2.5.0rc0 (2025-08-03)
NVIDIA Neural Modules 2.4.0
2381f42
Highlights
- Collections:
  - Speech
    - Batched beam search for transducers (RNN-T and TDT)
    - RNNT/TDT buffered/streaming inference + batched decoding support in cache-aware
    - add support for CTC batched beam search with GPU-LM
    - Key fixes
      - Punctuation Marks in Timestamps
      - Fix timestamps when cuda graphs enabled
      - Fix masking of <pad> tokens in AED inference
      - TDT streaming inference fix
  - LLM
    - Qwen 3 235B-A22B Perf Optimized
    - DeepSeek V3 Perf Optimized
    - Gemma3 support from Google
    - Embedding and Reranker models
  - MM
    - Llama 4
    - AVLM
- Training performance (speed)
  - NVL sharp + IB sharp for DP/FSDP-communications on H100 and B200
  - MXFP8 with TP communication overlap
  - MXFP8 with reduced memory allocation
  - FP8 sub-channel recipe (128x128 for weight and 1x128 for activation)
  - cudnn fused attention for MLA (both Hopper and Blackwell)
  - Advanced custom asymmetric pipelining (for MTP, loss func, and embd)
  - BF16 optimizer for model memory saving
  - CUDA graph fix for fine-tuning benchmarks
  - CUDA graph support for LLAMA4
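The "FP8 sub-channel recipe (128x128 for weight and 1x128 for activation)" item refers to block-wise scaling, where each tile of a tensor gets its own scale factor instead of one scale per tensor. The sketch below only illustrates that general idea in plain Python. It is not NeMo's or Transformer Engine's implementation; the function names are hypothetical, and the choice of the E4M3 format (largest finite value 448) is an assumption made for the example.

```python
from typing import List

Matrix = List[List[float]]

FP8_E4M3_MAX = 448.0  # largest finite magnitude representable in FP8 E4M3


def block_scales(x: Matrix, block_rows: int, block_cols: int) -> Matrix:
    """Return one scale per (block_rows x block_cols) tile of x.

    Dividing a tile by its scale maps the tile's largest absolute
    value onto FP8_E4M3_MAX, so every tile uses the full FP8 range.
    """
    n_rows, n_cols = len(x), len(x[0])
    scales: Matrix = []
    for r0 in range(0, n_rows, block_rows):
        row_scales = []
        for c0 in range(0, n_cols, block_cols):
            amax = max(
                abs(x[r][c])
                for r in range(r0, min(r0 + block_rows, n_rows))
                for c in range(c0, min(c0 + block_cols, n_cols))
            )
            # An all-zero tile gets a neutral scale to avoid dividing by zero.
            row_scales.append(amax / FP8_E4M3_MAX if amax > 0 else 1.0)
        scales.append(row_scales)
    return scales


def weight_scales(w: Matrix) -> Matrix:
    """One scale per 128x128 tile, the weight granularity named in the recipe."""
    return block_scales(w, 128, 128)


def activation_scales(a: Matrix) -> Matrix:
    """One scale per 1x128 row segment, the activation granularity named in the recipe."""
    return block_scales(a, 1, 128)
```

For a 256x256 weight matrix this yields a 2x2 grid of scales, while a 2x256 activation also yields a 2x2 grid, one scale per 128-element segment of each row. The finer activation granularity limits how far a single outlier value can degrade the precision of its neighbours.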
Detailed Changelogs
ASR
Changelog
- ci: Fix ASR container by @ko3n1g :: PR: #13288
- Set L2_Segmentation_Tool_Parallel_ctc_segmentation test to be optional by @chtruong814 :: PR: #13296
- Revert "WebDataset URL refactoring" by @ko3n1g :: PR: #13421
- Update flagged docs links by @erastorgueva-nv :: PR: #13391
- [Docs] Fix incorrectly formatted reference tags by @erastorgueva-nv :: PR: #13445
- Update CP by @pablo-garay :: PR: #13532
- Tdt buffered inference fix by @hainan-xv :: PR: #13500
- Fix transcribe when nbest hypotheses are returned by @lilithgrigoryan :: PR: #13540
- Set ASR test to be optional by @chtruong814 :: PR: #13633
- Enabling chunked inference for AED models in asr_evaluator by @melllinia :: PR: #13674
- Ko3n1g/chore/asr only by @ko3n1g :: PR: #13704
- decompressing joblib file before checking it by @Ssofja :: PR: #13732
- Revert "decompressing joblib file before checking it (#13732)" by @chtruong814 :: PR: #13791
- Punctuation Marks in Timestamps by @monica-sekoyan :: PR: #13353
- AIStore with Webdataset by @monica-sekoyan :: PR: #13604
- Update to add default for dataclass variables by @nithinraok :: PR: #13814
- This PR addresses to known security issues by @Ssofja :: PR: #13804
- remove model_stride var by @nithinraok :: PR: #13867
- add CTC batched beam search by @lilithgrigoryan :: PR: #13337
- Clean up streaming ASR script and tests by @artbataev :: PR: #13894
- add NGPU-LM fusion during CTC greedy by @lilithgrigoryan :: PR: #13917
TTS
Changelog
- Revert "WebDataset URL refactoring" by @ko3n1g :: PR: #13421
- Update flagged docs links by @erastorgueva-nv :: PR: #13391
- [Docs] Fix incorrectly formatted reference tags by @erastorgueva-nv :: PR: #13445
- Update CP by @pablo-garay :: PR: #13532
- fix: vpp stage refactoring to match mcore by @ZhiyuLi-Nvidia :: PR: #13673
- AIStore with Webdataset by @monica-sekoyan :: PR: #13604
NLP / NMT
Changelog
- Migrate Hyena to Megatron inference_context. by @cspades :: PR: #13436
- Update CP by @pablo-garay :: PR: #13532
- fix broken links by @dimapihtar :: PR: #13544
- Add nlp import checks by @thomasdhc :: PR: #13563
- PTQ model support, quant_cfg, and documentation updates by @janekl :: PR: #13519
- feat - GPTSFTChatDataset alignment with OpenAI Messages, compatibility with packed sequences by @soluwalana :: PR: #13367
- fix: vpp stage refactoring to match mcore by @ZhiyuLi-Nvidia :: PR: #13673
- Fix resume with MegatronPretrainingBatchSampler by @ashors1 :: PR: #13565
- Punctuation Marks in Timestamps by @monica-sekoyan :: PR: #13353
- Revert Adding more doc-strings to megatron_parallel.py #12767 by @ko3n1g :: PR: #13824
- reasoning model evaluation mmlu gpqa by @ruchaa-apte :: PR: #13880
- Remove unused DynamicRetrievalServer and Bert dataset loader classes by @dimapihtar :: PR: #14209
- Huvu/avlm qafix cherrypick from by @huvunvidia :: PR: #14253
Export
Changelog
- Improve Nemo2Exporter for Models Using Custom Modelling Files on HF by @suiyoubi :: PR: #13400
- Adding more export tests by @oyilmaz-nvidia :: PR: #13410
- Add Warning to Export when output_path exists by @suiyoubi :: PR: #13465
- Move libsox-fmt-all from Dockerfile.ci.export_deploy to Dockerfile.ci by @chtruong814 :: PR: #13452
- ci: Remove trt-llm breakpoint by @ko3n1g :: PR: #13499
- Add Qwen2VL export_ckpt by @AtsunoriFujita :: PR: #13398
- Add MLlama export_ckpt by @AtsunoriFujita :: PR: #13346
- Update vLLMExporter to use vLLM V1 by @janekl :: PR: #13498
- Add vLLM Mixtral and TRT-LLM qnemo export tests (plus a couple of bugfixes) by @janekl :: PR: #13697
- Fix Qwen3 export + misc by @cuichenx :: PR: #13679
- Extra int cast for successful tracing during ONNX export by @janekl :: PR: #13782
- FP8 lora export by @cuichenx :: PR: #13748
- Add PEFT export check by @cuichenx :: PR: #13835
- Update llm api import_ckpt/export_ckpt docstring by @meatybobby :: PR: #13714
- Use modelopt export and disable dataset calibration for weight only PTQ by @jenchen13 :: PR: #13756
Bugfixes
Uncategorized
Changelog
- build: various bumps by @ko3n1g :: PR: #13285
- ci: Fixes to selective triggering by @ko3n1g :: PR: #13287
- ci: Set timeout by @ko3n1g :: PR: #13294
- Set L2_NeMo_2_T5_Pretraining test as optional by @chtruong814 :: PR: #13282
- Add test environment approval step for CI by @chtruong814 :: PR: #13297
- update num nodes in deepseek v3 finetune recipe by @cuichenx :: PR: #13314
- ci: Increase cache pool by @ko3n1g :: PR: #13306
- Rename adam_with_cosine_annealing as adam since cosin LR is not setup by @ShriyaRishab :: PR: #13315
- ci: Update test queue bot to not assume a workflow is launched from a PR by @chtruong814 :: PR: #13318
- Fix TE pytorch attention doc link by @thomasdhc :: PR: #13327
- ci: Add all recent buildcaches to update-buildcache job by @ko3n1g :: PR: #13289
- Fix neva notebook by @yaoyu-33 :: PR: #13334
- Fix transformer offline for CI/CD llama4 tests by @yaoyu-33 :: PR: #13339
- [automodel] convert lm head to full tensor before passing to lce by @yuanzhedong :: PR: #13319
- ci: No dups in queue by @ko3n1g :: PR: #13352
- ci(hotfix): VLM CPU unit tests by @ko3n1g :: PR: #13348
- vLLM==0.8.5 update by @janekl :: PR: #13350
- ci: Allow bypassing approval by @ko3n1g :: PR: #13365
- Avoid the need to specify optional attributes for lhotse/nemo reader functions by @pzelasko :: PR: #13307
- ci: Fix selective-triggering for non-PR events by @ko3n1g :: PR: #13374
- ci: Revert no-concurrency-group-on-main by @ko3n1g :: PR: #13375
- ci: Improve no-fail-fast mechanism by @ko3n1g :: PR: #13370
- 2d buckets estimation fix by @monica-sekoyan :: PR: #13377
- ci: Fix scheduled runs by @ko3n1g :: PR: #13378
- Ko3n1g/ci/fix nightly runs by @ko3n1g :: PR: #13382
- [automodel] fix none issue in dataset for qwen model by @yuanzhedong :: PR: #13311
- update table by @akoumpa :: PR: #13397
- Improve test coverage for audio modules by @anteju :: PR: #13333
- Disable failing maxine loss test by @anteju :: PR: #13361
- Ko3n1g/ci/no notification on cancel by @ko3n1g :: PR: #13403
- document fp8_recipe by @akoumpa :: PR: #13405
- Weekly bump main by @ko3n1g :: PR: #13408
- Handle boolean args for performance scripts and log received config by @guyueh1 :: PR: #13291
- [automodel] add FirstRankPerNode by @akoumpa :: PR: #13373
- tests: Disable flaky audio test by @ko3n1g :: PR: #13429
- ci: Disable flaky audio test by @ko3n1g :: PR: #13435
- Fix loss compute and reduction by @xrennvidia :: PR: #13295
- ci: Skip link check on github links by @chtruong814 :: PR: #13425
- Add NCCL cfg interface to perf scripts by @erhoo82 :: PR: #13407
- ci: Success only if Run CICD label attached by @ko3n1g :: PR: #13430
- ci: Add tests to selective triggering by @ko3n1g :: PR: #13404
- ci: Remove jq by @ko3n1g :: PR: #13440
- ci: Fix deps tree for tests by @ko3n1g :: PR: #13443
- Ko3n1g/ci/fix dependency tree by @ko3n1g :: PR: #13448
- Adding additional unit tests for the deploy module by @pthombre :: PR: #13411
- [Audio] fix a flaky test (and also make some tests run faster) by @racoiaws :: PR: #13439
- [automodel] ignore tail padding in TPS calculation by @akoumpa :: PR: #13329
- Ko3n1g/ci/selective triggering 3 by @ko3n1g :: PR: #13460
- ci: Disable broken neva tests by @ko3n1g :: PR: #13461
- fix speechlm data module by @stevehuang52 :: PR: #13362
- ci: Enter queue only with passing linting by @ko3n1g :: PR: #13462
- Adding tests for Schroedinger Bridge model by @nasretdinovr :: PR: #13401
- add more detailed description by @dimapihtar :: PR: #13464
- [Audio] tests for score-based and flow matching enhancement models by @racoiaws :: PR: #13406
- Use expandable cuda memory segmentation by @erhoo82 :: PR: #13418
- Fix llava tokenizer caused nan issue by @yaoyu-33 :: PR: #13466
- Remove cuda method from ModelPT by @erastorgueva-nv :: PR: #13394
- Fix BNR 2 unit test + input, case where input length was not specified by @nitin9252 :: PR: #13467
- ci: Do not run any tests if no match is found by @ko3n1g :: PR: #13479
- Ko3n1g/ci/selective triggering 4 by @ko3n1g :: PR: #13489
- Fix typo in the performance script by @youngeunkwon0405 :: PR: #13487
- ci: No runs on main by @ko3n1g :: PR: #13490
- ci: Upload on schedule by @ko3n1g :: PR: #13491
- ci: Run selective triggering on dockerfiles and dependencies by @ko3n1g :: PR: #13493
- [automodel] fallback FP8 + LCE -> FP8 + CE by @akoumpa :: PR: #13349
- Update changelog for r2.3.0 by @github-actions[bot] :: PR: #13501
- Update 2.3.0...
NVIDIA Neural Modules 2.3.2
f98ef1d
This release addresses known security issues. For the latest NVIDIA Vulnerability Disclosure Information, visit https://www.nvidia.com/en-us/security/. For acknowledgement, please reach out to the NVIDIA PSIRT team at PSIRT@nvidia.com.
NVIDIA Neural Modules 2.4.0rc2
7ac5a8e
Prerelease: NVIDIA Neural Modules 2.4.0rc2 (2025-07-09)
NVIDIA Neural Modules 2.4.0rc1
4c6fb0c
Prerelease: NVIDIA Neural Modules 2.4.0rc1 (2025-07-02)