Dynamo Release v0.3.2 (ai-dynamo/dynamo)
Commit 50f3636. This commit was created on GitHub.com and signed with GitHub's verified signature.
Dynamo is a high-performance, low-latency inference framework designed to serve generative AI models across any framework, architecture, or deployment scale. It is an open source project under the Apache 2.0 license. Dynamo is available for installation via pip wheels and containers from NVIDIA NGC.
As a vendor-neutral serving framework, Dynamo supports multiple large language model (LLM) inference engines to varying degrees:
- NVIDIA TensorRT-LLM
- vLLM
- SGLang
Major Features and Improvements
Engine Support and Routing
- An example standalone router is now available for use outside of Dynamo (#1409).
- The new SLA-based planner dynamically manages resource allocation based on service-level objectives (#1420).
- Data-parallel vLLM worker setups are now supported (#1513).
- SGLang support was extended for DeepEP deployments (#1120).
- Clean shutdown is now available for vllm_v1 and SGLang engines (#1562, #1764).
- Experimental support for WideEP with EPLB aggregation and disaggregation is now available for TRTLLM (#1652, #1690).
- Approximate KV cache residency and predicted active KV blocks were added for improved routing efficiency (#1636, #1638, #1731).
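The routing item above can be illustrated with a toy sketch. This is not Dynamo's router: the `WorkerState` structure, the `pick_worker` function, the block size, and the cost formula are all assumptions made for illustration. The idea shown is that a router can keep its own approximate record of which prefix blocks each worker probably holds, plus a predicted count of active KV blocks, and send a request to the worker with the lowest estimated cost.

```python
from dataclasses import dataclass, field


@dataclass
class WorkerState:
    """Router-side guess about one worker (hypothetical structure)."""
    cached_blocks: set = field(default_factory=set)  # block hashes believed resident
    active_blocks: int = 0  # predicted KV blocks used by in-flight requests


def block_hashes(tokens, block_size=4):
    """Hash each full block together with its prefix, so equal hashes imply equal prefixes."""
    hashes = []
    prefix = ()
    for start in range(0, len(tokens) - block_size + 1, block_size):
        prefix = prefix + tuple(tokens[start:start + block_size])
        hashes.append(hash(prefix))
    return hashes


def pick_worker(workers, tokens, block_size=4):
    """Return the id of the worker with the lowest estimated cost.

    Assumed cost: blocks that still need prefill + predicted active blocks.
    `workers` must be a non-empty dict of worker id -> WorkerState.
    """
    hashes = block_hashes(tokens, block_size)
    best_id, best_cost = None, None
    for worker_id, state in sorted(workers.items()):
        # Count the leading run of blocks believed to be cached on this worker.
        overlap = 0
        for h in hashes:
            if h not in state.cached_blocks:
                break
            overlap += 1
        cost = (len(hashes) - overlap) + state.active_blocks
        if best_cost is None or cost < best_cost:
            best_id, best_cost = worker_id, cost
    # Update the approximation: assume the chosen worker now holds these blocks.
    chosen = workers[best_id]
    chosen.cached_blocks.update(hashes)
    chosen.active_blocks += len(hashes)
    return best_id
```

In this sketch the router never asks workers what they actually cache; it only remembers its own past decisions, which is why the residency is approximate. A real router would also need eviction and request-completion handling, which are omitted here.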
Observability and Metrics
- Native DCGM and Prometheus integration enables hardware metrics collection and export. Optional Grafana dashboards are provided (#1488, #1701, #1788).
- New Grafana dashboards offer composite software and hardware system visibility (#1788).
- Batch /completions endpoint and speculative decoding metrics are now supported for vLLM (#1626, #1549).
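As a rough illustration of a batch /completions request, the sketch below builds a request body in the OpenAI completions style, where `prompt` is a list of strings. It assumes the Dynamo frontend accepts this OpenAI-style shape; the helper name `build_batch_completion_request` and the model name are hypothetical, and the body is only printed, not sent.

```python
import json


def build_batch_completion_request(model, prompts, max_tokens=32):
    """Build an OpenAI-style /completions body whose `prompt` is a list of strings."""
    if not prompts:
        raise ValueError("prompts must contain at least one string")
    return {"model": model, "prompt": list(prompts), "max_tokens": max_tokens}


body = build_batch_completion_request(
    "example-model",  # hypothetical model name, replace with a served model
    ["Hello, my name is", "The capital of France is"],
)
print(json.dumps(body, indent=2))
```

To use it, POST the JSON body to the /completions endpoint of a running frontend. The host, port, and path prefix depend on the deployment and are not specified in these notes.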
Deployment, Kubernetes, and CLI
- The Kubernetes operator now supports custom entrypoints, command overrides, and simplified graph deployments (#1396, #1708, #1877, #1893).
- Example manifests for multimodal and minimal deployments were added (#1836, #1872).
- Graph Helm chart logic, resource requests, and health probes were improved (#1877, #1888).
- Two new Helm charts are introduced in this release: dynamo-platform and dynamo-crds. They enable modular and robust Kubernetes deployments for a variety of topologies and operational requirements.
- The dynamo-run command line interface now supports the --version flag and improved error handling and validation (#1596, #1674, #1623).
- Docker and Kubernetes deployment workflows were streamlined. Helm charts and container images were improved (#1742, #1796, #1840, #1841).
Developer Experience
- Embedding request handling was improved with frontend tokenization (#1494).
- OpenAI API request validation is now available (#1674).
- Batch embedding and parallel tokenization improve efficiency for batch inference and embedding (#1657).
- The /responses endpoint and additional API features were added (#1694).
Bug Fixes
- Issues related to GPU resource specifications in deployments, container builds, and runtime were fixed (#1826, #1792, #1546).
- Helm chart logic, resource requests, and health probes were corrected (#1877, #1893).
- Error handling and model loading were improved for multimodal and distributed deployments (#1545).
- Metrics publishing and logging were fixed for vLLM, SGLang, and OpenAI endpoints (#1864, #1649, #1639).
- Process cleanup issues were resolved in tests (#1801).
Documentation
- Documentation updates include new guides for Ray setup, architecture diagrams, and deployment modes (#1947, #1697).
- Benchmarking, troubleshooting, and advanced usage scenario documentation was enhanced.
- Notes were added to deprecate outdated connectors (#1964, #1959).
Build, CI, and Test
- Dependency upgrades include protobuf, nats, and etcd (#1876, #1744).
- CI coverage now includes GPU-based and multi-engine tests.
- Container builds now use distroless images for improved security and efficiency (#1570, #1569).
- Fault tolerance tests were added (#1444).
Known Issues
- KVBM is supported only with Python 3.12.
Release Assets
Python Wheels:
Rust Crates:
Containers:
- nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.3.2
- nvcr.io/nvidia/ai-dynamo/tensorrtllm-runtime:0.3.2
- nvcr.io/nvidia/ai-dynamo/sglang-runtime:0.3.2
- nvcr.io/nvidia/ai-dynamo/kubernetes-operator:0.3.2
Helm Charts:
Contributors
Thank you to all contributors for this release. For a full list, refer to the changelog.