Kubernetes Engine (OKE)
Simplify operations of enterprise-grade Kubernetes at scale. Easily deploy and manage resource-intensive workloads such as AI with automatic scaling, patching, and upgrades.
Oracle Cloud Infrastructure: A price-performance leader for Kubernetes
CIO magazine recognizes OCI for its expertise in delivering cutting-edge Kubernetes solutions, supporting scalable and efficient application development.
Why Choose OKE?
-
Price-Performance
OKE is the lowest-cost Kubernetes service among the major hyperscalers, especially for serverless workloads.
-
Autoscaling
OKE automatically adjusts compute resources based on demand, which can reduce your costs.
-
Efficiency
GPUs can be scarce, but OKE job scheduling makes it easy to maximize resource utilization.
-
Portability
OKE is consistent across clouds and on-premises, enabling portability and avoiding vendor lock-in.
-
Simplicity
OKE reduces the time and cost needed to manage the complexities of Kubernetes infrastructure.
-
Reliability
Automatic upgrades and security patching boost reliability for the control plane and worker nodes.
-
Resiliency
Fully automated, native cross-region recovery is available using OCI Full Stack Disaster Recovery.
OKE use cases
OKE powers OCI AI services
Kubernetes is the go-to platform to deploy AI workloads. OKE powers Oracle Cloud Infrastructure (OCI) AI services.
AI model building
– The initial build stage of an AI project involves defining the problem and preparing data to create models.
– Kubernetes clusters can significantly improve efficiency by granting shared access to expensive and often limited GPU resources while providing secure and centrally managed environments.
– Kubeflow, an open source project built on Kubernetes, provides a comprehensive framework that streamlines building, training, and deploying models.
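For a flavor of what that looks like, here is a minimal Kubeflow Pipelines (KFP v2 SDK) sketch; the component body, base image, and pipeline name are illustrative assumptions rather than an OKE-specific recipe.

```python
# Minimal Kubeflow Pipelines (KFP v2) sketch: one lightweight component
# wrapped in a pipeline, compiled to a spec that a Kubeflow deployment
# running on a Kubernetes cluster (such as OKE) can execute.
from kfp import compiler, dsl


@dsl.component(base_image="python:3.11")  # illustrative base image
def train_model(epochs: int) -> str:
    # Placeholder training step; real code would load data and fit a model.
    return f"trained for {epochs} epochs"


@dsl.pipeline(name="toy-training-pipeline")  # hypothetical pipeline name
def toy_pipeline(epochs: int = 3):
    train_model(epochs=epochs)


if __name__ == "__main__":
    # Produces a portable pipeline spec that any conformant KFP backend can run.
    compiler.Compiler().compile(toy_pipeline, package_path="toy_pipeline.yaml")
```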
OKE for AI model building
OKE is built on OCI, offering a complete stack of high-performance infrastructure designed for AI/ML workloads, including:
– The full range of NVIDIA GPUs, including the H100, A100, and A10
– Ultrafast RDMA networks
Using OKE self-managed nodes, you can run AI/ML model-building workloads on your Kubernetes clusters.
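As a rough sketch of how a workload claims one of those shared GPUs, the snippet below uses the official Kubernetes Python client to request a single GPU through the nvidia.com/gpu extended resource (exposed by the NVIDIA device plugin); the pod name, namespace, and CUDA image tag are assumptions for illustration.

```python
# Sketch: request one GPU for a pod via the nvidia.com/gpu extended resource.
from kubernetes import client, config

config.load_kube_config()  # authenticate using your local kubeconfig

pod = client.V1Pod(
    api_version="v1",
    kind="Pod",
    metadata=client.V1ObjectMeta(name="gpu-smoke-test"),  # hypothetical name
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="cuda",
                image="nvidia/cuda:12.4.1-base-ubuntu22.04",  # assumed image tag
                command=["nvidia-smi"],  # just prints the GPU it was allocated
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/gpu": "1"}  # one GPU from the shared pool
                ),
            )
        ],
    ),
)
client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```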
AI model training
– In model training, data scientists select an algorithm and initiate training jobs using prepared data. This stage requires sophisticated scheduling systems to handle the jobs efficiently.
– Kubernetes projects such as Volcano and Kueue help meet these requirements and make efficient use of compute resources (see the sketch after this list).
– Large-scale distributed training requires low-latency internode communications in the cluster. This is where a specialized ultrafast network with remote direct memory access (RDMA) is needed. It enables data to be moved directly to or from an application’s memory, bypassing the CPU to reduce latency.
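To make the scheduling point concrete, here is a minimal sketch of the workload side of Kueue: a suspended batch Job labeled for a Kueue LocalQueue, created with the Kubernetes Python client. The queue name, image, and parallelism are illustrative assumptions, and Kueue itself must already be installed in the cluster.

```python
# Sketch: a suspended Job that Kueue admits (unsuspends) when quota is free.
from kubernetes import client, config

config.load_kube_config()

job = client.V1Job(
    api_version="batch/v1",
    kind="Job",
    metadata=client.V1ObjectMeta(
        name="distributed-train",
        labels={"kueue.x-k8s.io/queue-name": "team-a-queue"},  # hypothetical LocalQueue
    ),
    spec=client.V1JobSpec(
        suspend=True,   # Kueue flips this to False once resources are granted
        parallelism=4,  # four workers, e.g. one per GPU
        completions=4,
        template=client.V1PodTemplateSpec(
            spec=client.V1PodSpec(
                restart_policy="Never",
                containers=[
                    client.V1Container(
                        name="trainer",
                        image="ghcr.io/example/trainer:latest",  # hypothetical image
                        resources=client.V1ResourceRequirements(
                            limits={"nvidia.com/gpu": "1"}
                        ),
                    )
                ],
            )
        ),
    ),
)
client.BatchV1Api().create_namespaced_job(namespace="default", body=job)
```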
OKE for AI model training
OKE is built on OCI, offering a complete stack of high-performance infrastructure designed for AI/ML workloads, including:
– The full range of NVIDIA GPUs, including the H100, A100, and A10
– Low-latency, ultra-high performance RDMA networks
Using OKE self-managed nodes, you can run AI/ML training on your Kubernetes clusters.
AI model inferencing (serving)
– AI model inferencing is where Kubernetes really shines. Kubernetes can automatically scale the number of inference pods up or down based on demand, ensuring efficient use of resources.
– Kubernetes provides sophisticated resource management, including the ability to specify CPU and memory limits for containers.
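A minimal sketch of that behavior, assuming an existing Deployment named model-server: the autoscaling/v2 HorizontalPodAutoscaler below, created with the Kubernetes Python client, keeps average CPU utilization near 70% by scaling between 2 and 20 replicas.

```python
# Sketch: CPU-based HorizontalPodAutoscaler for an inference Deployment.
from kubernetes import client, config

config.load_kube_config()

hpa = client.V2HorizontalPodAutoscaler(
    api_version="autoscaling/v2",
    kind="HorizontalPodAutoscaler",
    metadata=client.V1ObjectMeta(name="model-server-hpa"),
    spec=client.V2HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V2CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="model-server"  # assumed
        ),
        min_replicas=2,
        max_replicas=20,
        metrics=[
            client.V2MetricSpec(
                type="Resource",
                resource=client.V2ResourceMetricSource(
                    name="cpu",
                    target=client.V2MetricTarget(
                        type="Utilization", average_utilization=70
                    ),
                ),
            )
        ],
    ),
)
client.AutoscalingV2Api().create_namespaced_horizontal_pod_autoscaler(
    namespace="default", body=hpa
)
```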
OKE for AI model inference
OKE is designed with resilience at its core, pairing Kubernetes’ built-in pod autoscaling with cluster autoscaling that adds or removes worker nodes based on usage. Worker nodes can be distributed across multiple fault domains and availability domains for high availability.
OKE virtual nodes provide a serverless Kubernetes experience. They only need to scale at the pod level, without ever scaling worker nodes. This allows for quicker scaling and more economical management since service fees are based solely on the pods in use.
Virtual nodes are well-suited for inference workloads and can use Arm processors, which are becoming a much more attractive option for AI inference—especially when GPUs are in short supply.
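Steering an inference pod onto Arm capacity only takes a node selector on the well-known kubernetes.io/arch label, as in the sketch below; the image and resource sizes are illustrative, and the image would need an arm64 build.

```python
# Sketch: pin an inference pod to arm64 nodes via a node selector.
from kubernetes import client, config

config.load_kube_config()

pod = client.V1Pod(
    api_version="v1",
    kind="Pod",
    metadata=client.V1ObjectMeta(name="arm-inference"),  # hypothetical name
    spec=client.V1PodSpec(
        node_selector={"kubernetes.io/arch": "arm64"},  # well-known arch label
        containers=[
            client.V1Container(
                name="server",
                image="ghcr.io/example/model-server:arm64",  # hypothetical image
                resources=client.V1ResourceRequirements(
                    requests={"cpu": "2", "memory": "4Gi"},
                    limits={"cpu": "2", "memory": "4Gi"},
                ),
            )
        ],
    ),
)
client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```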
Existing applications can benefit by migrating to OCI and OKE
OKE offers lower total cost of ownership and improved time to market.
OKE simplifies operations at scale in the following ways:
- Lift and shift; there’s no need to rearchitect
- Reduce operations burden with automation
- Save time on infrastructure management
- Increase resource utilization and efficiency
- Improve agility, flexibility, uptime, and resilience
- Reduce compliance risk and enhance security
Microservices offer many advantages over monolithic applications
Future-proof your applications with an OKE-centric microservices architecture.
- Architecture modernization
- Faster pace of innovation
- Deployment automation
- Parallel development
- Easier scalability
- Higher reliability
- More flexibility
- Greater agility
“Many OCI AI services run on OCI Kubernetes Engine (OKE), Oracle’s managed Kubernetes service. In fact, our engineering team experienced a 10X performance improvement with OCI Vision just by switching from an earlier platform to OKE. It’s that good.”
VP of OCI AI Services, Oracle Cloud Infrastructure
Customers innovating with cloud native services on OCI
Get started with Kubernetes Engine
-
Deploy a simple containerized app using OKE managed nodes
Deploy simple microservices packaged as Docker containers that communicate via a common API.
-
Deploy a Kubernetes cluster with virtual nodes
Discover best practices for deploying a serverless virtual node pool using the provided Terraform automation and reference architecture.
-
Discover patterns to optimize your Kubernetes resources
Find out how Tryg Insurance reduced their costs by 50% via dynamic rightsizing.
Announcing Fully Automated Disaster Recovery for OCI Kubernetes Engine using OCI Full Stack DR
Gregory King, Senior Principal Product Manager
Oracle Cloud Infrastructure (OCI) Full Stack Disaster Recovery (Full Stack DR) announces native support for OCI Kubernetes Engine (OKE). OKE clusters are now a selectable OCI resource in Full Stack DR, just like virtual machines, storage, load balancers, and Oracle databases. This means we know exactly how to validate, fail over, switch over, and test your ability to recover OKE clusters, infrastructure, and databases, without your IT staff writing a single line of code or keeping step-by-step instructions in a spreadsheet or text file.
Read the complete post
Featured Kubernetes blogs
- March 27, 2025 Announcing IPv6 Support for OCI Kubernetes Engine (OKE)
- March 27, 2025 OCI Kubernetes Engine (OKE) Now Scales to 5000 Worker Nodes
- February 21, 2025 Streamlining GPU Management on OCI Kubernetes Engine (OKE) with the NVIDIA GPU Operator
- October 16, 2024 Announcing OpenID Connect in OCI Kubernetes Engine
- October 7, 2024 Simplifying GPU monitoring in OCI Kubernetes Engine (OKE) with Node Manager
Kubernetes resources
Workshops
What is Kubernetes?
Kubernetes is an open source platform for managing and scaling clusters of containerized applications and services.
More training
Enhance your cloud native skills
Videos
What is Kubernetes?
Kubernetes is an open source platform for managing and scaling clusters of containerized applications and services.
Related Kubernetes products
Full Stack DR
Fully automated disaster recovery for Oracle Kubernetes Engine
DevOps CI/CD
Automate application delivery across build, test, and deployment
Get started with OKE
Oracle Cloud Free Tier
Get 30 days of access to CI/CD tools, managed Terraform, telemetry, and more.
Architecture Center
Explore deployable reference architectures and solutions playbooks.
Oracle Cloud Native services
Empower app development with Kubernetes, Docker, serverless, APIs, and more.
Contact us
Reach our associates for sales, support, and other questions.