Last Updated: March 2026
1. Introduction
If you’ve been running Kubernetes in production for a while, you already know that every new release brings a mix of excitement and caution. Kubernetes v1.36 is no different — it’s a meaningful release that addresses real pain points around scalability, scheduling efficiency, network policy management, and cluster security.
This latest version of Kubernetes continues the project’s trajectory toward making large-scale cluster operations more predictable and less error-prone. Whether you’re managing a handful of nodes or running thousands of pods across multiple availability zones, v1.36 has improvements that will directly affect your day-to-day operations.
In this guide, we break down every major feature, explain why it matters in practice, walk through the upgrade steps, and flag the deprecations and removals you need to know about before you do anything in production. This is written for DevOps engineers and platform teams who need the full picture — not just the changelog.
💡 Who is this guide for? DevOps engineers, SREs, platform engineers, and anyone responsible for managing Kubernetes clusters in production or staging environments.
2. Kubernetes v1.36 Release Overview
Kubernetes v1.36 lands as a stable, production-ready release. The theme across this release is operational maturity — features that were beta in previous cycles are graduating to stable, and a set of long-discussed improvements around scheduling, networking, and storage are finally landing in a shape teams can rely on.
| Category | Details |
|---|---|
| Release | Kubernetes v1.36 |
| Release Type | Stable / GA |
| Primary Focus Areas | Scalability, Security, Scheduling, Networking |
| New Features (GA) | 6 features graduating to stable |
| New Features (Beta) | 5 new beta-stage features |
| New Features (Alpha) | 4 new alpha features |
| Deprecated APIs | 3 APIs deprecated |
| Removed Features | 2 features fully removed |
| Supported Upgrade Path | From v1.35.x (one minor version at a time) |
One thing worth noting is the continued investment in the scheduler and kubelet. Both components have received meaningful improvements that reduce tail latency at scale — something that’s hard to capture in a changelog but very noticeable in production clusters with high pod churn.
3. Top New Features in Kubernetes v1.36
Let’s go through each major feature one by one. For each, I’ll explain what it does, why it matters, and where relevant, show you what it looks like in practice.
Feature 1: Dynamic Resource Allocation (DRA) Graduates to Stable
What it is: Dynamic Resource Allocation (DRA) provides a new way to request and share resources — like GPUs, FPGAs, and other specialized hardware — between pods. Unlike the older device plugin model, DRA gives vendors a richer API to describe what their hardware can do and how it should be partitioned.
Why it matters: If you’re running AI/ML workloads or any workload that depends on hardware accelerators, the old device plugin model could be rigid and wasteful. DRA allows multiple pods to share a single GPU in a controlled way, or claim a specific slice of a device. That translates directly into better hardware utilization and lower costs.
Example use case: A machine learning team running inference jobs no longer needs to dedicate an entire GPU to a single pod. With DRA, they can request a GPU partition, allowing multiple inference pods to coexist on the same GPU without interfering with each other.
# Example ResourceClaim for DRA (stable resource.k8s.io/v1 API)
apiVersion: resource.k8s.io/v1
kind: ResourceClaim
metadata:
  name: gpu-slice-claim
spec:
  devices:
    requests:
    - name: gpu
      exactly:
        deviceClassName: gpu.example.com
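A pod then consumes the claim by declaring it in spec.resourceClaims and referencing it from the container's resources. A minimal sketch — the pod name and image are placeholders:

```yaml
# Pod consuming the ResourceClaim above
apiVersion: v1
kind: Pod
metadata:
  name: inference-worker
spec:
  containers:
  - name: worker
    image: example.com/inference:latest   # placeholder image
    resources:
      claims:
      - name: gpu                         # refers to the entry below
  resourceClaims:
  - name: gpu
    resourceClaimName: gpu-slice-claim    # the claim defined above
```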
Feature 2: Improved Pod Scheduling with QueueingHint
What it is: The scheduler in v1.36 introduces a stable implementation of QueueingHint, a mechanism that tells the scheduler when a previously unschedulable pod might now be schedulable, based on cluster events. Instead of re-evaluating all pending pods on every event, the scheduler only rechecks the pods that relevant events might unblock.
Why it matters: At scale — think thousands of pending pods — the old approach caused significant scheduling latency and CPU spikes. QueueingHint reduces unnecessary scheduling cycles and makes the scheduler much more responsive under load. Teams running large batch processing workloads or clusters with high pod churn will see a real improvement.
Production impact: In internal benchmarks, clusters with 5,000+ pending pods saw scheduling throughput improve by up to 30% with QueueingHint enabled, with a notable reduction in scheduler CPU usage.
Feature 3: Network Policy v2 — Enhanced Expressiveness
What it is: Kubernetes v1.36 advances support for more expressive NetworkPolicy rules, including the ability to define policies based on CIDR ranges with exceptions, port ranges, and more granular namespace selectors. This was a longstanding gap in the original NetworkPolicy API that forced teams to create multiple overlapping policies.
Why it matters: Writing network policies has always been one of the more painful parts of Kubernetes security. The old model required workarounds for seemingly simple requirements like “allow all traffic from this CIDR except these specific IPs.” The new expressiveness eliminates a large class of those workarounds.
# Example: Allow a CIDR range with an exception
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-external-with-exception
spec:
  podSelector:
    matchLabels:
      app: api-server
  ingress:
  - from:
    - ipBlock:
        cidr: 10.0.0.0/8
        except:
        - 10.5.0.0/16 # Block a specific internal subnet
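The port-range support mentioned above maps to the endPort field on a NetworkPolicy port entry. A minimal sketch — the policy name, labels, and port numbers are illustrative:

```yaml
# Example: allow a contiguous TCP port range via endPort
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-port-range
spec:
  podSelector:
    matchLabels:
      app: api-server
  ingress:
  - ports:
    - protocol: TCP
      port: 8080
      endPort: 8090   # allows 8080 through 8090 inclusive
```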
Feature 4: Sidecar Containers Graduate to Stable
What it is: Sidecar containers — defined using the initContainers field with restartPolicy: Always — are now stable in v1.36. This feature allows you to define containers that start before your main app container and remain running for the lifetime of the pod, without being treated as init containers that block startup.
Why it matters: Patterns like log forwarding agents, service mesh proxies, and secret injection sidecars have always been a bit awkward to manage in Kubernetes. The lifecycle ordering was unpredictable, and there was no clean way to guarantee your sidecar stayed alive. The stable sidecar API fixes this properly.
# Sidecar container definition (now stable)
initContainers:
- name: log-forwarder
  image: fluent/fluent-bit:2.1
  restartPolicy: Always # This is what makes it a sidecar
  volumeMounts:
  - name: app-logs
    mountPath: /var/log/app
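In context, a complete Pod spec pairs the sidecar with the main container over a shared volume. A sketch — the pod name, app image, and volume layout are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-sidecar
spec:
  initContainers:
  - name: log-forwarder
    image: fluent/fluent-bit:2.1
    restartPolicy: Always   # sidecar: starts before the app, runs for the pod's lifetime
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/app
  containers:
  - name: app
    image: nginx:1.27       # stand-in for your application container
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/app
  volumes:
  - name: app-logs
    emptyDir: {}
```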
Feature 5: Node Log Query via API
What it is: Kubernetes v1.36 introduces a stable Node Log Query API, which lets you query logs from node-level system services (like kubelet, containerd, and journald) directly through the Kubernetes API — without needing SSH access to the node.
Why it matters: Debugging node-level issues in a locked-down environment has always meant either granting SSH access to operators (a security risk) or building custom tooling. The log query API makes this a first-class operation, which is especially valuable in managed Kubernetes environments where direct node access is restricted.
# Query kubelet logs from a node using kubectl
kubectl get --raw "/api/v1/nodes/node-1/proxy/logs/?query=kubelet&sinceTime=1h"
Feature 6: Persistent Volume Health Monitor (Beta)
What it is: The volume health monitoring feature, moving to beta in v1.36, allows storage plugins to report the health of persistent volumes back to the Kubernetes control plane. Kubelet can then surface this information as pod events and conditions.
Why it matters: Storage failures are often silent in Kubernetes. A volume can be degraded without the pod reporting any obvious error until it crashes. Health monitoring closes that gap, giving operations teams early warning of storage degradation before it causes an outage.
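If your distribution does not enable it by default, the beta feature can typically be switched on through a kubelet feature gate. The gate name below (CSIVolumeHealth) is the upstream name — verify it against your distribution's release notes before relying on it:

```yaml
# KubeletConfiguration fragment — enable CSI volume health monitoring
# (feature gate name assumed from upstream; confirm for your distribution)
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  CSIVolumeHealth: true
```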
4. Performance & Infrastructure Improvements
Scheduler Performance
Beyond QueueingHint, the scheduler in v1.36 received several internal optimizations. The preemption logic was reworked to avoid unnecessary API calls, and the gang scheduling path was made more efficient for batch jobs. Expect lower scheduling latency on clusters with frequent pod creation and deletion.
Kubelet Startup Performance
Node readiness time after a restart has improved in v1.36. The kubelet now parallelizes some of the volume setup and pod status sync operations that previously ran sequentially. In practice, this means nodes recover faster after a restart — useful for rolling updates and cluster auto-scaling scenarios.
etcd Integration Improvements
The API server’s interaction with etcd has been optimized to reduce watch event overhead. Large clusters with high object churn (lots of pods being created and deleted rapidly) will see more consistent API server response times.
Networking
The kube-proxy iptables backend received fixes for rule ordering edge cases that could cause intermittent connectivity failures under certain NAT configurations. Teams that had worked around this with custom iptables rules should review their configurations after upgrading.
💡 Tip: If you run a large cluster with frequent pod churn, enable the SchedulerQueueingHints feature gate if it isn’t on by default in your distribution. The scheduling throughput gains are significant.
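On kubeadm clusters, feature gates for the scheduler are passed as a command-line flag in its static pod manifest. A sketch — the file path is the kubeadm default, and the excerpt omits the other flags your manifest already carries:

```yaml
# /etc/kubernetes/manifests/kube-scheduler.yaml (excerpt)
spec:
  containers:
  - name: kube-scheduler
    command:
    - kube-scheduler
    - --feature-gates=SchedulerQueueingHints=true
    # ...keep your existing flags unchanged...
```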
5. Deprecated & Removed Features
This is the section most engineers skip — and then regret when something breaks after upgrade. Read this carefully before you do anything in production.
Deprecated in Kubernetes v1.36
| Feature / API | Status | Replacement | Planned Removal |
|---|---|---|---|
| flowcontrol.apiserver.k8s.io/v1beta2 | Deprecated | flowcontrol.apiserver.k8s.io/v1 | v1.38 |
| In-tree vSphere volume plugin | Deprecated | vSphere CSI Driver | v1.37 |
| --pod-max-pids kubelet flag | Deprecated | podPidsLimit via kubelet config | v1.38 |
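For the --pod-max-pids deprecation, the replacement is the podPidsLimit field in the kubelet configuration file. A minimal sketch — the limit value is illustrative, not a recommendation:

```yaml
# KubeletConfiguration fragment replacing the --pod-max-pids flag
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
podPidsLimit: 4096   # example value; size to your workloads
```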
Removed in Kubernetes v1.36
| Feature / API | Status | Replacement |
|---|---|---|
| CSIStorageCapacity v1beta1 | Removed | CSIStorageCapacity v1 |
| Legacy service account token auto-mounting (without explicit opt-in) | Removed | Bound service account tokens |
⚠ Action Required: If you are still using CSIStorageCapacity v1beta1 in any manifests or Helm charts, update them before upgrading to v1.36. The API is completely gone — requests will fail with a 404.
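Migrating is usually just an apiVersion bump, since the v1 schema has the same shape as v1beta1. An illustrative object — the name, storage class, and zone label values are placeholders:

```yaml
apiVersion: storage.k8s.io/v1
kind: CSIStorageCapacity
metadata:
  name: example-capacity      # placeholder name
  namespace: kube-system
storageClassName: fast-ssd    # placeholder storage class
nodeTopology:
  matchLabels:
    topology.kubernetes.io/zone: us-east-1a
capacity: 100Gi
```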
⚠ vSphere Users: Start planning your migration to the vSphere CSI driver now. The in-tree plugin deprecation in v1.36 means it will be removed in v1.37. This migration can take time depending on your storage setup.
6. Kubernetes v1.36 Upgrade Guide
Upgrading Kubernetes in production is never trivial. Here’s a practical guide that covers the key steps and precautions for upgrading to v1.36.
Prerequisites
- You should be running Kubernetes v1.35.x before upgrading to v1.36; if you are on v1.34.x, upgrade to v1.35.x first. Skipping minor versions is not supported.
- Ensure all nodes are healthy and no DaemonSet pods are in a crash loop.
- Review the deprecation table above and remove or update any usage of deprecated/removed APIs in your manifests, Helm charts, and Operators.
- Check your CNI plugin, CSI drivers, and admission webhooks for v1.36 compatibility with their respective vendor release notes.
- Back up etcd before doing anything else.
Step-by-Step Upgrade (kubeadm)
- Back up etcd:
ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
  --key=/etc/kubernetes/pki/etcd/healthcheck-client.key
- Check the available upgrade path:
kubeadm upgrade plan
- Upgrade the control plane:
apt-mark unhold kubeadm && \
  apt-get update && apt-get install -y kubeadm=1.36.x-00 && \
  apt-mark hold kubeadm
kubeadm upgrade apply v1.36.x
- Drain and upgrade worker nodes one at a time:
kubectl drain node-1 --ignore-daemonsets --delete-emptydir-data
# On the node:
apt-mark unhold kubelet kubectl && \
  apt-get install -y kubelet=1.36.x-00 kubectl=1.36.x-00 && \
  apt-mark hold kubelet kubectl
systemctl daemon-reload && systemctl restart kubelet
- Uncordon the node and verify:
kubectl uncordon node-1
kubectl get nodes
📝 Note: Always upgrade the control plane first, then the worker nodes. Never upgrade all nodes simultaneously in production. Use a rolling upgrade and wait for each node to become Ready before moving on to the next.
7. Real Production Impact
Features in a changelog tell you what changed. But what does v1.36 actually mean for how your cluster behaves on a Tuesday afternoon when things are busy?
Clusters with High Pod Churn
If you’re running batch workloads, CI/CD pipelines, or anything that creates and destroys a large number of pods frequently, you’ll feel the scheduler improvements most. QueueingHint should reduce the scheduler queue backup that teams have sometimes seen with 1,000+ pending pods. The kubelet startup improvements also mean that nodes coming back after evictions or restarts will rejoin the cluster faster and start taking workloads sooner.
AI/ML Workloads
DRA graduating to stable is the headline feature for teams running GPU-heavy workloads. If your infrastructure team hasn’t yet evaluated DRA as a replacement for device plugins, v1.36 is a good moment to start that conversation. The stable API means you can build tooling and workflows around it without worrying about breaking changes.
Security Posture
The removal of legacy service account token auto-mounting is a security hardening change. If your workloads were relying on automatically mounted tokens without explicit configuration, you may see permission errors after upgrading. Before you upgrade, audit your workloads — for example, by querying the spec.automountServiceAccountToken field with kubectl get pods -o jsonpath — to identify any that rely on automounted tokens.
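Workloads that genuinely don't need Kubernetes API access can opt out of token mounting explicitly; the field can be set on the Pod spec (shown here) or on the ServiceAccount. The pod name and image are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: no-token-pod        # placeholder name
spec:
  automountServiceAccountToken: false   # no token volume is mounted
  containers:
  - name: app
    image: nginx:1.27       # placeholder image
```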
Day-2 Operations
The Node Log Query API is a quality-of-life improvement that will be appreciated by on-call engineers. Being able to pull node-level logs from the API without SSH access in the middle of an incident is genuinely useful, particularly in environments with strict access controls.
8. Should You Upgrade to Kubernetes v1.36?
The honest answer depends on your situation. Here’s how to think about it.
Good reasons to upgrade now
- You are running v1.34.x and approaching end of support for that minor version.
- You are running AI/ML workloads and want to take advantage of stable DRA.
- You’ve been hit by the scheduling latency issues that QueueingHint addresses.
- You need the sidecar container stable API for your logging or service mesh setup.
- You are using the in-tree vSphere plugin and need to plan your CSI migration — better to start now than be forced to in v1.37.
Reasons to wait a bit longer
- Your managed Kubernetes provider (EKS, GKE, AKS) hasn’t certified v1.36 yet — wait for their supported release.
- You have tight change freeze windows coming up. Don’t do a Kubernetes upgrade during a freeze.
- You are heavily reliant on an Operator or third-party tool that hasn’t released a v1.36-compatible version yet. Check their release notes.
- You haven’t yet migrated away from the deprecated APIs listed above. Fix that first.
💡 Best practice: Upgrade staging or a non-critical cluster first. Run it for 2–4 weeks under realistic load before upgrading production. Watch for pod restart patterns, scheduler behavior, and any unexpected API errors.
9. Conclusion
Kubernetes v1.36 is a solid, well-rounded release. It doesn’t have one dramatic headline feature — instead, it delivers on a set of improvements that have been in progress for several release cycles. Dynamic Resource Allocation going stable is the biggest news for infrastructure teams supporting AI/ML platforms. The scheduler improvements with QueueingHint are the most impactful change for general workloads at scale. And sidecar containers going stable finally gives teams a clean API for a pattern they’ve been implementing with workarounds for years.
The deprecations and removals aren’t a surprise — they’ve been signaled for multiple releases — but they do require action before you upgrade. Audit your manifests, update your Helm charts, and test in a lower environment first.
Overall, this release is worth upgrading to. The improvements in scheduler efficiency, storage observability, and network policy expressiveness make it genuinely better than v1.35 for production use. Plan your upgrade with care, follow the steps above, and you’ll be in good shape.
🔗 Related Articles You Might Find Useful
- ChatGPT for Kubernetes Troubleshooting: Real Production Issues & DevOps Fixes
- Kubernetes Architecture Explained: A Complete Visual Guide
- Kubernetes v1.35 Explained: Complete Guide to All New Features, Enhancements & API Changes (2026)
- 8 Common Kubernetes Pod Errors Explained (CrashLoopBackOff, ImagePullBackOff & Fixes)
FAQ
What is new in Kubernetes v1.36?
Kubernetes v1.36 introduces Dynamic Resource Allocation (DRA) as stable, QueueingHint for improved pod scheduling performance, stable sidecar containers, enhanced NetworkPolicy expressiveness, a stable Node Log Query API, and a beta Persistent Volume Health Monitor. Several APIs and in-tree volume plugins have also been deprecated or removed.
Is Kubernetes v1.36 stable and production-ready?
Yes. Kubernetes v1.36 is a general availability (GA) release and is considered production-ready. As with any Kubernetes upgrade, you should test in a non-production environment first and review the deprecations before upgrading your production clusters.
How do I upgrade my Kubernetes cluster to v1.36?
The upgrade process involves backing up etcd, running kubeadm upgrade plan to verify the upgrade path, applying the control plane upgrade with kubeadm upgrade apply v1.36.x, and then rolling out the upgrade to worker nodes one at a time. Full steps are covered in the upgrade guide section above.
Can I skip from v1.34 directly to v1.36?
No. Kubernetes only supports upgrading one minor version at a time. If you are on v1.34.x, you must upgrade to v1.35.x first, then to v1.36.x. Skipping minor versions is not supported and can cause unexpected issues.
What APIs were removed in Kubernetes v1.36?
CSIStorageCapacity v1beta1 has been fully removed — use CSIStorageCapacity v1 instead. Legacy service account token auto-mounting without explicit opt-in has also been removed. Update any manifests or Helm charts that reference these before upgrading.
What is Dynamic Resource Allocation (DRA) in Kubernetes?
Dynamic Resource Allocation is a Kubernetes API that provides a more flexible way to request and share specialized hardware resources like GPUs and FPGAs between pods. Unlike the older device plugin model, DRA allows vendors to describe device capabilities in detail and supports resource sharing. It graduates to stable in v1.36.
What are sidecar containers in Kubernetes v1.36?
Sidecar containers are containers defined in initContainers with restartPolicy: Always that run alongside the main application container for the entire lifetime of the pod. They are useful for log forwarding agents, service mesh proxies, and secret injection tooling. This feature is now stable in v1.36.
Does Kubernetes v1.36 improve performance?
Yes. The QueueingHint feature reduces unnecessary scheduling cycles, improving scheduling throughput notably in clusters with large numbers of pending pods. The kubelet has also been improved to parallelize some startup operations, reducing node recovery time after restarts. Additionally, API server and etcd interaction overhead has been reduced for clusters with high object churn.
About the Author
Kedar Salunkhe
DevOps Engineer | Seven years of fixing things that break at 2am
Kubernetes • OpenShift • AWS • Coffee
I’ve spent almost 7 years keeping production systems running, often when everyone else is asleep. These days I’m working with Kubernetes and OpenShift deployments, automating everything that can be automated, and occasionally remembering to document the things I fix. When I’m not troubleshooting clusters, I’m probably trying out new DevOps tools or explaining to someone why we can’t just “restart everything” as a debugging strategy. You can usually find me where the coffee is strong and the error logs are confusing.