Kubernetes v1.36: All Stable, Beta & Alpha Features Explained (Complete DevOps Guide)

Last Updated: March 2026

Every Kubernetes release tells you something about where the community’s head is at. With Kubernetes v1.36, the message is pretty clear: the project is maturing, and the focus has shifted toward making things that already exist work better — more reliably, more efficiently, and more securely. This isn’t a release full of flashy new concepts. It’s a release that graduates important work into stable, moves promising features into beta where teams can start building on them, and introduces a handful of genuinely interesting alpha experiments worth watching.

For DevOps engineers and platform teams, v1.36 is a meaningful release. The improvements touch scheduling, storage, networking, security, and node management — all the areas where production pain tends to live. Whether you’re managing a single cluster or running a platform that hosts dozens of them, the changes in this release will affect how you operate day to day.

This guide covers the key stable, beta, and alpha features in Kubernetes v1.36 with practical explanations, real examples, and the context you need to decide what to adopt and when. It also covers deprecations, removals, the upgrade path, and post-upgrade verification so you have everything in one place.


Kubernetes v1.36 Release Overview

What is Kubernetes v1.36?

Kubernetes v1.36 is the latest stable release of the Kubernetes container orchestration platform, maintained by the Cloud Native Computing Foundation (CNCF). It follows the project’s release cadence of three minor versions per year. Like all Kubernetes releases, v1.36 includes features at three maturity stages: stable (GA), beta, and alpha — each representing a different level of readiness for production use.

What makes v1.36 particularly notable is the breadth of features graduating from beta to stable. Several features that DevOps teams have been cautiously using in beta for one or two release cycles are now fully supported, which means you can rely on them in production without worrying about breaking API changes in future releases.

Kubernetes v1.36 Release Highlights

  • Dynamic Resource Allocation (DRA) graduates to stable — a significant milestone for GPU and hardware accelerator workloads
  • Sidecar containers are now stable, giving teams a proper API for a pattern they’ve been hacking around for years
  • QueueingHint for the scheduler is stable, meaningfully improving scheduling throughput at scale
  • Node Log Query API is stable — no more needing SSH access to pull node-level logs
  • Enhanced NetworkPolicy expressiveness arrives in beta
  • New volume health monitoring capabilities in beta
  • Several in-tree volume plugins deprecated in favor of CSI drivers

Key Focus Areas of This Release

| Category | Description | Count |
| --- | --- | --- |
| Stable Features | Production-ready features that have completed the graduation process and carry API stability guarantees | 6 features |
| Beta Features | Features under active testing, enabled by default in most cases, suitable for non-critical production use with awareness of potential changes | 5 features |
| Alpha Features | Experimental features disabled by default, intended for testing and feedback, not recommended for production use | 4 features |
| Deprecated APIs | APIs that still function but are marked for removal in a future release | 3 items |
| Removed Features | APIs and features that have been completely removed and will no longer work | 2 items |

What’s New in Kubernetes v1.36

Major Improvements in Kubernetes v1.36

The headline story across this release is maturity. Six features have graduated to stable, which is a higher number than most recent releases. This reflects several cycles of investment in DRA, sidecar containers, and scheduling improvements that are now ready to be relied upon in production environments. Beyond those graduations, the beta and alpha feature sets introduce improvements in areas that have historically been underserved — particularly around network policy flexibility and storage observability.

There’s also a notable cleanup effort in v1.36. The removal of CSIStorageCapacity v1beta1 and the deprecation of several remaining in-tree volume plugins continue the multi-release push to get everyone onto CSI drivers. This is housekeeping, but it’s the kind of housekeeping that breaks things if you don’t pay attention.

Performance Enhancements

The two biggest performance improvements in v1.36 both relate to the scheduler and the kubelet. On the scheduler side, the stable landing of QueueingHint changes how the scheduler decides when to retry pods that couldn’t be scheduled. Instead of re-evaluating the entire pending pod queue whenever a cluster event fires, the scheduler now only re-evaluates pods that are likely to be affected by that specific event. In practice this means much lower scheduler CPU usage and shorter scheduling latency when you have a large backlog of pending pods.

On the kubelet side, several startup operations that previously ran sequentially — particularly around volume setup and pod status synchronization — now run in parallel. The practical effect is that nodes come back to a Ready state faster after restarts, which matters when you’re doing rolling updates on a large cluster or when auto-scaling events trigger new nodes to join quickly.

The API server also benefits from improvements to how it handles watch events from etcd. In clusters with high object churn, the previous behavior could cause API server CPU spikes during busy periods. The v1.36 changes smooth this out considerably.

Security Improvements

The most significant security change in v1.36 is not a new feature — it’s a removal. Legacy service account token auto-mounting without explicit opt-in has been removed. This was a longstanding security concern because pods could silently receive a service account token they didn’t explicitly request, creating an unnecessary attack surface. With this removal, service account token usage must be deliberate.

On the positive side, improvements to the Node Log Query API reduce the need to grant SSH access to nodes for debugging purposes. Fewer humans with SSH access to nodes is a meaningful security improvement in environments with strict access controls. The enhanced NetworkPolicy expressiveness in beta also makes it easier to write precise, least-privilege network policies without workarounds that often ended up being overly permissive.

Networking Improvements

kube-proxy received fixes for iptables rule ordering edge cases that could cause intermittent connectivity failures under specific NAT configurations. These were subtle bugs that were difficult to reproduce reliably, but real enough that some teams had implemented custom iptables workarounds. If you’re in that camp, review your custom rules after upgrading.

The progress on enhanced NetworkPolicy in beta is also worth noting here. The new expressiveness — including CIDR ranges with exceptions, port ranges, and improved namespace selectors — addresses gaps that have existed in the NetworkPolicy API since its introduction. This is the foundation for significantly simpler and more maintainable network security configurations.


Kubernetes v1.36 Stable Features


Stable features in Kubernetes carry API stability guarantees. Once a feature is stable, the API will not change in backward-incompatible ways. These are the features you should feel confident adopting in production.

Feature 1 – Dynamic Resource Allocation (DRA)

What the Feature Does

Dynamic Resource Allocation provides a new Kubernetes API for requesting, allocating, and sharing specialized hardware resources — like GPUs, FPGAs, and networking accelerators — between pods. It replaces the older device plugin model with a richer, more expressive API that gives hardware vendors much more control over how their devices are described and partitioned. Resources are requested via ResourceClaim objects and referenced from pod specs, giving the scheduler visibility into hardware availability as a first-class scheduling concern.

Why It Matters for Production Clusters

The device plugin model that DRA replaces was functional but inflexible. It treated all devices as interchangeable units with no concept of partial allocation or resource sharing. This worked fine for simple cases but fell apart for AI/ML workloads where you might want multiple inference pods to share a single GPU, or where you need the scheduler to understand the topology of your hardware to make good placement decisions. DRA solves these problems properly. With DRA stable, teams running AI/ML or HPC workloads now have a production-grade API to build their resource management workflows around.

Example Use Case

An inference serving team wants to run multiple small model serving pods on each GPU node, with each pod claiming a slice of the GPU rather than the whole device. With DRA, they define a ResourceClaim that requests a GPU partition, and the scheduler allocates it accordingly — no more manual placement hacks or wasted GPU capacity from exclusive whole-device allocation.

# ResourceClaim requesting a GPU partition via DRA. With DRA stable, use the
# resource.k8s.io/v1 API rather than the old alpha versions; the device class
# name below is illustrative and comes from your vendor's DRA driver.
apiVersion: resource.k8s.io/v1
kind: ResourceClaim
metadata:
  name: inference-gpu-slice
spec:
  devices:
    requests:
    - name: gpu
      exactly:
        deviceClassName: gpu.example.com
---
# Pod referencing the ResourceClaim
apiVersion: v1
kind: Pod
metadata:
  name: inference-server
spec:
  resourceClaims:
  - name: gpu
    resourceClaimName: inference-gpu-slice
  containers:
  - name: server
    image: inference-server:latest
    resources:
      claims:
      - name: gpu

Feature 2 – Sidecar Containers

How It Works

Sidecar containers are defined using the initContainers field with restartPolicy: Always. This tells Kubernetes to treat the container as a long-running companion to the main application container rather than a one-shot initialization step. Sidecar containers start before the main containers, remain running for the lifetime of the pod, and are terminated after the main containers exit. The lifecycle ordering is deterministic, which was the main problem with the unofficial workarounds teams used before this feature existed.

Benefits for DevOps Teams

Almost every production Kubernetes deployment has sidecars — log forwarders, service mesh proxies, secret injection agents, monitoring collectors. Before this feature was stable, teams had to work around the lack of proper sidecar lifecycle support using Init containers with sleep loops or custom scripts. Those workarounds were fragile, hard to reason about, and occasionally caused surprising behavior during pod termination. The stable sidecar API eliminates all of that. You get predictable startup ordering, predictable shutdown ordering, and a container restart policy that matches what you actually want from a long-running companion process.

# Sidecar container definition — now stable in v1.36
initContainers:
- name: log-forwarder
  image: fluent/fluent-bit:2.1
  restartPolicy: Always       # This is the field that defines it as a sidecar
  volumeMounts:
  - name: app-logs
    mountPath: /var/log/app
containers:
- name: application
  image: my-app:latest
  volumeMounts:
  - name: app-logs
    mountPath: /var/log/app

Feature 3 – Scheduler QueueingHint

How It Works

QueueingHint is a callback mechanism in the scheduler that plugin authors use to tell the scheduler which cluster events are relevant to a previously unschedulable pod. When a relevant event fires — say, a node becomes available, or a resource quota is updated — the scheduler uses the hints to determine which pending pods might now be schedulable and re-queues only those pods. Without QueueingHint, the scheduler re-evaluated every pending pod on every event, which scaled poorly.

Benefits for DevOps Teams

The practical benefit is most visible at scale. If you regularly have hundreds or thousands of pods in a pending state — which is common in batch processing environments, large CI/CD systems, or during auto-scaling events — QueueingHint reduces the CPU overhead of the scheduler significantly and shortens the time between a node becoming available and a pod getting scheduled onto it. For teams that have had to over-provision scheduler resources as a workaround for scheduling latency, this is a meaningful improvement.

Feature 4 – Node Log Query API

What the Feature Does

The Node Log Query API allows cluster operators to query logs from node-level system services — kubelet, containerd, journald, and others — directly through the Kubernetes API server, without needing direct SSH access to the node. Queries support filtering by time range, service name, and log pattern.

Why It Matters for Production Clusters

In most production environments, direct SSH access to nodes is either restricted, audited heavily, or both. When something goes wrong at the node level — a kubelet crash, a containerd issue, a certificate renewal failure — getting logs required either escalating access permissions or having a separate log aggregation pipeline that captured node-level service logs. The Node Log Query API makes this a standard operation that fits within normal Kubernetes RBAC controls. An on-call engineer can query node logs using their existing cluster credentials without any special access.

# Query kubelet logs from a specific node (sinceTime takes an RFC3339 timestamp)
kubectl get --raw \
  "/api/v1/nodes/worker-node-1/proxy/logs/?query=kubelet&sinceTime=2026-03-01T00:00:00Z"

# Query containerd logs
kubectl get --raw \
  "/api/v1/nodes/worker-node-1/proxy/logs/?query=containerd&tailLines=100"

Kubernetes v1.36 Beta Features


Beta features in Kubernetes are enabled by default in most cases and are suitable for use in non-critical production workloads. The API can still change between beta and stable, but breaking changes are rare and are typically accompanied by a migration path. Think of beta as “production-tolerant but not production-guaranteed.”

Beta Feature 1 – Enhanced NetworkPolicy Expressiveness

Feature Overview

The enhanced NetworkPolicy features in beta add a set of long-requested capabilities to the NetworkPolicy API. These include CIDR ranges with exception lists (allow this range, except these specific subnets), named port support in more places, port range selectors, and more granular namespace selectors. The underlying API remains networking.k8s.io/v1/NetworkPolicy — these are additive improvements to an existing API, not a replacement.

Use Cases

The most common use case for CIDR with exceptions is egress policies in environments with partially-trusted internal networks. A common requirement is “allow traffic to the corporate network range, except to the database subnet that applications should never reach directly.” Previously this required two separate NetworkPolicy objects with careful ordering. The new API expresses it in a single rule. Port ranges are useful for workloads that open a dynamic range of ports, like media streaming services or certain monitoring agents.

Limitations

Being in beta, the API may still see changes before it stabilizes. More importantly, NetworkPolicy enforcement is still the responsibility of your CNI plugin — the Kubernetes API only defines the desired state. Not all CNI plugins have implemented support for the new NetworkPolicy fields yet. Check your CNI plugin’s release notes before relying on this feature. Calico, Cilium, and Antrea have varying levels of support at the time of this writing.

# NetworkPolicy with CIDR exception — beta feature
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-corporate-except-db
spec:
  podSelector:
    matchLabels:
      app: web-api
  egress:
  - to:
    - ipBlock:
        cidr: 10.0.0.0/8
        except:
        - 10.20.0.0/16    # Block direct access to database subnet
  policyTypes:
  - Egress
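
Port ranges, another of the additions described above, are expressed with the endPort field on a port entry. A sketch (as with CIDR exceptions, CNI support for ranges varies):

```yaml
# Egress rule allowing a contiguous port range via endPort
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dynamic-media-ports
spec:
  podSelector:
    matchLabels:
      app: media-streamer
  egress:
  - to:
    - ipBlock:
        cidr: 10.30.0.0/16
    ports:
    - protocol: TCP
      port: 30000
      endPort: 32767   # matches ports 30000-32767 inclusive
  policyTypes:
  - Egress
```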

Beta Feature 2 – Persistent Volume Health Monitor

How It Improves Kubernetes

Volume health monitoring allows CSI drivers to report the health status of persistent volumes back to the Kubernetes control plane. The kubelet surfaces this information as pod conditions and events, so operators can see storage health directly in the standard Kubernetes observability toolchain. Previously, a degraded volume would be invisible to Kubernetes until the pod actually started failing — and even then, the errors could be difficult to distinguish from application errors.

When to Use It

This feature is most valuable in environments where persistent storage is business-critical — databases, stateful services, anything where data loss or corruption is a serious concern. If you’re running your own storage infrastructure (Ceph, Portworx, local SSDs) or using a CSI driver from a storage vendor that has implemented health monitoring, enabling this feature gives you earlier warning of storage problems. It’s less relevant for workloads using ephemeral or easily-reproducible storage. Note that your CSI driver must explicitly implement the health monitoring interface for this to do anything — check your driver’s documentation.

Beta Feature 3 – In-Place Pod Resource Resize

Feature Overview

In-place pod resource resize allows the CPU and memory requests and limits of a running pod to be changed without restarting the pod. Before this feature, changing resource requests required deleting and recreating the pod, which caused a disruption. With this feature, you can adjust resources by patching the pod spec, and the kubelet applies the change to the running container where the underlying container runtime supports it.

Use Cases

The most obvious use case is vertical autoscaling. The Vertical Pod Autoscaler (VPA) has historically had to restart pods when it wanted to adjust their resource requests, which was disruptive enough that many teams disabled VPA’s “auto” mode. In-place resize makes VPA-style adjustments non-disruptive. It’s also useful for long-running jobs where you want to throttle down resources after an initial intensive phase without restarting the job.

Limitations

Not all container runtimes support in-place resource changes for all resource types. CPU can generally be resized in-place; memory resize is more constrained because reducing memory limits on a running container can trigger OOM kills if the container is currently using more memory than the new limit. The feature also doesn’t apply to init containers. Test this thoroughly in staging before relying on it for production autoscaling workflows.
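
To opt into in-place resize, each container declares a resizePolicy per resource. A minimal sketch, with illustrative values:

```yaml
# Container-level resize policy: CPU resizes in place, memory restarts the container
containers:
- name: app
  image: my-app:latest
  resizePolicy:
  - resourceName: cpu
    restartPolicy: NotRequired       # apply CPU changes without a restart
  - resourceName: memory
    restartPolicy: RestartContainer  # memory changes restart this container only
  resources:
    requests:
      cpu: "1"
      memory: 512Mi
    limits:
      cpu: "2"
      memory: 1Gi
```

With a recent kubectl, a resize is then applied through the pod's resize subresource, for example: kubectl patch pod app --subresource resize -p '{"spec":{"containers":[{"name":"app","resources":{"requests":{"cpu":"1500m"}}}]}}'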


Kubernetes v1.36 Alpha Features


Alpha features are experimental, disabled by default, and require explicit feature gate configuration to enable. APIs can change without notice between releases. These features are intended for testing and providing feedback to the Kubernetes community — not for production workloads. That said, understanding what’s in alpha today gives you a good picture of where Kubernetes is heading over the next several releases.

Alpha Feature 1 – Topology Aware Routing Enhancements

Experimental Capabilities

This alpha feature extends Kubernetes’ topology-aware routing capabilities to give more granular control over how traffic is distributed across availability zones and nodes. The existing topology-aware hints mechanism provides a rough preference for routing traffic to local endpoints, but it lacks the ability to express hard requirements or complex multi-zone traffic policies. The new enhancements introduce more expressive topology constraints that can be applied at the Service level, giving platform teams the building blocks for true zone-local traffic routing with controllable fallback behavior.
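
For context, the Service-level knob this work builds on is the trafficDistribution field from earlier releases; the alpha enhancements extend this preference model with controllable constraints. A sketch:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-backend
spec:
  selector:
    app: web-backend
  ports:
  - port: 80
    targetPort: 8080
  trafficDistribution: PreferClose   # prefer endpoints in the client's zone
```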

Expected Future Impact

As cloud costs increase and teams look more carefully at cross-zone data transfer charges, topology-aware routing becomes more economically important. Today, a Kubernetes service that receives traffic in zone A might forward a significant portion of that traffic to pods in zone B, incurring data transfer costs and adding latency. The enhancements in alpha here are laying the groundwork for services that reliably keep traffic within a zone unless capacity requires otherwise. Expect this to move through beta in v1.37–v1.38 as it gets feedback from early adopters.

Alpha Feature 2 – Recursive Read-Only Mounts

Feature Overview

This alpha feature introduces a new volume mount option, recursiveReadOnly, that enforces read-only access not just at the mount point itself but for all bind mounts created within it. The existing readOnly: true option on volume mounts only applies to the top-level mount — it was possible for a process inside a container to create a new bind mount within a read-only volume and gain write access. Recursive read-only mounts close this gap at the kernel level.
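
A sketch of the mount option (the field accepts Disabled, IfPossible, and Enabled; Enabled fails pod startup when the kernel cannot enforce it):

```yaml
# Volume mount with recursive read-only enforcement. Requires the
# RecursiveReadOnlyMounts feature gate and a supporting kernel/runtime.
containers:
- name: app
  image: my-app:latest
  volumeMounts:
  - name: shared-config
    mountPath: /etc/app
    readOnly: true               # still required; recursiveReadOnly builds on it
    recursiveReadOnly: Enabled   # enforce read-only for nested bind mounts too
```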

Risks of Using Alpha Features

As with all alpha features, the API and behavior of recursive read-only mounts can change between releases, and enabling it requires setting the RecursiveReadOnlyMounts feature gate. Beyond the standard alpha caveats, this feature also requires a Linux kernel new enough to apply read-only mount attributes recursively via mount_setattr(2) (Linux 5.12+) — older node OS images may not support it. Using alpha features in production exposes you to the risk of a future Kubernetes upgrade silently changing or removing the feature, breaking workloads that depend on it. Always have a rollback plan.

Alpha Feature 3 – Pod-Level Resource Limits

Feature Overview

Kubernetes has always set resource limits at the container level, not the pod level. If a pod has three containers and you want to cap the total CPU usage of the pod, you had to split that cap across the three containers — which was awkward and error-prone. This alpha feature introduces pod-level resource fields that allow you to set an aggregate CPU and memory limit for the entire pod, with the individual container limits being enforced as usual underneath it.
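
A hedged sketch of what the pod-level fields look like (the exact alpha schema may still change before beta):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: multi-container-app
spec:
  resources:             # pod-level budget shared by all containers (alpha)
    limits:
      cpu: "2"
      memory: 1Gi
  containers:
  - name: app
    image: my-app:latest
  - name: log-forwarder
    image: fluent/fluent-bit:2.1
```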

Expected Future Impact

This is a feature that many operators have wanted for a long time. It simplifies the resource management story for multi-container pods, particularly pods with sidecars that have variable resource usage. Rather than trying to predict exactly how much CPU the sidecar will use relative to the main container, you set a total pod budget and let the containers use what they need within that budget. This is likely to have a significant quality-of-life impact for teams with complex multi-container pod configurations once it reaches stable.

Alpha Feature 4 – Structured Authentication Config Improvements

Feature Overview

Building on the structured authentication configuration introduced in earlier releases, v1.36 adds alpha support for additional JWT claim validation rules and more flexible user mapping expressions. This allows cluster administrators to define more precise rules for how external identity provider tokens are validated and how principals are mapped to Kubernetes usernames and groups — without having to modify the API server binary or use a webhook authenticator.
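
For context, the structured config is a file passed to the API server via the --authentication-config flag. The issuer URL and claim names below are illustrative:

```yaml
apiVersion: apiserver.config.k8s.io/v1beta1
kind: AuthenticationConfiguration
jwt:
- issuer:
    url: https://id.example.com    # illustrative identity provider
    audiences:
    - my-cluster
  claimMappings:
    username:
      expression: "claims.email"   # map a token claim to the Kubernetes username
    groups:
      expression: "claims.groups"
  claimValidationRules:
  - expression: "claims.email_verified == true"
    message: "email must be verified"
```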

Risks of Using Alpha Features

Authentication configuration changes carry higher risk than most feature experiments because misconfiguration can lock you out of your cluster or create unintended access paths. Alpha authentication features should only be tested in clusters where you have a backup authentication method and a recovery plan. The structured config API is likely to change as the community gathers feedback, so production adoption of the alpha fields is strongly discouraged until they graduate to beta.


Deprecated and Removed Features in Kubernetes v1.36

Don’t skip this section. Deprecated and removed features are the most common source of upgrade failures. Check everything here against your manifests, Helm charts, Operators, and custom controllers before you upgrade.

Deprecated APIs

| Feature / API | Status | Replacement | Planned Removal |
| --- | --- | --- | --- |
| flowcontrol.apiserver.k8s.io/v1beta2 | Deprecated | flowcontrol.apiserver.k8s.io/v1 | v1.38 |
| In-tree vSphere volume plugin | Deprecated | vSphere CSI Driver | v1.37 |
| --pod-max-pids kubelet flag | Deprecated | PodPidsLimit via kubelet config file | v1.38 |

Removed Features

| Feature / API | Status | Replacement |
| --- | --- | --- |
| CSIStorageCapacity v1beta1 | Fully Removed | CSIStorageCapacity v1 |
| Legacy service account token auto-mounting (implicit) | Fully Removed | Bound service account tokens with explicit automountServiceAccountToken |

⚠ Breaking Change: CSIStorageCapacity v1beta1 is completely gone in v1.36. Any manifest, Helm chart, or Operator that references this API version will receive a 404 error after upgrading. Search your entire GitOps repository for this string before upgrading.

Migration Recommendations

For CSIStorageCapacity: Update all references from storage.k8s.io/v1beta1 to storage.k8s.io/v1 for CSIStorageCapacity objects. The v1 API has been available since Kubernetes v1.24, so this should be a straightforward find-and-replace. Check Helm chart dependencies and any custom controllers that interact with storage capacity objects.
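
A minimal sketch of that search; the directory path is illustrative, so point it at your own GitOps checkout:

```shell
# Scan a manifest tree for API versions deprecated or removed in v1.36
MANIFEST_DIR="${MANIFEST_DIR:-./manifests}"
pattern='storage\.k8s\.io/v1beta1|flowcontrol\.apiserver\.k8s\.io/v1beta2'
if grep -rnE "$pattern" "$MANIFEST_DIR" 2>/dev/null; then
  echo "Fix the references above before upgrading"
else
  echo "No deprecated API references found in $MANIFEST_DIR"
fi
```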

For vSphere users: The in-tree vSphere plugin deprecation in v1.36 means removal is coming in v1.37. Begin planning your migration to the vSphere CSI driver now. This is a multi-step process that involves installing the CSI driver, migrating existing PersistentVolumes, and updating your StorageClasses. VMware’s documentation has a detailed migration guide — don’t leave this until the last minute.

For the kubelet --pod-max-pids flag: Move this configuration to your kubelet config file using the PodPidsLimit field. If you’re managing kubelet configuration through a config management tool, update the template now while the deprecated flag still works, and verify the behavior before the removal version ships.
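
The equivalent kubelet config fragment (the limit value is illustrative):

```yaml
# KubeletConfiguration replacing the deprecated --pod-max-pids flag
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
podPidsLimit: 4096   # same value previously passed via --pod-max-pids
```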

For service account tokens: Audit pods that might have been relying on implicit token mounting. Run the following to find pods without explicit automount configuration:

kubectl get pods -A -o json | jq -r \
  '.items[] | select(.spec.automountServiceAccountToken == null) | "\(.metadata.namespace)/\(.metadata.name)"'

⚠ vSphere users — act now: The in-tree vSphere plugin is being removed in v1.37, one release after this one. If you’re on vSphere and haven’t started the CSI migration, start today.


Kubernetes v1.36 Upgrade Guide


Prerequisites for Upgrading

  • Your cluster must be running v1.35.x. kubeadm upgrades one minor version at a time and Kubernetes does not support skipping minor versions — if you are on v1.34.x or earlier, upgrade through each intermediate minor release first.
  • Verify all nodes are in Ready state and no system pods are in CrashLoopBackOff.
  • Audit your manifests, Helm charts, and Operators against the deprecated and removed APIs listed above.
  • Check compatibility of your CNI plugin, CSI drivers, admission webhooks, and custom controllers with Kubernetes v1.36. Review each vendor’s compatibility matrix.
  • Confirm your container runtime (containerd, CRI-O) supports the v1.36 CRI API version.
  • Review any custom feature gate configuration you have set — some feature gates from earlier releases may have changed state or been removed.

Backup Your Kubernetes Cluster

Never upgrade a production cluster without a current etcd backup. etcd contains the entire state of your cluster. If something goes catastrophically wrong during an upgrade, a recent etcd snapshot is your recovery path.

# Create an etcd snapshot before upgrading
ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-pre-v136-upgrade.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
  --key=/etc/kubernetes/pki/etcd/healthcheck-client.key

# Verify the snapshot
ETCDCTL_API=3 etcdctl snapshot status /backup/etcd-pre-v136-upgrade.db

Upgrade Using kubeadm

Step 1: Check the upgrade plan

kubeadm upgrade plan

Review the output carefully. It shows the versions you can upgrade to and flags any component version mismatches or configuration issues.

Step 2: Upgrade kubeadm on the control plane node

apt-mark unhold kubeadm
apt-get update && apt-get install -y kubeadm=1.36.x-00
apt-mark hold kubeadm

# Verify the new version
kubeadm version

Step 3: Apply the control plane upgrade

kubeadm upgrade apply v1.36.x

This upgrades the control plane components (API server, controller manager, scheduler) and updates the cluster configuration. It does not upgrade kubelet or kubectl on any node.

Step 4: Upgrade kubelet and kubectl on the control plane node

apt-mark unhold kubelet kubectl
apt-get install -y kubelet=1.36.x-00 kubectl=1.36.x-00
apt-mark hold kubelet kubectl

systemctl daemon-reload
systemctl restart kubelet

Step 5: Upgrade worker nodes one at a time

# Drain the node (run from control plane)
kubectl drain worker-node-1 --ignore-daemonsets --delete-emptydir-data

# On the worker node itself:
apt-mark unhold kubeadm kubelet kubectl
apt-get update && apt-get install -y \
  kubeadm=1.36.x-00 kubelet=1.36.x-00 kubectl=1.36.x-00
apt-mark hold kubeadm kubelet kubectl

kubeadm upgrade node
systemctl daemon-reload && systemctl restart kubelet

# Back on control plane — uncordon the node
kubectl uncordon worker-node-1

Repeat Step 5 for each worker node. Always wait for the node to return to Ready status before moving to the next one.

Post-Upgrade Verification

After completing the upgrade, run through these checks before declaring the upgrade successful.

# Check all nodes are Ready and showing v1.36
kubectl get nodes

# Check all system pods are running
kubectl get pods -n kube-system

# Verify the API server version
kubectl version

# Check for any pods in CrashLoopBackOff or Error state
kubectl get pods -A | grep -v Running | grep -v Completed

# Check whether any clients are still calling deprecated APIs
# (served from the API server's metrics endpoint)
kubectl get --raw /metrics | grep apiserver_requested_deprecated_apis

💡 Tip: Run your full test suite against the upgraded cluster before putting production traffic on it. Watch scheduler behavior, check that your monitoring and alerting are still functioning, and verify any custom admission webhooks are responding correctly.


Impact of Kubernetes v1.36 on Production Clusters

Performance Impact

For most clusters, the performance improvements in v1.36 will be a net positive with no configuration changes required. Scheduler efficiency improves automatically with the stable QueueingHint implementation, and the kubelet startup improvements are transparent. Teams running clusters at scale — thousands of nodes, tens of thousands of pods — will notice the improvement most clearly. Clusters with modest workloads and low pod churn will see less dramatic changes.

The one area where performance could be temporarily impacted is during the upgrade itself. Rolling control plane and kubelet upgrades cause brief scheduling interruptions. Plan your upgrade windows accordingly and avoid upgrading during peak traffic periods.

Security Enhancements

The removal of implicit service account token mounting is a breaking change for some workloads but a security improvement for all of them. After upgrading, audit your pods to ensure any that need service account token access have explicit configuration for it, and verify that pods which should not have token access are not accidentally receiving it. The Node Log Query API also reduces operational pressure to grant SSH access to nodes, which is a security improvement worth making use of if you have teams that currently need node-level debugging access.

Operational Benefits

The operational quality-of-life improvements in v1.36 are real. The stable sidecar container API eliminates a category of tricky lifecycle bugs that affected log forwarding and service mesh setups. The Node Log Query API makes incident response faster and more contained. Volume health monitoring in beta means you can start getting early warning of storage degradation rather than finding out when your stateful service crashes.

For teams managing AI/ML infrastructure, DRA going stable is the most significant operational change. It means you can build proper GPU resource management workflows with confidence that the underlying API won’t change under you.


Should You Upgrade to Kubernetes v1.36?

Reasons to Upgrade

  • You are running v1.34.x and approaching end of community support for that version.
  • You run GPU or hardware accelerator workloads and want to adopt stable DRA.
  • You’ve experienced scheduling latency issues at scale that QueueingHint addresses.
  • You want the stable sidecar container API for your logging or service mesh setup.
  • You’re using in-place pod resize in beta and want access to the latest improvements.
  • You are on vSphere and need to start your CSI migration before v1.37 forces it.
  • You want to improve security posture by leveraging the Node Log Query API to reduce SSH access to nodes.

When to Wait Before Upgrading

  • Your managed Kubernetes provider (EKS, GKE, AKS) has not yet certified v1.36 — always use the provider’s supported path, not the upstream release directly.
  • You are in a change freeze or approaching a high-traffic period. Kubernetes upgrades are not zero-risk even with careful planning.
  • A critical Operator, CNI plugin, or CSI driver you depend on has not yet released a v1.36-compatible version.
  • You have not yet migrated away from the deprecated and removed APIs listed above — fix that first, then upgrade.
  • Your team is stretched thin and doesn’t have capacity for the post-upgrade monitoring and verification that a production Kubernetes upgrade requires.

💡 Recommended approach: Upgrade your staging or a lower-traffic production cluster first. Give it two to four weeks under realistic load. Watch for unexpected pod restarts, API error rates, and scheduler behavior. Only then upgrade your primary production clusters.
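
The watch items in that recommendation map to a few simple commands. A hedged sketch of a post-upgrade check, using only standard kubectl output:

```shell
# Pods with the highest restart counts — unexpected restarts after an upgrade
# are the first thing to investigate:
kubectl get pods -A --sort-by='.status.containerStatuses[0].restartCount' | tail -20

# Recent warning events cluster-wide (surfaces API errors, failed scheduling,
# failed mounts):
kubectl get events -A --field-selector type=Warning --sort-by='.lastTimestamp' | tail -30

# Confirm every node is Ready and reports the new kubelet version:
kubectl get nodes -o wide
```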


Frequently Asked Questions (FAQ)

What are the major features in Kubernetes v1.36?

The major features in Kubernetes v1.36 include Dynamic Resource Allocation (DRA) graduating to stable for GPU and hardware accelerator management, sidecar containers reaching stable status, QueueingHint for improved scheduler performance at scale, the Node Log Query API going stable, enhanced NetworkPolicy expressiveness in beta, and In-Place Pod Resource Resize in beta. The release also removes the CSIStorageCapacity v1beta1 API and implicit service account token auto-mounting.

Is Kubernetes v1.36 stable for production?

Yes. Kubernetes v1.36 is a GA release and is production-ready. The stable features carry API stability guarantees, and the overall release has gone through the standard Kubernetes release process including beta and release candidate phases. As with any Kubernetes upgrade, you should validate in a non-production environment first, review the breaking changes, and upgrade worker nodes in a rolling fashion rather than all at once.

What are the beta features in Kubernetes v1.36?

The notable beta features in Kubernetes v1.36 are enhanced NetworkPolicy expressiveness (CIDR exceptions, port ranges, better namespace selectors), Persistent Volume Health Monitoring (CSI drivers can now report storage health to the Kubernetes control plane), and In-Place Pod Resource Resize (adjust CPU and memory on running pods without a restart). Beta features are enabled by default and suitable for non-critical production use, but their APIs may still change before they reach stable.
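
For in-place resize, recent kubectl versions expose a dedicated `resize` subresource. A hedged sketch — the pod name `web-0` and container name `app` are placeholders:

```shell
# Raise CPU on a running pod without restarting it.
kubectl patch pod web-0 --subresource resize --type merge \
  -p '{"spec":{"containers":[{"name":"app","resources":{"requests":{"cpu":"750m"},"limits":{"cpu":"1"}}}]}}'

# Check whether the new values have been applied to the running container:
kubectl get pod web-0 -o jsonpath='{.status.containerStatuses[0].resources}'
```

Whether a given container can be resized without a restart also depends on its `resizePolicy`, so check that field before relying on this in staging.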

How do I upgrade to Kubernetes v1.36?

Upgrading to Kubernetes v1.36 requires starting from v1.35.x — you cannot skip minor versions, so clusters on v1.34.x must first upgrade to v1.35. The process involves backing up etcd, running kubeadm upgrade plan to verify the path, applying the control plane upgrade with kubeadm upgrade apply v1.36.x, upgrading kubelet and kubectl on the control plane node, and then rolling the upgrade across worker nodes one at a time using kubectl drain, upgrading the node packages, and kubectl uncordon. Full step-by-step commands are covered in the upgrade guide section above.
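
Condensed into commands, the flow looks roughly like this. Versions, paths, and the node name `worker-1` are placeholders; run the control-plane steps on a control-plane node and the worker steps on each worker in turn:

```shell
# 1. Back up etcd before touching anything (snapshot path is illustrative).
ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-pre-1.36.db

# 2. Verify the upgrade path, then apply it on the first control-plane node.
kubeadm upgrade plan
kubeadm upgrade apply v1.36.0

# 3. Upgrade the kubelet and kubectl packages on the control-plane node,
#    then restart the kubelet.
systemctl daemon-reload && systemctl restart kubelet

# 4. Roll workers one at a time.
kubectl drain worker-1 --ignore-daemonsets --delete-emptydir-data
# ...upgrade kubeadm/kubelet/kubectl packages on worker-1, then on that node:
kubeadm upgrade node
systemctl restart kubelet
# Back on your workstation:
kubectl uncordon worker-1
```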

What APIs were deprecated or removed in Kubernetes v1.36?

CSIStorageCapacity v1beta1 has been fully removed — use v1 instead. Implicit service account token auto-mounting has been removed. The flowcontrol.apiserver.k8s.io/v1beta2 API is deprecated and will be removed in v1.38. The in-tree vSphere volume plugin is deprecated and scheduled for removal in v1.37. The --pod-max-pids kubelet flag is deprecated in favor of the kubelet config file approach.
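
Before upgrading, it is worth confirming that nothing in your cluster or manifests still touches these APIs. A hedged sketch — the `manifests/` directory is a placeholder for wherever your YAML lives:

```shell
# Are any FlowSchema objects still being read at the deprecated v1beta2 version?
kubectl get flowschemas.v1beta2.flowcontrol.apiserver.k8s.io 2>&1 | head -5

# Confirm CSIStorageCapacity objects are served at v1 (v1beta1 is removed):
kubectl get csistoragecapacities.v1.storage.k8s.io -A

# Grep version-controlled manifests for the removed apiVersion:
grep -Rn "storage.k8s.io/v1beta1" manifests/
```

The API server's `apiserver_requested_deprecated_apis` metric is also useful here, since it records deprecated API usage by actual clients rather than by what happens to be in your repos.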

What are alpha features and should I use them in production?

Alpha features are experimental capabilities that are disabled by default and require explicit feature gate activation. Their APIs can change or be removed between releases without notice. Alpha features should not be used in production clusters. They are intended for testing, feedback, and evaluation of upcoming functionality. If you want to experiment with alpha features, do so in isolated test clusters only, and never build production workflows that depend on them.


Conclusion

Kubernetes v1.36 is a release that rewards engineers who have been paying attention. The features graduating to stable — DRA, sidecar containers, QueueingHint, Node Log Query API — have been in development for multiple release cycles, and their graduation reflects a level of maturity and community confidence that makes them genuinely ready for production adoption.

For DevOps teams, the most immediately actionable changes are the stability graduations. If you’ve been hesitating to adopt sidecar containers because of beta status concerns, that hesitation is now gone. If you’ve been working around scheduling latency with over-provisioned scheduler resources, QueueingHint should let you right-size that. If your team has been granting SSH access to nodes purely for log access, the Node Log Query API gives you a better option.

The beta features — particularly in-place resize and the NetworkPolicy improvements — are worth evaluating in staging now, with an eye toward adopting them when they reach stable in the next one or two releases. The alpha features, especially pod-level resource limits and topology-aware routing enhancements, signal where Kubernetes is heading and are worth understanding even if you won’t be using them in production any time soon.

On the migration front, act now on the vSphere CSI migration if you haven’t started. The removal lands in v1.37, only one release away, which is not much runway for a storage migration in a production environment.

Overall, the recommendation is to upgrade to v1.36 — but do it thoughtfully. Validate the deprecated API removal doesn’t affect your workloads, test in staging first, and roll the upgrade out gradually. This is a solid release that will leave your cluster in better shape than v1.35.


About the Author

Kedar Salunkhe

DevOps Engineer | Seven years of fixing things that break at 2am
Kubernetes • OpenShift • AWS • Coffee

I’ve spent almost 7 years keeping production systems running, often when everyone else is asleep. These days I’m working with Kubernetes and OpenShift deployments, automating everything that can be automated, and occasionally remembering to document the things I fix. When I’m not troubleshooting clusters, I’m probably trying out new DevOps tools or explaining to someone why we can’t just “restart everything” as a debugging strategy. You can usually find me where the coffee is strong and the error logs are confusing.
