Innovations in GKE and Open Source at KubeCon EU 2026

As the cloud-native community convenes in Amsterdam for KubeCon + CloudNativeCon Europe, we are thrilled to showcase our ongoing efforts to enhance both the open-source Kubernetes ecosystem and Google Kubernetes Engine (GKE). Here’s a summary of our latest developments.

Autopilot for All Clusters

Five years ago, GKE Autopilot was launched to simplify scaling and infrastructure management. Previously, users faced a tough choice between GKE Autopilot and Standard modes at cluster creation. Now, every cluster can utilize Autopilot, allowing users to activate it on a per-workload basis. This flexibility is powered by the Container-Optimized Compute Platform (COCP), which ensures scalable compute resources are available as needed.
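As a sketch of what per-workload activation looks like in practice, GKE already lets a workload select managed compute through a compute-class node selector. The manifest below is illustrative only: the class name "autopilot" and the resource values are assumptions, not confirmed syntax from this announcement.

```yaml
# Illustrative sketch: opting a single Deployment into Autopilot-managed
# compute via GKE's compute-class nodeSelector. The class name "autopilot"
# is an assumption; consult the GKE documentation for the exact value.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      nodeSelector:
        cloud.google.com/compute-class: autopilot
      containers:
      - name: web
        image: nginx:1.27
        resources:
          requests:
            cpu: 500m
            memory: 512Mi
```

The rest of the Deployment is a standard Kubernetes manifest; only the node selector changes, which is what makes per-workload adoption low-friction.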

Additionally, we are excited to announce the open-source release of the GKE Cluster Autoscaler, a vital component for infrastructure provisioning, aimed at benefiting the open-source community.

Advancing CNCF Kubernetes AI Conformance

In response to the industry's shift towards large-scale AI, we launched the CNCF Kubernetes AI Conformance program to standardize AI workloads on Kubernetes. GKE is now certified as an AI-conformant platform, enabling seamless model and tool portability across environments.

Looking ahead to the upcoming v1.36 Kubernetes release, the community is proposing new requirements to enhance AI serving capabilities, including advanced inference ingress and high-performance networking.

Introducing the Model Context Protocol

To enhance interactions between AI agents and Kubernetes, we introduced the open-source GKE Model Context Protocol (MCP) Server. This standardized interface simplifies the management and monitoring of workloads and resources, facilitating better integration with various AI clients.
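MCP is built on JSON-RPC 2.0, so an AI client invokes a server-exposed tool with a standard "tools/call" request. The payload below is a hypothetical sketch: the tool name "list_pods" and its arguments are illustrative, not the GKE MCP Server's actual tool catalog.

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "list_pods",
    "arguments": { "namespace": "inference" }
  }
}
```

Because every MCP server speaks this same request shape, any MCP-aware client can discover and call the server's tools without Kubernetes-specific integration code.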

Kubernetes as an AI Infrastructure

llm-d, now a CNCF Sandbox project, represents a significant advancement in transforming Kubernetes into a robust AI infrastructure. Launched in collaboration with industry leaders, llm-d provides a distributed inference framework that is both hardware-agnostic and vendor-neutral, addressing complex orchestration challenges.

Dynamic Resource Allocation (DRA)

As hardware becomes more specialized, Dynamic Resource Allocation (DRA) offers a standardized way to describe unique hardware configurations. We are proud to announce the open-source release of our DRA driver for TPUs, a key milestone for AI workload portability in Kubernetes.
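To make the DRA model concrete, here is a minimal sketch of how a workload requests a device through a claim. The deviceClassName "tpu.google.com" is a placeholder (the real class name is defined by the TPU DRA driver), and the resource.k8s.io API version depends on your cluster's Kubernetes release.

```yaml
# Illustrative DRA usage: a ResourceClaim requests a device from a
# DeviceClass, and a Pod consumes the claim. Names are placeholders.
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaim
metadata:
  name: tpu-claim
spec:
  devices:
    requests:
    - name: tpu
      deviceClassName: tpu.google.com   # placeholder; set by the driver
---
apiVersion: v1
kind: Pod
metadata:
  name: trainer
spec:
  resourceClaims:
  - name: tpu
    resourceClaimName: tpu-claim
  containers:
  - name: trainer
    image: us-docker.pkg.dev/example/trainer:latest   # placeholder image
    resources:
      claims:
      - name: tpu
```

The key shift from classic extended resources (such as counting opaque `google.com/tpu` units) is that claims can describe device attributes and topology, which is what makes specialized hardware portable across drivers and vendors.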

Enhancing Agent Performance

With the rise of agentic AI, Kubernetes is positioned as the ideal platform for running these agents. We are investing in open-source building blocks for agents, including the Kubernetes Agent Sandbox for secure isolation and GKE Pod Snapshots for improved startup latency.

Ray on Kubernetes: New Features

Ray has emerged as a leading framework for scaling AI workloads, and we are pleased to announce TPU support in Ray v2.55. To enhance user experience, we are introducing the Ray History Server, which allows for debugging and performance optimization after job completion.
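For readers running Ray on GKE via KubeRay, a TPU worker group can be sketched as below. This is an assumption-laden example: the image tag is inferred from the Ray version mentioned above, and the chip count is a placeholder; actual TPU topology and node selection depend on your cluster setup.

```yaml
# Illustrative KubeRay RayCluster with a TPU worker group. The
# "google.com/tpu" resource name is how GKE exposes TPU chips;
# replica and chip counts here are placeholders.
apiVersion: ray.io/v1
kind: RayCluster
metadata:
  name: ray-tpu
spec:
  headGroupSpec:
    rayStartParams: {}
    template:
      spec:
        containers:
        - name: ray-head
          image: rayproject/ray:2.55.0   # version assumed from this post
  workerGroupSpecs:
  - groupName: tpu-workers
    replicas: 1
    rayStartParams: {}
    template:
      spec:
        containers:
        - name: ray-worker
          image: rayproject/ray:2.55.0
          resources:
            limits:
              google.com/tpu: "4"   # chips per worker; placeholder count
```

Once the cluster is up, Ray schedules tasks onto the TPU workers like any other Ray resource, and the Ray History Server can then be used to inspect those jobs after they complete.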

Visit Us at KubeCon

Whether you are focused on scaling AI inference or optimizing cluster resources, we are dedicated to making Kubernetes and GKE your go-to platform. Join us at the Google Cloud booth (#310) for in-depth discussions and engaging activities.