At the recent Google Cloud Next event, Google unveiled a suite of new compute capabilities designed to improve performance and reduce costs for both general-purpose and AI workloads.
Why it matters: IT leaders face the challenge of balancing compute investments between agentic AI demands and everyday applications like web servers and databases. The unpredictability of agentic workloads can lead to performance bottlenecks and increased costs, especially during peak usage times.
For instance, a simple vacation search on a global travel platform can trigger a cascade of agentic processes, overwhelming traditional infrastructure. To address these challenges, Google Cloud introduced fluid compute, which dynamically adapts to varying workloads in real time.
New Compute Features
The following enhancements were announced at the event:
- Google Axion N4A: This new Arm-based CPU offers up to 2x better price-performance compared to current x86 VMs, ideal for cost-sensitive workloads.
- GKE Agent Sandbox: This native sandbox service allows for safe execution of untrusted code, providing scalable, low-latency infrastructure.
- Google Axion C4A.metal: The first Axion bare metal instance, now in preview, supports various workloads without the overhead of nested virtualization.
- C4 Instances: Expanded support for Intel Xeon 6 processors enhances performance for AI workloads.
- Flexible Committed Use Discounts: This feature allows organizations to optimize spending across VM families and regions.
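To make the flexible committed use discount (CUD) idea concrete, here is a minimal sketch of how a spend-based commitment nets out across VM families. All numbers, the discount rate, and the `effective_cost` helper are illustrative assumptions of ours, not Google Cloud's actual pricing or billing logic.

```python
def effective_cost(hourly_spend: dict, commitment_per_hour: float,
                   discount_rate: float) -> float:
    """Apply a spend-based flexible commitment across all eligible VM spend.

    hourly_spend: on-demand spend per VM family, e.g. {"n4a": 3.0, "c4": 5.0}
    commitment_per_hour: committed hourly spend billed at the discounted rate
    discount_rate: fraction off the on-demand price for covered spend
    """
    total = sum(hourly_spend.values())
    covered = min(total, commitment_per_hour)   # commitment covers spend up to its size
    uncovered = total - covered                 # remainder stays at on-demand rates
    return covered * (1 - discount_rate) + uncovered

# Illustrative: $8/hr of spend, $6/hr committed at a made-up 28% discount.
cost = effective_cost({"n4a": 3.0, "c4": 5.0}, commitment_per_hour=6.0,
                      discount_rate=0.28)
print(round(cost, 2))  # 6.32
```

The point the feature makes is visible in the model: because the commitment applies to spend rather than to a specific machine type or region, shifting usage between families (here, "n4a" and "c4") does not strand the discount.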
Customer Success Stories
Several organizations have already benefited from these new capabilities:
- Unity: By migrating to N4A instances, Unity achieved a 20% improvement in cost efficiency.
- Deutsche Börse: Modernizing core applications on Google Compute Engine led to a 58% faster time to market.
- WP Engine: Utilizing GKE clusters resulted in a 60% reduction in latency for mobile APIs.
Handling I/O and Latency-Sensitive Workloads
To support both AI and core workloads, Google Cloud emphasizes the importance of efficient data handling. The introduction of the C4N, M4N, and Z4D instances aims to alleviate traditional bottlenecks associated with network and storage limits:
- C4N: Offers high throughput for network applications, achieving 95 million packets per second.
- M4N: Designed for memory-intensive databases, it reduces total cost of ownership by over 20%.
- Z4D: New instances optimize I/O-intensive workloads with high-performance local SSDs.
Future Outlook
With these advancements, Google Cloud's fluid compute infrastructure allows foundational workloads and AI processes to coexist without competing for resources. This adaptability is crucial for organizations looking to modernize their operations and scale efficiently.
For those interested in exploring these new features, virtual machines for upcoming projects can be spun up today from the Google Cloud console.
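Beyond the console, the same thing can be sketched with the `gcloud` CLI. The machine type name `n4a-standard-4` is our assumption based on Google's usual naming pattern for new families; check the current machine-type list before running. The script prints the command rather than executing it (drop the `echo` to actually create the instance).

```shell
# Hypothetical values: machine type name assumes the usual
# <family>-standard-<vCPUs> pattern for the new Axion N4A family.
INSTANCE="demo-n4a-vm"
MACHINE_TYPE="n4a-standard-4"
ZONE="us-central1-a"

# Print the create command for review before running it for real.
echo gcloud compute instances create "$INSTANCE" \
  --machine-type="$MACHINE_TYPE" \
  --zone="$ZONE"
```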