Google’s Cloud AI Leader Discusses Three Frontiers of Model Capability

As a product VP at Google Cloud, Michael Gerstenhaber focuses on Vertex, Google’s platform for deploying enterprise AI. He has a broad perspective on AI model usage and the necessary steps to unlock agentic AI’s potential.

In my conversation with Michael, one novel idea stood out. He explained that AI models are advancing along three frontiers: raw intelligence, response time, and cost efficiency when deployed at large, unpredictable scale. This framing of model capabilities is particularly useful for those aiming to push frontier models forward.

This interview has been edited for length and clarity.

Can you outline your AI experience and current role at Google?

I’ve been involved in AI for about two years. After a year and a half at Anthropic, I joined Google nearly six months ago. As head of Vertex, Google’s developer platform, I mainly serve engineers building their applications. They want access to agentic patterns and platforms, as well as inference from the world’s smartest models. I facilitate this but do not provide the applications; that is up to customers like Shopify and Thomson Reuters.

What attracted you to Google?

Google is unique in its end-to-end integration, from the interface down to the infrastructure layer. We build data centers, buy electricity, construct power plants, and have proprietary chips and models. We control the inference and agentic layers, offer APIs for memory and interleaved code writing, and have an agent engine that ensures compliance and governance. On top of that, we provide chat interfaces: Gemini Enterprise for businesses and Gemini chat for consumers. This vertical integration was a key factor in my decision to join.


Despite company differences, it seems the big labs have similar capabilities. Is it merely a race for more intelligence, or is the situation more complex?

I see three boundaries. Models like Gemini Pro are optimized for raw intelligence. When coding, you want the best code, even if it takes 45 minutes, because it needs maintenance and deployment. Quality is paramount.

Then there’s latency. In customer support, where applying policy matters, you need timely intelligence. Whether it’s approving a return or upgrading an airline seat, the answer must come quickly. Intelligence is only useful within the response window before the customer disengages.

Lastly, companies like Reddit or Meta aim to moderate vast amounts of internet content. Even with large budgets, scalability and cost-effectiveness are priorities because the volume of content to process is unpredictable. Here, cost is the critical frontier.
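The three frontiers Michael describes can be read as a routing decision: given a workload's latency budget and expected volume, pick the model tier optimized for the binding constraint. The sketch below is a minimal illustration of that idea; the tier names, thresholds, and `Workload` type are all hypothetical and do not reflect Vertex's actual API or Google's model lineup.

```python
from dataclasses import dataclass

@dataclass
class Workload:
    latency_budget_s: float  # max acceptable response time, in seconds
    daily_requests: int      # expected request volume per day

def pick_tier(w: Workload) -> str:
    """Map a workload onto one of the three frontiers:
    cost at scale, latency, or raw intelligence."""
    if w.daily_requests > 10_000_000:
        # Internet-scale moderation: cost per call dominates.
        return "cost-optimized"
    if w.latency_budget_s < 2.0:
        # Customer support: the answer must arrive before
        # the customer disengages.
        return "latency-optimized"
    # Long-running work like coding: quality is paramount,
    # even if a task takes 45 minutes.
    return "intelligence-optimized"

# Example: a support chatbot with a tight response window.
print(pick_tier(Workload(latency_budget_s=1.0, daily_requests=50_000)))
```

The point is not the specific thresholds but that "which model is best" is workload-dependent: the same provider may field a different model for each frontier.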

I’m curious about why agentic systems aren’t widely adopted yet. The models seem ready, and demos are impressive, but we haven’t seen the expected major shifts. What’s holding them back?

Agentic technology is still young, only around two years old, and much of the infrastructure is missing. We lack standardized patterns for auditing agent actions and authorizing access to data. Developing production-ready patterns takes time, and production often lags behind technological capability. The real challenge is getting that intelligence into production.

In software engineering, progress is rapid because agentic systems fit naturally into the development cycle. Working within a dev environment allows for safe experimentation, followed by testing. At Google, code changes go through dual audits, which protects quality and brand integrity. These human-in-the-loop processes make adoption low-risk. We need to adapt these patterns to other sectors and professions.
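The dual-audit, human-in-the-loop pattern described above can be sketched as a simple review gate: an agent-proposed change only lands after two distinct humans sign off. This is a toy illustration; the class, field names, and two-approval threshold are assumptions, not a description of Google's actual review tooling.

```python
from dataclasses import dataclass, field

@dataclass
class AgentChange:
    """A code change proposed by an agent, awaiting human review."""
    diff: str
    approvals: set = field(default_factory=set)

    def approve(self, reviewer: str) -> None:
        # A set keeps approvals distinct: the same reviewer
        # approving twice still counts once.
        self.approvals.add(reviewer)

    def can_land(self, required: int = 2) -> bool:
        # Dual audit: the change lands only after `required`
        # distinct humans have signed off.
        return len(self.approvals) >= required

change = AgentChange(diff="+ fix off-by-one in pagination")
change.approve("reviewer_a")
print(change.can_land())  # one approval is not enough
change.approve("reviewer_b")
print(change.can_land())  # second distinct reviewer unlocks the merge
```

The design choice worth noting is that the gate sits outside the agent: the agent can propose freely, but the authority to land a change stays with humans, which is what makes experimentation low-risk.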
