Gimlet Labs Tackles AI Inference Bottleneck with an Elegant Solution

Stanford adjunct professor and successful founder Zain Asgar raised an $80 million Series A for a startup tackling the AI inference bottleneck. The round was led by Menlo Ventures.

The company, Gimlet Labs, claims to have created the first “multi-silicon inference cloud,” software that enables AI workloads to run simultaneously across various hardware types, including CPUs, AI-tuned GPUs, and high-memory systems.

“We basically run across whatever different hardware that’s available,” Asgar told TechCrunch.

A single agent can chain multiple steps that each need different hardware: prefill is compute-bound, decode is memory-bound, and tool calls are network-bound, as lead investor Menlo Ventures’ Tim Tully noted in a blog post about the funding.

No chip yet does it all, but as new hardware emerges and older GPUs are reused, “the multi-silicon fleet is ready — it’s just missing the software layer to make it work,” according to Tully, who believes Gimlet Labs provides that layer.

Given the current trend of increased compute deployment, McKinsey estimates data center spending could reach nearly $7 trillion by 2030. Asgar notes that existing hardware is used only “15 to 30 percent” of the time.

“You’re wasting hundreds of billions of dollars because idle resources are being left,” he said. “Our goal was to make AI workloads 10x more efficient today.”

Asgar and cofounders Michelle Nguyen, Omid Azizi, and Natalie Serrino developed orchestration software that distributes agentic workloads across all hardware types.

Gimlet Labs claims to speed up AI inference by 3x to 10x at the same cost and power. It does this by slicing a model into parts, running those parts across different architectures, and using the best-suited chip for each part.
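To make the idea concrete, here is a minimal sketch of bottleneck-based placement. It is not Gimlet's software or API. The device names, the `Stage` type, the `assign_devices` helper, and the mapping from each device to the resource it handles best are all assumptions made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical device pools, each mapped to the resource it is assumed to
# handle best. These names are illustrative, not Gimlet's actual fleet.
DEVICE_STRENGTHS = {
    "gpu": "compute",
    "high_memory_system": "memory",
    "cpu": "network",
}


@dataclass(frozen=True)
class Stage:
    """One step of an agentic workload and the resource that limits it."""
    name: str
    bottleneck: str  # "compute", "memory", or "network"


def assign_devices(stages, strengths=DEVICE_STRENGTHS):
    """Map each stage to the device whose assumed strength matches its bottleneck."""
    # Invert the table so we can look up a device by resource.
    by_resource = {resource: device for device, resource in strengths.items()}
    return {stage.name: by_resource[stage.bottleneck] for stage in stages}


# A toy agent with one compute-bound, one memory-bound, and one network-bound step.
agent_steps = [
    Stage("compute_bound_step", "compute"),
    Stage("decode", "memory"),
    Stage("tool_call", "network"),
]

plan = assign_devices(agent_steps)
for step, device in plan.items():
    print(f"{step} -> {device}")
```

A real orchestrator would also have to weigh device availability, data movement between chips, and cost, none of which this toy lookup models.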

The company has partnered with NVIDIA, AMD, Intel, ARM, Cerebras, and d-Matrix.

Gimlet’s product, available as software or an API through Gimlet Cloud, targets large AI model labs and data centers rather than ordinary AI app developers.

The company publicly launched in October with reported eight-figure revenues. Asgar stated the customer base has doubled in four months, including a major model maker and a large cloud computing company.

The cofounders previously worked at Pixie, a startup that built an open source observability tool for Kubernetes. Pixie raised a $9 million Series A led by Benchmark and was acquired by New Relic in 2020.

After a chance meeting between Asgar and Tully and angel investments from Stanford professors, VCs became interested. A term sheet arrived after the launch, quickly resulting in an oversubscribed funding round.

Including its prior seed round, Gimlet has raised $92 million. Its angel investors include Sequoia’s Bill Coughran, Stanford Professor Nick McKeown, former VMware CEO Raghu Raghuram, and Intel CEO Lip-Bu Tan. The company employs 30 people.

Other investors include Factory, which led the seed round, as well as Eclipse Ventures, Prosperity7, and Triatomic.
