At GTC last week, Jensen Huang told 30,000 attendees that future data centers will become “token factories,” a concept that aligns with the mission of a small Israeli startup, NeuReality. This company, based in Caesarea and known for its NR-NEXUS inference operating system, recently appointed Shalini Agarwal, a Google Labs product management director, as a strategic adviser for market strategy, according to a press release on Monday.
The move signals NeuReality’s growing ambitions as it shifts from custom silicon design for AI inference to software that turns scattered GPU clusters into efficient, production-grade inference systems.
Agarwal brings nearly 20 years of product strategy experience at major tech companies, having led AI-focused product management at Google Labs. Before that, she spent almost ten years at eBay and holds degrees from MIT in computer science, electrical engineering, and management science. Though her role is advisory, Agarwal joins NeuReality’s leadership alongside co-founder and CEO Moshe Tanach and president Hiren Majmudar, a former GlobalFoundries and Intel Capital executive who joined in September 2024.
The appointment is well timed: NeuReality launched NR-NEXUS on March 12, promoting it as a hardware-agnostic operating system for AI factories. The platform distributes tasks across varied hardware, including GPUs, CPUs, and network interface cards, to make fuller use of valuable accelerators that often sit underutilized. The company has beta customers testing the software but has yet to disclose which organizations are involved.
NeuReality’s product emerges at a time when inference economics are a crucial focus in enterprise AI. Deloitte estimates that inference workloads accounted for half of AI computing in 2025 and forecasts that they will reach two-thirds this year. Major companies are investing heavily, with Amazon planning $200 billion in spending for 2026 and Google budgeting between $175 billion and $185 billion, according to recent earnings reports. However, much of this investment flows through a few vertically integrated stacks, limiting options for enterprises that want to run inference on diverse hardware.
NeuReality aims to fill this gap with NR-NEXUS, designed for any CPU, GPU, or NIC, including NVIDIA’s upcoming Vera Rubin architecture. The target market includes neocloud providers, enterprises developing their own inference capabilities, and semiconductor vendors seeking a complete software layer for their chips.
The company has raised about $70 million so far, including a $35 million Series A round in late 2022 led by Samsung Ventures, with support from investors like OurCrowd and SK Hynix, followed by a $20 million round in March 2024, led by the European Innovation Council Fund and existing backers. EU support positions NeuReality within a wider European AI infrastructure initiative, although its engineering hub remains in Israel.
Agarwal’s role is focused on market strategy rather than product engineering, an acknowledgment that developing an inference operating system is only part of the challenge. The other part is convincing infrastructure buyers, many of whom have strong ties to NVIDIA’s software ecosystem, that the startup’s orchestration layer is worth integrating.
The success of NR-NEXUS will depend on execution amid competition from well-funded startups. Modal Labs is reportedly valued at $2.5 billion, Baseten secured a $300 million round at a $5 billion valuation, and Fireworks AI raised $250 million. While each approaches inference optimization differently, all target the same opportunity: as AI shifts from training to deployment, controlling the inference layer means controlling more of the value chain.
For NeuReality, bringing in an adviser with Google-level product insight may seem modest on paper. However, it reflects a strategic bet that the next phase of AI infrastructure will reward companies that can connect silicon to the enterprises that need scalable, efficient model deployment across existing hardware.
