LLM-powered systems are gradually entering production, yet this transition poses challenges not typical in traditional software practices. The non-deterministic nature of models and agents complicates testing changes, understanding failures, and confidently deploying updates. This has led to a demand for new evaluation tools crafted specifically for the distinct qualities of LLMs.
Comet is a platform bringing Roots and MLOps into the swiftly changing domain of agent-based systems by treating prompts, tools, and workflows as components that can be optimized and enhanced over time.
Gideon Mendels is the co-founder and CEO of Comet. He previously worked at Google on hate speech and deception detection and founded GroupWize, which developed and deployed NLP models for billions of chat interactions. In this episode, Gideon joins Kevin Ball to explore how agent development bridges software engineering and ML, the critical role of evals for many AI teams, prompt optimization viewed as a search problem, and the future of continuously improving agents in production.
Full Disclosure: This episode is sponsored by Comet.

Kevin Ball, or KBall, is the vice president of engineering at Mento and offers independent coaching for engineers and engineering leaders. He co-founded and was CTO for two companies, founded the San Diego JavaScript meetup, and organizes the AI in Action discussion group through Latent Space.
Please click here to see the transcript of this episode.