Vespa AI: Pushing the Boundaries of Vector Search

Vector search has become an essential tool in modern search and retrieval systems, including the RAG pipelines driving many AI applications. But as retrieval systems grow more demanding, the limitations of relying on a single vector similarity score are becoming apparent.

Vespa is a well-known open-source search and data-serving engine. At the heart of Vespa's design is tensor-based retrieval, which represents data as tensors rather than simple vectors. This approach enables richer mathematical operations and more flexible ranking functions, addressing the constraints of a single vector similarity score.
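To see why a single similarity score can be limiting, consider a minimal Python sketch (not Vespa's API): a ranker that orders documents purely by cosine similarity, next to one that blends similarity with other per-document signals. The `freshness` and `keyword_score` fields and the weights are hypothetical, chosen only for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def rank_by_similarity(query_vec, docs):
    """Single-score ranking: order documents purely by vector similarity."""
    return sorted(docs, key=lambda d: cosine(query_vec, d["embedding"]),
                  reverse=True)

def rank_blended(query_vec, docs, w_sim=0.6, w_fresh=0.2, w_kw=0.2):
    """Richer ranking function: blend similarity with other signals
    (a hypothetical freshness score and keyword-match score)."""
    def score(d):
        return (w_sim * cosine(query_vec, d["embedding"])
                + w_fresh * d["freshness"]
                + w_kw * d["keyword_score"])
    return sorted(docs, key=score, reverse=True)
```

With real data, the blended ranker can promote a slightly less similar but fresher, better-matching document that a pure similarity ranker would bury, which is the kind of flexibility richer ranking functions provide.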

Radu Gheorghe, a software engineer at Vespa, has nearly 12 years of consulting and training experience with Elasticsearch and Solr. In this episode, Radu joins Sean Falconer to explore why vector similarity alone falls short in production, how tensor-based retrieval supports more complex ranking functions, the trade-offs in chunking and multi-stage re-ranking architectures, and the future directions of AI search.

Full Disclosure: This episode is sponsored by Vespa.

Sean has been an academic, startup founder, and Googler. He has published works on various topics from AI to quantum computing. Currently, Sean is an AI Entrepreneur in Residence at Confluent where he focuses on AI strategy and thought leadership. You can connect with Sean on LinkedIn.

Please click here to see the transcript of this episode.
