The surveillance technology industry is under scrutiny, with controversies over U.S. Immigration and Customs Enforcement's use of Flock’s camera network for monitoring and backlash against home camera maker Ring over features that let law enforcement request footage from homeowners. These episodes have sparked a wide debate about safety, privacy, and who should be allowed to conduct surveillance.
Despite the controversy, market demand persists, and advances in vision-language models are helping companies build new surveillance products for premises monitoring.
Matan Goldner, co-founder and CEO of video surveillance startup Conntour, emphasized the importance of ethics in this field, saying his company is selective about its clientele. That may not seem like a practical approach for a young startup, but Conntour can afford to be selective because it already has large government and publicly listed clients, including Singapore’s Central Narcotics Bureau.
“Our major clients enable us to choose them and maintain control… We ensure we know who uses it, what the use case is, and choose what we find moral and legal. We make informed decisions based on clients we feel comfortable with regarding their usage,” Goldner told TechCrunch in an exclusive interview.
This strategy has not hindered Conntour, whose selectivity has attracted investor attention. The company recently secured a $7 million seed round from General Catalyst, Y Combinator, SV Angel, and Liquid 2 Ventures.
Goldner revealed that the funding round concluded in just 72 hours. “I arranged around 90 meetings over eight days, and within three days, we started on Monday and wrapped up by Wednesday afternoon,” he said.
Conntour’s caution might prove prudent, especially given the growing capability of AI tools in the industry. The company’s video platform uses AI models that let security personnel query camera feeds in natural language to locate any object, person, or scenario in real time, essentially a specialized search engine for security video feeds. It also monitors feeds autonomously, identifying threats based on predefined rules and issuing automatic alerts.
Unlike traditional systems, which rely on preset definitions to detect objects, motion patterns, or behaviors, Conntour says its system offers flexibility and usability through natural language and vision-language models. It can rapidly search recorded footage or live feeds to find, for instance, “instances of someone in sneakers passing a bag in the lobby.”
Users can also ask questions about footage and receive text answers accompanied by the corresponding video, and the platform can generate incident reports.
The platform’s scalability is a major selling point. Goldner explained that it stands apart from other AI video search services because it is designed to scale efficiently to systems with thousands of camera feeds. Conntour’s system can handle up to 50 camera feeds on a single consumer GPU such as Nvidia’s RTX 4090.
The system achieves this by using multiple models and logic systems and choosing which of them to apply to each query, minimizing computing power while still delivering the best results.
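Conntour has not published how it picks models for a query, so the following is only a minimal sketch of the general idea of cost-aware routing: try the cheapest component that can plausibly answer the query, and fall back to a heavier vision-language model otherwise. Every name here (`Model`, `route`, `KNOWN_LABELS`, the relative cost values) is a hypothetical illustration, not part of Conntour's product.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical label set that a lightweight object detector already supports.
KNOWN_LABELS = {"person", "car", "bag", "truck"}


@dataclass
class Model:
    name: str
    cost: float  # relative compute cost per query (made-up units)
    can_handle: Callable[[str], bool]  # cheap check: can this model answer?


def is_simple_label_query(query: str) -> bool:
    """True when the query is a single known object label."""
    return query.strip().lower() in KNOWN_LABELS


# Assumed two-tier setup: a cheap detector and an expensive general model.
MODELS: List[Model] = [
    Model("vision_language_model", cost=10.0, can_handle=lambda q: True),
    Model("object_detector", cost=1.0, can_handle=is_simple_label_query),
]


def route(query: str, models: List[Model] = MODELS) -> Model:
    """Return the cheapest model whose check accepts the query."""
    for model in sorted(models, key=lambda m: m.cost):
        if model.can_handle(query):
            return model
    raise ValueError(f"no model can handle query: {query!r}")


if __name__ == "__main__":
    for q in ["car", "someone in sneakers passing a bag in the lobby"]:
        print(q, "->", route(q).name)
```

In this sketch a plain label such as "car" goes to the cheap detector, while a free-form description falls through to the general model, so the expensive path runs only when it is needed.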
Conntour claims its system can be implemented fully on-premises, entirely in the cloud, or a combination of both. It can integrate with existing security systems or operate as a standalone surveillance platform.
However, a persistent challenge in the video surveillance industry is that results are only as good as the captured footage. Poor-quality video, such as footage of a dimly lit parking lot shot on a low-resolution camera with a dirty lens, makes it hard to identify details.
Goldner noted that Conntour addresses this by attaching a confidence score to search results. If a camera feed is of poor quality, the system returns its results with low confidence.
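To illustrate how a confidence score can travel with search results, here is a small sketch that ranks matches and flags the weak ones instead of hiding them. The `Match` fields, the `label_matches` helper, and the 0.5 threshold are assumptions made for the example; the article does not describe Conntour's actual scoring or thresholds.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Match:
    camera_id: str
    timestamp_s: float  # offset into the recording, in seconds
    confidence: float   # 0.0 (no confidence) to 1.0 (certain)


def label_matches(
    matches: List[Match], low_threshold: float = 0.5
) -> List[Tuple[Match, str]]:
    """Sort matches by confidence (highest first) and tag weak ones.

    Low-confidence matches are kept, not dropped, so an operator can
    still review footage from a poor-quality camera.
    """
    ranked = sorted(matches, key=lambda m: m.confidence, reverse=True)
    return [
        (m, "low confidence" if m.confidence < low_threshold else "ok")
        for m in ranked
    ]


if __name__ == "__main__":
    results = [
        Match("parking_lot_cam", 12.0, 0.31),  # e.g. dim, low-resolution feed
        Match("lobby_cam", 40.5, 0.92),
    ]
    for match, tag in label_matches(results):
        print(f"{match.camera_id} @ {match.timestamp_s}s: "
              f"{match.confidence:.2f} ({tag})")
```

Keeping the low-confidence hits visible, but labeled, leaves the final judgment with the operator rather than silently discarding results from degraded cameras.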
Looking ahead, Goldner said the primary technical challenge is bringing full LLM capability into the system without sacrificing efficiency.
“We aim to provide complete natural language flexibility, akin to LLM, to facilitate any query. Simultaneously, efficiency is crucial as processing thousands of feeds is daunting. This contradiction is the foremost technical challenge and hurdle in our field, which we are diligently addressing.”
