Recent advancements in database technology have introduced AI-powered SQL functions that utilize large language models (LLMs) to interpret natural language queries. This innovation allows users to ask complex questions, such as identifying negative product reviews or tracking resolved customer support tickets.
However, adoption of these functions has been limited by cost and performance: each LLM call adds significant latency and expense to a query. To address this, Google Cloud has developed proxy models, which are lightweight, cost-optimized models trained for a specific query. These models can replace most LLM calls during query execution, yielding faster and cheaper results.
Understanding Proxy Models
Proxy models leverage rich data embeddings generated by advanced embedding models like Gemini. By processing these embeddings, proxy models can deliver semantic understanding similar to that of LLMs while maintaining low latency and cost. The key advantage is that embeddings are created once and reused across queries, significantly reducing the overall computational burden.
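To make this concrete, here is a minimal sketch of the idea in Python. The random vectors and the single linear layer are illustrative stand-ins (the real systems use embeddings from a model like Gemini and a trained proxy); the point is that the per-row work at query time is a cheap dot product over precomputed embeddings, not an LLM call.

```python
import numpy as np

# Hypothetical setup: embeddings are computed once (e.g., by an embedding
# model) and stored alongside the rows. Random vectors stand in for them.
rng = np.random.default_rng(0)
DIM = 8
row_embeddings = rng.normal(size=(100, DIM))  # precomputed once, reused

# A proxy model can be as small as a linear layer over the embedding:
# w.x + b > 0  =>  "this row matches the query's predicate".
w = rng.normal(size=DIM)
b = 0.0

def proxy_predict(emb: np.ndarray) -> np.ndarray:
    """Score precomputed embeddings; no LLM call per row."""
    return emb @ w + b > 0

mask = proxy_predict(row_embeddings)
print(mask.shape)  # one boolean decision per row
```

Because the embeddings already exist, running a different proxy model for a different query reuses the same vectors; only the tiny classifier changes.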
Despite their efficiency, proxy models are approximations and may not match the performance of LLMs in all scenarios. They excel in cases where semantic patterns can be detected but may struggle with more complex reasoning tasks.
How Proxy Models Operate
To illustrate the functionality of proxy models, consider a query that filters movie reviews based on specific criteria. The process involves several steps:
- Creating a training set from the input data.
- Using an LLM to label the data as relevant or not.
- Training a proxy model based on these labels.
- Evaluating the model's performance on a test set.
- Deciding whether to use the proxy model or revert to LLM inference based on evaluation results.
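The steps above can be sketched end to end. Everything here is a simplified stand-in: `llm_label` mocks the expensive LLM call, the "proxy" is a nearest-centroid classifier over synthetic embeddings, and the 90% accuracy bar is an arbitrary example threshold, not the systems' actual criterion.

```python
import numpy as np

rng = np.random.default_rng(1)
DIM = 4

# Synthetic embeddings drawn from two clusters, so there is a
# learnable "semantic pattern" for the proxy to pick up.
pos = rng.normal(loc=+1.0, size=(60, DIM))
neg = rng.normal(loc=-1.0, size=(60, DIM))
embs = np.vstack([pos, neg])

def llm_label(emb):
    """Stand-in for an expensive LLM call labeling one row."""
    return emb.mean() > 0  # pretend verdict: relevant / not relevant

# Steps 1-2: sample a training set and label it with the LLM.
idx = rng.permutation(len(embs))
train, test = idx[:40], idx[40:80]
y_train = np.array([llm_label(embs[i]) for i in train])

# Step 3: train a tiny proxy -- here, nearest centroid per class.
c_pos = embs[train][y_train].mean(axis=0)
c_neg = embs[train][~y_train].mean(axis=0)

def proxy(emb):
    return np.linalg.norm(emb - c_pos) < np.linalg.norm(emb - c_neg)

# Step 4: evaluate against LLM labels on a held-out test set.
y_test = np.array([llm_label(embs[i]) for i in test])
preds = np.array([proxy(embs[i]) for i in test])
accuracy = (preds == y_test).mean()

# Step 5: keep the proxy only if it is accurate enough; otherwise
# fall back to per-row LLM inference.
use_proxy = accuracy >= 0.9
print(f"accuracy={accuracy:.2f}, use_proxy={use_proxy}")
```

Note that the LLM is still called, but only on the small train/test samples; the proxy then handles the remaining rows if it passes evaluation.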
This process can be executed on-the-fly in BigQuery, while AlloyDB precomputes proxy models to optimize performance further.
Performance Insights
Research indicates that proxy models can achieve accuracy levels comparable to LLMs across various benchmarks. In many cases they even outperform LLMs, because they are trained on many labeled samples for the specific task at hand. However, they face limitations in scenarios requiring extreme selectivity, where positive training examples are scarce.
Comparing Proxy Models and Vector Search
While proxy models might seem similar to vector search methods, they serve different purposes. A proxy model is a task-specific classifier that emits a yes/no decision per row, trained on LLM labels for the query at hand; vector search ranks rows by a generic distance function against a query embedding and cannot tailor its decision boundary to the predicate being evaluated.
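The contrast can be shown in a few lines. This is an illustrative sketch with random vectors: the weight vector `w` stands in for parameters a proxy model would learn from LLM labels, which is the tailoring that a fixed distance function lacks.

```python
import numpy as np

rng = np.random.default_rng(2)

# Vector search: rank rows by a generic similarity to one query embedding.
def cosine_sim(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

query = rng.normal(size=8)
rows = rng.normal(size=(5, 8))
ranked = sorted(range(5), key=lambda i: -cosine_sim(rows[i], query))

# Proxy model: a learned classifier emits a task-specific decision per
# row instead of a similarity score against a single anchor point.
w = rng.normal(size=8)   # in practice, learned from LLM-provided labels
decisions = rows @ w > 0

print(ranked, decisions)
```

Vector search answers "which rows are nearest to this vector?"; a proxy model answers "does this row satisfy this predicate?", which is what a SQL filter actually needs.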
Conclusion
As AI functions become integral to database operations, the development of proxy models represents a significant leap forward in efficiency and cost-effectiveness. By utilizing these models, organizations can enhance their SQL queries, making them faster and more affordable. Ongoing research aims to refine these models further, potentially expanding their applicability to more complex queries.