Large Language Models (LLMs) have revolutionized the field of artificial intelligence, demonstrating remarkable capabilities across a wide range of tasks. However, to fully harness their potential, LLMs must be able to interact with the real world through tools. As the number of available tools continues to grow, identifying and invoking the most relevant tool for a given task becomes a critical challenge. Existing retrieval methods, such as BM25 and dense retrieval techniques, provide foundational solutions but fall short in adaptability and precision when dealing with large or complex toolsets: they often fail to scale effectively, struggle with nuanced user intents, and do not fully exploit the contextual understanding of modern LLMs.
A team of researchers from Google has proposed Re-Invoke, an unsupervised tool retrieval framework that leverages LLMs to better match user intents with tools. The system enriches tool documents with synthetic queries generated by LLMs, improving their representation in an embedding space. User queries are then analyzed to extract the underlying intents, which are encoded into the same space. By leveraging the power of LLMs at both ends, Re-Invoke addresses the critical need for a scalable and effective tool retrieval solution, enabling LLMs to unlock their full potential and deliver more comprehensive and accurate responses.
Re-Invoke has two core components:
- Query Generator: LLMs generate diverse synthetic queries relevant to each tool. These queries are appended to the original tool documents, creating augmented tool documents that represent the tool’s capabilities more comprehensively. The augmented documents are encoded into an embedding space to facilitate similarity-based retrieval.
- Intent Extractor: User queries are processed by LLMs to identify their core tool-related intents. These intents are encoded into the same embedding space to ensure compatibility with the tool representations (a minimal sketch of both components follows this list).
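The following is a minimal Python sketch of how these two components could be wired together. The `llm` and `embed` callables, the prompts, and the helper names are illustrative assumptions, not the paper's actual implementation.

```python
from typing import Callable, List

# Hypothetical interfaces (assumptions for illustration): `llm` maps a prompt to
# generated text, `embed` maps text to a vector. Any instruction-tuned LLM and
# text-embedding model could back them.
LLM = Callable[[str], str]
Embedder = Callable[[str], List[float]]

def generate_synthetic_queries(tool_doc: str, llm: LLM, n: int = 5) -> List[str]:
    """Query Generator: ask the LLM for diverse queries the tool could serve."""
    prompt = (
        f"Given the tool description below, write {n} diverse user queries "
        f"this tool could handle, one per line.\n\nTool:\n{tool_doc}"
    )
    return [q.strip() for q in llm(prompt).splitlines() if q.strip()]

def augment_and_embed_tool(tool_doc: str, llm: LLM, embed: Embedder) -> List[List[float]]:
    """Append each synthetic query to the tool document, then embed each view."""
    queries = generate_synthetic_queries(tool_doc, llm)
    return [embed(f"{tool_doc}\n\nExample query: {q}") for q in queries]

def extract_intent_embeddings(user_query: str, llm: LLM, embed: Embedder) -> List[List[float]]:
    """Intent Extractor: pull out tool-related intents and embed each one."""
    prompt = (
        "List the distinct tool-related intents in the request below, "
        f"one per line.\n\nRequest: {user_query}"
    )
    intents = [i.strip() for i in llm(prompt).splitlines() if i.strip()]
    return [embed(i) for i in intents]
```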
Re-Invoke then employs multi-view similarity ranking: similarity scores are computed between each extracted intent and the augmented tool documents, and these scores are aggregated across intents to rank tools, prioritizing the most relevant ones.
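A sketch of how such multi-view ranking could look, assuming each tool contributes several augmented-document embeddings and each query one or more intent embeddings; the max-over-views, max-over-intents aggregation shown here is an illustrative choice, not necessarily the paper's exact scheme.

```python
import numpy as np

def rank_tools(intent_vecs: np.ndarray,
               tool_vecs: dict[str, np.ndarray],
               top_k: int = 5) -> list[tuple[str, float]]:
    """intent_vecs: (num_intents, d); tool_vecs: tool name -> (num_views, d)."""
    def cosine(a: np.ndarray, b: np.ndarray) -> np.ndarray:
        a = a / np.linalg.norm(a, axis=-1, keepdims=True)
        b = b / np.linalg.norm(b, axis=-1, keepdims=True)
        return a @ b.T  # (num_intents, num_views) similarity matrix

    scores = {}
    for name, views in tool_vecs.items():
        sims = cosine(intent_vecs, views)       # intent-vs-view similarities
        per_intent = sims.max(axis=1)           # best-matching view per intent
        scores[name] = float(per_intent.max())  # keep the strongest intent match
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
```

Taking the maximum over views rewards a tool whose best synthetic query matches an intent, which mirrors the augmentation idea; averaging across views or intents would be an equally plausible aggregation choice.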
In evaluations, Re-Invoke demonstrated significant performance gains over state-of-the-art methods, achieving a 20% relative improvement in nDCG@5 for single-tool retrieval and a 39% improvement in multi-tool retrieval on the ToolE dataset. Integrating Re-Invoke with LLM agents also led to higher task success rates, confirming its efficacy in real-world applications.
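For readers unfamiliar with the metric, nDCG@5 is the standard normalized discounted cumulative gain truncated at rank 5; a textbook implementation (not tied to the paper's evaluation code) looks like this:

```python
import math

def ndcg_at_k(relevances: list[float], k: int = 5) -> float:
    """relevances: graded relevance of retrieved tools, in ranked order."""
    def dcg(rels: list[float]) -> float:
        # Discounted cumulative gain over the top-k positions.
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))
    ideal = dcg(sorted(relevances, reverse=True))  # best possible ordering
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```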
In conclusion, Re-Invoke presents a robust, unsupervised framework to address the tool retrieval problem for LLMs, combining intent extraction and document augmentation to achieve superior results. Its scalable design, enhanced retrieval accuracy, and adaptability to diverse toolsets make it a valuable advancement in the field, empowering LLMs to interact more effectively with external tools and deliver higher performance across tasks.
Pragati Jhunjhunwala is a consulting intern at MarktechPost. She is currently pursuing her B.Tech from the Indian Institute of Technology (IIT), Kharagpur. She is a tech enthusiast with a keen interest in software and data science applications, and she is always reading about developments in different fields of AI and ML.