This AI Paper Introduces Agents: An Open-Source Python Framework for Autonomous Language Agents


In tasks like customer service, consulting, programming, writing, and teaching, language agents can reduce human effort and are a potential first step toward artificial general intelligence (AGI). Recent demonstrations of language agents’ potential, including AutoGPT and BabyAGI, have attracted considerable attention from researchers, developers, and general audiences.

Even for seasoned developers and researchers, most of these demos and repositories are difficult to customize, configure, and deploy as new agents. The restriction stems from the fact that these demonstrations are frequently proofs of concept that highlight the potential of language agents, rather than substantial frameworks for incrementally developing and customizing agents.

Furthermore, studies show that most of these open-source repositories cover only a small fraction of core language agent abilities, such as task decomposition, long-term memory, web navigation, tool usage, and multi-agent communication. Additionally, most (if not all) of the language agent frameworks currently in use rely solely on a brief task description and entirely on the LLM's ability to plan and act. Due to the high randomness and inconsistency across different runs, language agents are difficult to modify and tune, and the user experience suffers.

Researchers from AIWaves Inc., Zhejiang University, and ETH Zürich present AGENTS, an open-source library and framework to support LLM-powered language agents. The goal of AGENTS is to make language agent customization, tuning, and deployment as straightforward as possible, even for non-specialists, while still being easily extensible for programmers and researchers. The library offers the core capabilities listed below, which combine to make it a flexible platform for language agents:

Long-short-term memory: AGENTS incorporates memory components that allow language agents to routinely update a short-term working memory with a scratchpad, and to store and retrieve long-term memory using VectorDB and semantic search. Users can decide whether to give an agent long-term memory, short-term memory, or both by simply filling in a field in the configuration file.
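The general pattern can be sketched in plain Python. Note that the class and method names below are hypothetical illustrations, not the AGENTS library's actual API: a bounded scratchpad plays the role of short-term working memory, and a toy dot-product ranking stands in for VectorDB-backed semantic search.

```python
from dataclasses import dataclass, field

@dataclass
class AgentMemory:
    short_term: list = field(default_factory=list)   # scratchpad: recent entries only
    long_term: list = field(default_factory=list)    # (text, vector) pairs
    scratchpad_size: int = 5

    def remember(self, text, vector):
        # Update the working scratchpad, evicting the oldest entry when full.
        self.short_term.append(text)
        self.short_term = self.short_term[-self.scratchpad_size:]
        # Persist everything to the long-term store for later retrieval.
        self.long_term.append((text, vector))

    def retrieve(self, query_vector, k=1):
        # Toy stand-in for semantic search over a vector database:
        # rank stored entries by dot-product similarity to the query.
        def score(item):
            _, vec = item
            return sum(a * b for a, b in zip(query_vector, vec))
        ranked = sorted(self.long_term, key=score, reverse=True)
        return [text for text, _ in ranked[:k]]
```

In a real deployment the vectors would come from an embedding model and the store would be a vector database; the sketch only shows how the two memory tiers cooperate.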

Web navigation and tool usage: The capability of autonomous agents to use external tools and browse the internet is another crucial characteristic. AGENTS supports a few widely used external APIs and offers an abstract class that makes it simple for developers to incorporate other tools. By treating web search and navigation as specialized APIs, the framework also enables agents to browse the internet and gather information.
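The abstract-class pattern described above can be illustrated with a minimal sketch. The names `Tool`, `CalculatorTool`, and `ToolBox` are invented for this example and do not reflect the library's real class hierarchy; the point is that a new external API is integrated by subclassing one base class.

```python
from abc import ABC, abstractmethod

class Tool(ABC):
    """Base class: a new external tool is added by implementing run()."""
    name: str

    @abstractmethod
    def run(self, query: str) -> str:
        ...

class CalculatorTool(Tool):
    name = "calculator"

    def run(self, query: str) -> str:
        # Evaluate simple arithmetic; a real tool would call an external API.
        return str(eval(query, {"__builtins__": {}}))

class ToolBox:
    """Registry the agent consults to dispatch a tool call by name."""
    def __init__(self, tools):
        self.tools = {t.name: t for t in tools}

    def call(self, name, query):
        return self.tools[name].run(query)
```

Under this design, a web-search or navigation tool is just another subclass with the same `run()` interface, which is what lets the framework treat browsing as a specialized API.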

Multi-agent interaction: AGENTS supports customizable multi-agent systems alongside single-agent capabilities, which can be useful for applications like games, social experiments, and software development. The “dynamic scheduling” function in AGENTS is one new addition for multi-agent communication. Instead of scheduling the order in which agents act with hard-coded rules, dynamic scheduling establishes a controller agent that serves as a “moderator” and chooses which agent performs the next action based on their roles and recent history. This enables more flexible and natural communication among multiple agents. Developers can quickly alter the controller’s behavior by defining its rule in the configuration file using plain language.
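A hedged sketch of the idea: in AGENTS the controller is itself an LLM guided by a plain-language rule, so the deterministic keyword heuristic below is only an illustrative stand-in, and the agent roles are invented for this example. What matters is the shape of the mechanism: the next speaker is chosen from roles and recent history rather than a fixed turn order.

```python
def controller(agents, history):
    """Pick the next agent to act based on the most recent utterance."""
    if not history:
        return agents["moderator"]      # nobody has spoken: the moderator opens
    last = history[-1]
    if "?" in last:                     # a question was raised: the expert answers
        return agents["expert"]
    return agents["moderator"]          # otherwise the moderator steers the discussion

# Hypothetical two-agent setup.
agents = {"moderator": "Moderator", "expert": "Expert"}
history = ["Welcome everyone.", "What is dynamic scheduling?"]
```

Swapping the `controller` body for an LLM call conditioned on a natural-language rule recovers the configurable behavior the paper describes.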

Human-agent interaction: AGENTS supports human-agent interaction in both single-agent and multi-agent scenarios, enabling one or more humans to communicate and collaborate with language agents.

Controllability: Using a symbolic plan, also known as a standard operating procedure (SOP), AGENTS offers a novel paradigm for building controllable agents. An SOP is a graph of multiple states that describes the various circumstances an agent might face while carrying out a task, along with the rules for transitioning between those states. Much like an SOP in the real world, an SOP in AGENTS is a carefully documented set of step-by-step instructions that specify how an agent or group of agents should carry out a particular task or procedure. An LLM can generate SOPs, which the user can then edit while customizing and fine-tuning the agent. After deployment, an agent follows the instructions and guidelines defined for each state and dynamically updates its current state in response to interactions with the environment, humans, or other agents. The symbolic plan makes it possible to exert fine-grained control over an agent’s behavior, improving its stability and predictability while facilitating tuning and optimization.
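The state graph described above can be sketched as a small dictionary-based state machine. The states, instructions, and transition triggers below are invented for illustration (a hypothetical customer-service flow) and are not taken from the paper; they only show how an agent's current state advances when a transition rule matches.

```python
# Each state carries an instruction for the agent plus transition rules:
# event name -> next state.
SOP = {
    "greet":    {"instruction": "Welcome the user and ask how to help.",
                 "next": {"issue_reported": "diagnose"}},
    "diagnose": {"instruction": "Ask clarifying questions about the issue.",
                 "next": {"issue_understood": "resolve"}},
    "resolve":  {"instruction": "Propose a fix and confirm it worked.",
                 "next": {"confirmed": "done"}},
    "done":     {"instruction": "Close the conversation politely.",
                 "next": {}},
}

def step(state, event):
    """Advance to the next state when the event matches a rule; else stay put."""
    return SOP[state]["next"].get(event, state)
```

In the framework's paradigm, an LLM would both draft such a graph and decide, from the interaction so far, which transition event has occurred; the per-state instructions are what give the developer fine-grained control.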

The team hopes that AGENTS makes it easier for researchers to study language agents, for developers to build applications on top of them, and for non-technical audiences to create and customize their own language agents.

Check out the Paper and GitHub. All credit for this research goes to the researchers on this project.


Dhanshree Shenwai is a computer science engineer with solid experience in FinTech companies covering the financial, cards & payments, and banking domains, and a keen interest in applications of AI. She is enthusiastic about exploring new technologies and advancements that make everyone’s life easier.
