An OpenAI spinoff has built an AI model that helps robots learn tasks like humans


The new model, called RFM-1, was trained on years of data collected from Covariant’s small fleet of item-picking robots that customers like Crate & Barrel and Bonprix use in warehouses around the world, as well as words and videos from the internet. In the coming months, the model will be released to Covariant customers. The company hopes the system will become more capable and efficient as it’s deployed in the real world. 

So what can it do? In a demonstration I attended last week, Covariant cofounders Peter Chen and Pieter Abbeel showed me how users can prompt the model using five different types of input: text, images, video, robot instructions, and measurements. 

For example, show it an image of a bin filled with sports equipment, and tell it to pick up the pack of tennis balls. The robot can then grab the item, generate an image of what the bin will look like after the tennis balls are gone, or create a video showing a bird’s-eye view of how the robot will look doing the task. 

If the model predicts it won’t be able to properly grasp the item, it might even type back, “I can’t get a good grip. Do you have any tips?” A response could advise it to use a specific number of the suction cups on its arms to give it a better grasp (eight versus six, for example).

This represents a leap forward, Chen told me, in robots that can adapt to their environment using training data rather than the complex, task-specific code that powered the previous generation of industrial robots. It’s also a step toward worksites where managers can issue instructions in human language without concern for the limitations of human labor. (“Pack 600 meal-prep kits for red pepper pasta using the following recipe. Take no breaks!”)

Lerrel Pinto, a researcher who runs the general-purpose robotics and AI lab at New York University and has no ties to Covariant, says that even though roboticists have built basic multimodal robots before and used them in lab settings, deploying one at scale that’s able to communicate in this many modes marks an impressive feat for the company. 

To outpace its competitors, Covariant will have to get its hands on enough data for the robot to become useful in the wild, Pinto told me. Warehouse floors and loading docks are where it will be put to the test, constantly interacting with new instructions, people, objects, and environments. 

“The groups which are going to train good models are going to be the ones that have either access to already large amounts of robot data or capabilities to generate those data,” he says.
