Will Scaling Solve Robotics?

The idea of solving the biggest robotics challenges by training large models is sparking debate.

Last year’s Conference on Robot Learning (CoRL) was the biggest CoRL yet, with over 900 attendees, 11 workshops, and almost 200 accepted papers. While there were a lot of cool new ideas (see this great set of notes for an overview of technical content), one particular debate seemed to be front and center: Is training a large neural network on a very large dataset a feasible way to solve robotics?

Of course, some version of this question has been on researchers’ minds for a few years now. However, in the aftermath of the unprecedented success of ChatGPT and other large-scale “foundation models” on tasks that were thought to be unsolvable just a few years ago, the question was especially topical at this year’s CoRL. Developing a general-purpose robot, one that can competently and robustly execute a wide variety of tasks of interest in any home or office environment that humans can, has been perhaps the holy grail of robotics since the inception of the field. And given the recent progress of foundation models, it seems possible that scaling existing network architectures by training them on very large datasets might actually be the key to that grail.

Given how timely and significant this debate seems to be, I thought it might be useful to write a post centered around it. My main goal here is to try to present the different sides of the argument as I heard them, without bias towards any side. Almost all the content is taken directly from talks I attended or conversations I had with fellow attendees. My hope is that this serves to deepen people’s understanding around the debate, and maybe even inspire future research ideas and directions.

I want to start by presenting the main arguments I heard in favor of scaling as a solution to robotics.

Why Scaling Might Work

It worked for Computer Vision (CV) and Natural Language Processing (NLP), so why not robotics? This was perhaps the most common argument I heard, and the one that seemed to excite most people given recent models like GPT4-V and SAM. The point here is that training a large model on an extremely large corpus of data has recently led to astounding progress on problems thought to be intractable just 3-4 years ago.[…]

The idea of solving the biggest robotics challenges by training large models is sparking debate.

Why Scaling Might Work

Share this:

Related