Learning to Learn to Walk

Humans, Robots, and the Story of Embodied Intelligence

Do you remember how you learned to walk? For most of us, the answer is “no.” We might know how to walk, but we do not generally remember how we learned. This is because humans develop our motor skills before the higher-level reasoning abilities that would let us explain to ourselves how we learned them. Adults can study the process of learning to walk academically and explain it to other adults; but this is an ex post, generalized explanation, rather than one drawn from first-hand experience. Could you explain how you personally learned to walk?

This is an important problem in understanding “embodied intelligence” — the idea that intelligence is closely connected to physical experience, and follows from it. As researchers seek to build artificial general intelligence and superintelligence, the road forks between theories of embodied and “disembodied” intelligence.

There is one story of intelligence that suggests that the ability to reason about the world emerges top-down. In this story, higher-level reasoning and language emerge independently of embodiment, and then enable systems to explain the world, including its physical elements, through raw computational power. In this view, superintelligence could be born in a computer cluster, and then come to conquer all of physical reality.

This is the disembodied model, which informs much of the cutting-edge work in AI: LLMs do not have bodies, but they build their “sense” of the world from enormous amounts of text data, and at least appear to be meaningfully intelligent by traditional measures. They’re better than a lot of people at a lot of things already, and may be better than most people at most things, or everyone at everything, not too long from now.

On the other side, the embodiment hypothesis argues that embodiment — the physical substantiation of an intelligent being, able to both perceive and interact with the physical world — is a crucial component of human-level intelligence. Under the embodiment hypothesis, lower-level sensorimotor capabilities give rise to higher-level reasoning.

Consider the way a human child first experiences the world. All her senses are new, and she perceives the shapes and colors of the world for the first time. She has no understanding of these things as abstract concepts yet, nor the means to label them as “shapes” or “colors.” But eventually she learns to control her limbs, to stand up, and to walk. She explores her physical surroundings through interaction, thanks to her newly acquired sensorimotor skills. She can’t talk about walking, yet. Her understanding of the physical world is intuitive. An understanding of gravity, for instance, isn’t the result of reasoning through Newton’s law of universal gravitation, but of experiencing it constantly as a physically existing person on Earth. She might drop something! Our intelligence, in this way, is principally physical, or embodied.

Until we see artificial general intelligence (AGI) in the wild, we will not know definitively which of these theories of intelligence is correct. We may be surprised! But if the embodiment hypothesis turns out to be correct, then robotics will be front and center in the race to build ever more capable AIs.

One thing that’s clear: in thousands of years of human dreaming, we have always dreamt of embodied intelligences, and of ones that remind us of ourselves. And a second thing that’s clear: the dramatic successes of disembodied intelligence (via LLMs) over the last two years have ignited a fire under efforts to build embodied intelligence.


In ancient mythology, intelligent machines were described less like robots in our contemporary understanding, and more like mechanical, artificial people. Among Hephaestus’ many creations was Talos, a giant bronze automaton programmed to patrol and protect the island of Crete. His other creations included metal bulls, horses, and various other automata, generally understood to be mechanical, artificial beings. Daedalus, a legendary craftsman of Greek mythology, was said to have created various automata, or living statues.

The concept of a robot existed long before the engineering capacity to build one. In the 15th century, Leonardo da Vinci designed a mechanical knight, a suit of armor equipped with pulleys that allowed it to move much like a human would. This system could be considered the first design of a humanoid robot. And the design demonstrated a key point about robotic motion: the systems allowing humans to move the way we do are not uniquely biological, and can be approximated or reproduced mechanically.

These early concepts and designs of robots all pre-date the term ‘robot’, which originated in 1920 in a science fiction play titled “R.U.R.” (Rossum’s Universal Robots) by Czech playwright Karel Čapek. In the play, robots are artificial beings made of organic material that work for humans, differing from the mechanical automata commonly found in history and mythology. Perhaps the most important element of Čapek’s description of robots is that their purpose is to perform labor for humans. The term ‘robot’ comes from a Slavic root related to compulsory work; the Czech word ‘robota’ referred to the forced labor performed by serfs.

In the 1950s, American inventor George Devol designed the Unimate, the world’s first industrial robot. The Unimate consisted of a single hydraulically actuated arm with a gripper attached, connected to a base that stored the system’s memory. Devol and entrepreneur Joseph Engelberger founded Unimation to commercialize the Unimate.

In 1961, a Unimate was installed at a General Motors factory in New Jersey and put to work on the assembly line. It performed tasks that posed a danger to human workers, such as die casting. The robot later made an appearance on the Tonight Show, where it demonstrated its prowess by putting a golf ball and pouring a beer.

Industrial robots developed rapidly through the latter half of the 20th century. These robots were largely specialized machines that could be programmed to perform a set of tasks in a specific environment. Advances in robot mechanics and design, such as the creation of the Stanford Arm, a 6-axis articulated robot, enabled greater industrial use. Entrepreneurs and engineers built robots for other applications too, and robots made their way outside the factory into environments like the operating room, the battlefield, and even space! Research-oriented companies like Boston Dynamics spun out of universities in the 1990s to develop humanoids and quadrupeds.

As for autonomous machines, it’s a parallel story. A few years after the Unimate was brought to market, a robotics project of great consequence was underway in the Bay Area. At the Stanford Research Institute, a team of researchers funded by DARPA was working on Shakey the Robot, the first mobile robot able to reason about its own actions and surroundings. Bringing together research from computer vision, navigation, and other fields, Shakey could plan tasks far more autonomously than existing robots. As a research project, Shakey had significant implications for computer science more broadly, leading to, among other things, the A* pathfinding algorithm.
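A* remains a workhorse of robot navigation and path planning to this day. As a rough, illustrative sketch (not Shakey’s actual implementation), here is a minimal A* search over a small occupancy grid in Python, using a Manhattan-distance heuristic; the grid, start, and goal below are made up for the example:

```python
import heapq

def a_star(grid, start, goal):
    """Find a shortest path on a 2D grid of 0s (free) and 1s (walls)
    using A* search with a Manhattan-distance heuristic."""
    def h(cell):
        # Manhattan distance: admissible on a 4-connected grid
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    open_heap = [(h(start), 0, start)]   # entries are (f = g + h, g, cell)
    came_from = {start: None}
    best_g = {start: 0}

    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            # Reconstruct the path by walking back through predecessors
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        if g > best_g[cell]:
            continue  # stale heap entry; a cheaper route was already found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = ng
                    came_from[(nr, nc)] = cell
                    heapq.heappush(open_heap, (ng + h((nr, nc)), ng, (nr, nc)))
    return None  # no path exists

# Hypothetical example: navigate around a wall in a tiny occupancy grid
grid = [[0, 0, 0, 0],
        [1, 1, 1, 0],
        [0, 0, 0, 0]]
print(a_star(grid, (0, 0), (2, 0)))
```

The same idea scales from toy grids like this one to the roadmaps and cost maps real robots plan over: an admissible heuristic steers the search toward the goal without sacrificing optimality.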

Today, Shakey, long retired, lives at the Computer History Museum in Mountain View.

Beginning in 2004, DARPA reprised its role as a key supporter of autonomous robotics with the DARPA Grand Challenge, a series of competitions focused mostly on developing autonomous ground vehicles (though certain challenges targeted autonomy in other domains). Commercially, the race to develop self-driving passenger vehicles accelerated as companies raised capital and demonstrated quite sophisticated capabilities. In the past year or so, U.S. cities like San Francisco and Austin have seen autonomous vehicles become generally available to the public. Autonomous robots have also found their way into our homes as vacuums, which get more advanced every year.

But there has been a great divergence between the commercial outcomes of robotics companies and those of their Silicon Valley peers working on software. At a time when internet companies rapidly changed the world and became the most important part of the economy, robotics companies were far less successful. Even beyond commercial outcomes, one can easily see that robots, defined however broadly, are nowhere near as ubiquitous as other types of computers, from PCs to smartphones. Operating in the physical world is hard, and robotics as a field has found it far more difficult to eat the world the way software did.

In the early 2020s, however, some could sense a shift in robotics. This shift wasn’t necessarily based on a singular breakthrough moment in research, nor on one in productization – there hasn’t been an ‘iPhone moment’ or ‘ChatGPT moment’ in robotics yet, and maybe there never will be. It was, perhaps more than anything else, a vibe shift, one that resulted from many developments in concert with one another, partly driven by the accelerating pace of artificial intelligence breakthroughs in other modalities like language and image generation. For many, the ability to sit and talk to a computer via ChatGPT or Claude raised the question of why we still weren’t talking to robots, “face to face.”

From 2022 onwards, a flurry of research papers from leading robotics labs at Stanford, Berkeley, Carnegie Mellon, and Google DeepMind showed real progress toward advanced capabilities in robot learning. And many of the researchers who authored the notable papers began leaving their research posts to start commercial efforts, which is one sure indication of optimism.

Today, a generation of commercial labs and researchers is pursuing the hypothesis that, just as scaling data and compute led to breakthroughs in modalities like language, a similar approach can work for robotic actions.

The question then becomes how to obtain robot data at scale, and a number of companies, armed with large amounts of capital, are setting out to prove this hypothesis and, in the process, answer that question – whether through teleoperation, simulation, video data, or some other method or data source. Other companies are focused on developing more capable or affordable hardware for robotics that can serve as a platform for applications of robotic intelligence. Still others are taking full-stack approaches, building both their own hardware and their own intelligence – in the form of humanoids or some other embodiment altogether.

The thematic shift here is that we are focused on building not just robots, but embodied intelligence. And as our own intelligence emerges from the sensorimotor level upwards, we see in embodied intelligence, and in how robots learn, a sliver of a reflection of how we learn. Thanks to the AI revolution, we don’t have to tell robots how to walk. We can teach them to learn to walk. Perhaps we have always conceived of ‘artificial’ intelligences as embodied ones because embodied intelligence is the type that most closely resembles our own.

The rest of the story of robotics remains to be written, but it’s a story whose destination has captured the human imagination for much of history. If mythology, science fiction, and the arc of the modern robotics industry are any indication, embodied intelligence has always been the dream. We want TARS, and C-3PO, and an ancient automaton patrolling Crete on behalf of the gods.

A new generation of researchers is optimistic that this dream can be realized. And today, they’re armed with the capital and the engineering progress to chase it. For those who want the intelligence of the future to be embodied, it is full steam ahead.