The field of robotics has advanced tremendously in recent years. Groups like Boston Dynamics have made remarkable progress in building humanoid robots with unprecedented sensing, locomotion, and manipulation capabilities. Their machines can now walk, run, jump, and flip with an agility and dexterity that begin to rival those of humans.
We are also witnessing astounding breakthroughs in artificial intelligence, especially in large language models such as GPT-3. These neural networks can interpret and generate nuanced text at scale, showcasing an impressive capacity for semantic understanding. Some believe these models exhibit early signs of reasoning, creativity, and even rudimentary common sense.
As AI and robotics continue rapid development along complementary trajectories, there is intriguing potential to unite their strengths. One vision is of nimble humanoid robots that act as dexterous physical vessels for large language models. The AI core imbues the machine with substantial knowledge and task comprehension gleaned from its vast training, while custom mechanical design provides capable mobility and manipulation.
Such integration could accelerate progress toward adaptable, trainable generalist robots. For instance, the language model can translate abstract instructions into executable robot scripts, grounding them in actions the machine can actually perform. This moves beyond rigid pre-programming by allowing more flexible control through conversational commands. The robot’s physical embodiment handles motion and perception, while drawing on the AI’s distilled knowledge for higher-level reasoning, planning, and skill building.
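To make the idea of grounding concrete, here is a minimal Python sketch, not a description of any real robot or model API. It assumes a hypothetical set of primitive actions (`move_to`, `grasp`, `release`, `rotate_wrist`) and replaces the language model with a canned stand-in, `fake_language_model`. The part worth noticing is the validation step: whatever text the model produces is only turned into robot steps if every line names an action the robot is known to support.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical primitive actions a robot controller might expose.
# These names are assumptions for illustration, not a real API.
ALLOWED_ACTIONS = {"move_to", "grasp", "release", "rotate_wrist"}


@dataclass(frozen=True)
class Step:
    action: str    # one of ALLOWED_ACTIONS
    argument: str  # the object or location the action applies to


def fake_language_model(instruction: str) -> str:
    """Stand-in for a real language model: returns a canned plan as text."""
    canned_plans = {
        "put the cup on the shelf":
            "move_to cup\ngrasp cup\nmove_to shelf\nrelease cup",
    }
    # Unknown instructions yield an empty plan rather than a guess.
    return canned_plans.get(instruction.lower().strip(), "")


def parse_plan(plan_text: str) -> List[Step]:
    """Turn model output into validated steps, rejecting unknown actions."""
    steps = []
    for line in plan_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        parts = line.split(maxsplit=1)
        if len(parts) != 2:
            raise ValueError(f"Malformed step: {line!r}")
        action, argument = parts
        if action not in ALLOWED_ACTIONS:
            raise ValueError(f"Unsupported action: {action!r}")
        steps.append(Step(action, argument))
    return steps


def ground_instruction(instruction: str) -> List[Step]:
    """Map a freeform instruction to a list of executable robot steps."""
    return parse_plan(fake_language_model(instruction))
```

Calling `ground_instruction("Put the cup on the shelf")` returns four `Step` objects in order, while a plan line such as `fly kitchen` raises a `ValueError` instead of reaching the hardware. In a real system the stand-in would be replaced by an actual model call, and the allowed actions would come from the robot’s own controller.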
The combined competencies could make humanoids versatile across environments built for humans, such as homes and offices. They could adapt to new situations and tasks by drawing on the digitized written content they were trained on, much of which describes the human experience. That broad world knowledge should markedly improve their ability to interpret and follow freeform instructions.
One example is opening jars. Through lifelong experience, humans develop an intuitive understanding of how to grip various jars and how much torque is needed to open them. An AI trained on wikiHow articles, books, and videos that capture techniques for opening different types of jars could accrue substantial knowledge here. It could then guide the precise hand and arm movements of an advanced humanoid, integrating rich information on grip, torque, friction, and leverage that may be prohibitively difficult to code by hand for narrowly pre-programmed robots.
The promise of such AI-robot integration is enticing but remains distant. While great progress is being made in each field, integrating them into one smoothly functioning platform faces tremendous challenges. AI safety is already a pressing concern for today’s highly specialized models, and systems with human-level reasoning would pose major unsolved risks. And despite their impressive acrobatic feats, today’s robots still fall well short of humans in mobility and dexterity.
Many experts believe that human-level AI, let alone superhuman AI, remains at least decades away, given the fundamental barriers across reasoning, knowledge representation, creativity, planning, and common sense. Robots likewise need substantial hardware improvements in sensing, actuation, materials, energy storage, and computation to achieve true versatility. Thoughtful, stepwise integration of narrow systems to expand capabilities makes sense, but premature convergence could compound risks.
So while the lure of android-like devices that merge the strengths of AI and robotics is understandable, the hype needs tempering with wisdom. The most prudent path is to nurture both fields with care and ethics, align systems with humanity’s interests, and integrate them gradually to expand our problem-solving capacity while avoiding potential pitfalls. Setting expectations too far beyond near-term realities may breed public disillusionment, unstable funding for long-term progress, and distraction from the granular technical work needed to realize this grand vision responsibly. Our aims are uplifting, but our choices must be wise.