AI and Robotics Advancements in 2024: A Comprehensive Overview

2024 is shaping up to be a pivotal year for artificial intelligence (AI) and robotics. Major tech companies and research institutions have already announced new robots, models, and products across a range of sectors. This article provides an overview of these developments.

Figure’s Autonomous Learning

Figure, a robotics company, has announced that its Figure-01 robot learned to make coffee after watching humans do so. The system learns tasks, including how to correct its own mistakes, by observing humans perform them. Figure describes the approach as end-to-end AI: its neural networks take video as input and produce motion trajectories as output.

Figure-01 has learned to make coffee ☕️

Our AI learned this after watching humans make coffee

This is end-to-end AI: our neural networks are taking video in, trajectories out

Join us to train our robot fleet: https://t.co/egQy3iz3Ky pic.twitter.com/Y0ksEoHZsW
— Brett Adcock (@adcock_brett) January 7, 2024

Google DeepMind’s Mobile ALOHA and Advanced Robotics Research

Google DeepMind, in collaboration with Stanford researchers, introduced Mobile ALOHA, a low-cost, open-source robotic system capable of completing complex tasks such as cooking and cleaning. The project shows that bimanual mobile manipulation, in which a mobile robot works with two arms, can be made accessible and reproducible.

Introducing 𝐌𝐨𝐛𝐢𝐥𝐞 𝐀𝐋𝐎𝐇𝐀🏄 — Hardware!
A low-cost, open-source, mobile manipulator.

One of the most high-effort projects in my past 5yrs! Not possible without co-lead @zipengfu and @chelseabfinn.

At the end, what's better than cooking yourself a meal with the 🤖🧑‍🍳 pic.twitter.com/iNBIY1tkcB
— Tony Z. Zhao (@tonyzzhao) January 3, 2024

In addition to Mobile ALOHA, Google DeepMind has announced a suite of robotics research advances: AutoRT for data collection, SARA-RT for faster transformers, and RT-Trajectory for better motion generalization. According to the company, these advances help robots make decisions faster and better understand and navigate their environments, a step toward more capable helper robots.

How could robotics soon help us in our daily lives? 🤖

Today, we’re announcing a suite of research advances that enable robots to make decisions faster as well as better understand and navigate their environments.

Here's a snapshot of the work. 🧵 https://t.co/rqOnzDDMDI pic.twitter.com/satbbGyltI
— Google DeepMind (@GoogleDeepMind) January 4, 2024

DeWave: Translating Thoughts into Text

Researchers have developed DeWave, an AI system that turns silent thoughts into text by decoding brain signals. The system reportedly achieved over 40% accuracy in translating verbs directly from neural signals, without the need for invasive implants.

🚨 NEW DeWave AI can read your brain signals and translate them into text 🤯 pic.twitter.com/SYRUF08qTL
— MachineAlpha ⭕️ (@Machine4lpha) January 3, 2024

Apple’s Ferret and Rumored Siri Upgrade

Apple researchers have released ‘Ferret’, an open-source multimodal large language model (LLM) that can answer queries about specific regions of an image. Combined with rumors of an upgraded version of Siri, this suggests Apple is stepping up its AI efforts.

Alibaba’s Lifelike AI Avatar-Maker

Alibaba researchers have released a lifelike AI avatar-maker called ‘Mach’. The system turns text prompts into realistic 3D avatars using an LLM and vision models.

AI boy- and girlfriends will soon become a reality!

MACH is able to generate fully-realized 3D characters with detailed facial features, hair, and clothing within 2 minutes which are highly-realistic and animatable. https://t.co/t1vYk4GXTc pic.twitter.com/RP7d91TxUJ
— Dreaming Tulpa 🥓👑 (@dreamingtulpa) January 7, 2024

UCLA and Snap’s Dual-Pivot Tuning

Researchers from UCLA and Snap have presented dual-pivot tuning, an approach that uses a person’s own photos to customize image restoration models so that they preserve that person’s facial features better than existing generic techniques do.

JPMorgan’s DocLLM

JPMorgan’s research team has introduced DocLLM, an AI model designed to understand complex business documents. DocLLM reportedly outperformed leading models, including GPT-4, by over 15% on some form analysis tasks.

OpenAI’s GPT Store

OpenAI has announced the upcoming launch of the GPT Store, a new distribution platform for builders of custom GPTs. The revenue-sharing split has not yet been specified.

Microsoft’s Copilot Apps and Keyboard Key

Microsoft has introduced new Android, iOS, and iPadOS apps for Copilot, along with a dedicated Copilot keyboard key coming to new devices. Through the Copilot apps, GPT-4, DALL-E 3, voice chat, and vision features are available for free, without a ChatGPT Plus subscription.

Addressing Biases in LLMs

New research has found that LLMs perform better when prompted to adopt a gender-neutral or male role than when prompted to adopt a female one. This underscores the importance of addressing biases that can creep into the training data of machine learning models.

In conclusion, 2024 is set to be a landmark year for AI and robotics, with major advancements expected across various sectors. These developments promise to revolutionize how we interact with technology, making our lives easier and more efficient.