A group of humanoid robots are on display at the entrance of an exhibition of the three-day World Artificial Intelligence Conference (WAIC) 2024 in Shanghai, east China, July 4, 2024. (Xinhua/Fang Zhe)
BEIJING, Aug. 29 (Xinhua) -- At the recently concluded World Robot Conference 2024 in Beijing, humanoid robots, celebrated for their unmatched adaptability to real-world scenarios, stole the spotlight. An impressive array of 27 robots took to the stage, setting a new benchmark not only in sheer numbers but also in sophistication.
This year's conference marks a significant departure from previous ones. Humanoid robots, "catalyzed" by large models, are undergoing a systemic evolution that goes beyond partial improvements to achieve holistic ones.
-- From isolated dexterity to systematic coordination
Say to a humanoid robot, "I'd like a latte". In no time, a freshly brewed cup of aromatic coffee is presented to you.
Tell another humanoid robot, Galbot, "Galbot, please get me an umbrella", and it extends its arms and springs into action immediately. It scans the assortment of items and skillfully coordinates its eyes and hands to pick out the umbrella. This impressive performance draws resounding applause from the audience.
At this year's conference, the competition among humanoid robots has transcended individual technological innovations and the comparison of specific "organ" specifications. The focus has "evolved" toward the integration and coordination between various "organs".
While grasping and organizing might seem simple, these two actions actually involve multiple core technologies, said Yao Tengzhou, co-founder of Beijing Galbot Co., Ltd.
Today, the "refined" hand capabilities of humanoid robots have moved beyond the dexterity of a single "organ" toward locally coordinated actions.
Equipping a robot with both tactile and visual-tactile perception allows it to precisely detect the location of forces and control their intensity, said Peng Zhihui, co-founder and CTO of Agibot.
"With the use of a stereo vision system, we have created the hand-eye servo system, control system, and motion system, bringing genuine eye-hand coordination to humanoid robots," said Dong Xiaojian, founder of Beijing Vizum Technology Co., Ltd.
-- From AI as "decorative touch" to AI as driving force
The fast-tracked evolution of humanoid robots owes much to AI, the "driver".
According to Yang Fengyu, founder and CEO of UniX AI, the profound integration between humanoid robots and artificial intelligence marks a significant trend in the robotics industry this year.
In the past, robots were confined to executing single tasks in a set environment due to their lack of autonomous motion control, said Xiong Youjun, General Manager of Beijing Embodied Artificial Intelligence Robotics Innovation Center.
Nowadays, the continuous evolution of humanoid robots is powered by AI-driven upgrades in both their "brain" and "cerebellum".
Ji Chao, Chief Scientist of iFLYTEK humanoid robot division, cited an example where the Spark large model has markedly elevated the intelligence level of humanoid robots in terms of complex task decomposition, object recognition in open scenarios, and multimodal perception and understanding.
The causal reasoning capabilities of large models have greatly enhanced robots' understanding of complex tasks, and enabled them to break down and plan tasks that align with common sense in the physical world, said Ji.
The combination of embodied perception models and embodied decision models has notably advanced the robots' multimodal perception and understanding in real-world scenarios.
Another significant function of large models is seen in the enhanced "cerebellum" of humanoid robots. This means that algorithms drive the motion control of humanoid robots and elevate their flexibility and coordination, said Xiong.
Since the start of this year, artificial intelligence has deeply permeated every stage of the evolution of humanoid robots.
Firstly, the perception system has evolved from basic environmental perception to sophisticated multimodal perception. Secondly, in motion control, robots have evolved from standing and walking to jumping and running, and from basic grasping to performing complex, delicate hand tasks. Thirdly, as for intelligent decision-making, there has been a shift from pre-defined behaviors to autonomous learning and decision-making. Lastly, in terms of interaction capabilities, robots have evolved from merely executing commands to understanding natural language and even recognizing emotion, according to Yang.
-- From product iteration to fast-tracked mass production
Recently, as several domestic humanoid robot companies have made significant technological breakthroughs, launched new products, and refreshed application scenarios, humanoid robots have been rapidly shifting from small-scale shipments to mass production.
Just a few days ago, Agibot unveiled its mass production schedule.
Agibot, which runs the first humanoid robot factory in Shanghai, has finalized the construction of its production lines and completed the staffing for its Phase I factory. Production is scheduled to begin this October, with a planned monthly output exceeding 100 units and an anticipated total shipment of around 300 units for the year.
With fast-tracked mass production becoming a central focus for many companies, the industry is swiftly exploring new scenarios and pushing forward the integration of large models with embodied intelligence.
There are ongoing efforts to enrich the industry ecosystem through open-source initiatives and to strengthen cost control.
For example, the Beijing Embodied Artificial Intelligence Robotics Innovation Center is drawing in talent from across the globe to address global challenges in key, common technologies for humanoid robots.
iFLYTEK is forging deep connections with 420 robotics companies and 15,000 developers through its "Robotic Super brain Platform". The company is also partnering with companies and institutions like UBTECH, Agibot and Galbot to explore the integrated application of multimodal interaction solutions, jointly driving commercial implementation through technological iterations. (Edited by Yang Yifan with Xinhua Silk Road, yangyifan@xinhua.org)