The world model WHALE is here! - 瞭望新时代网 (Outlook New Era)

Sci-Tech

The world model WHALE is here!

2024-11-22   

Humans can imagine a world in their minds and predict how different actions would lead to different outcomes. Inspired by this ability, world models are designed to abstract the dynamics of the real world and answer "what if" questions about it. Embodied agents can therefore interact with a world model instead of the real environment, generating simulated data for downstream tasks such as counterfactual prediction, offline policy evaluation, and offline reinforcement learning. World models play a crucial role in decision-making in embodied environments because they allow exploration that would be costly in the real world. To support effective decision-making, a world model needs strong generalization to support imagination in out-of-distribution (OOD) regions, and reliable uncertainty estimates to assess the credibility of simulated experience. Both pose significant challenges for previous scalable methods.

Recently, researchers from Nanjing University, Nanqi Xiance and other institutions introduced WHALE (World models with beHavior-conditioning and retrAcing-rollout LEarning) in their paper. WHALE is a framework for learning generalizable world models, built from two key techniques that can be combined with any neural network architecture. Having identified policy distribution divergence as the main source of generalization error, the researchers introduce a behavior-conditioning technique to improve the world model's generalization. The technique builds on policy-conditioned model learning and lets the model actively adapt to different behaviors, reducing the extrapolation error caused by distribution shift.
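
To make the idea of imagining inside a behavior-conditioned world model concrete, here is a minimal toy sketch. It is not the WHALE implementation: the one-dimensional state, the `ToyWorldModel` class, the `behavior_embedding` function (a plain average standing in for a learned encoder) and the `towards_goal` policy are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

State = float
Action = float


def behavior_embedding(history: Sequence[Tuple[State, Action]]) -> float:
    """Stand-in for a learned behavior encoder: the mean action seen so far."""
    if not history:
        return 0.0
    return sum(action for _, action in history) / len(history)


@dataclass
class ToyWorldModel:
    """Hypothetical dynamics model whose prediction also depends on the embedding z."""
    sensitivity: float = 0.0  # how strongly z shifts the prediction (assumed form)

    def predict(self, state: State, action: Action, z: float) -> State:
        return state + action + self.sensitivity * z


def imagine(model: ToyWorldModel, policy: Callable[[State], Action],
            start: State, horizon: int) -> List[State]:
    """Roll a policy out inside the model, never touching a real environment."""
    states = [start]
    history: List[Tuple[State, Action]] = []
    for _ in range(horizon):
        state = states[-1]
        action = policy(state)
        z = behavior_embedding(history)  # re-inferred from the imagined history
        states.append(model.predict(state, action, z))
        history.append((state, action))
    return states


def towards_goal(goal: State) -> Callable[[State], Action]:
    """Toy policy: move towards the goal by at most one unit per step."""
    return lambda state: max(-1.0, min(1.0, goal - state))


trajectory = imagine(ToyWorldModel(sensitivity=0.0), towards_goal(3.0), 0.0, 5)
print(trajectory)  # [0.0, 1.0, 2.0, 3.0, 3.0, 3.0]
```

The imagined trajectory can then feed a downstream task such as offline policy evaluation. In a real system both the encoder and the dynamics would be learned networks, and the embedding would let the model adapt its predictions to the behavior that generated the data.
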
In addition, the researchers propose a simple and effective technique called retracing-rollout for efficient uncertainty estimation of the model's imagination. As a plug-and-play solution, it can be applied to end-effector pose control across a variety of embodied tasks without any changes to the training process. Combining the two techniques, the researchers propose WHALE-ST, a scalable world model based on a spatiotemporal transformer and aimed at more effective decision-making. They further propose WHALE-X, a 414M-parameter world model pre-trained on 970K robot demonstrations. Finally, they report extensive experiments demonstrating the scalability and generalization of WHALE-ST and WHALE-X on simulated and real-world tasks, highlighting their effectiveness in enhancing decision-making.

To evaluate the generalization of WHALE-X in a real physical environment, the research team ran comprehensive experiments on an ARX5 robot. Unlike the pre-training data, the evaluation tasks use different camera angles and backgrounds, which makes them harder for the world model. For fine-tuning, the team collected 60 trajectories for each task: unboxing, pushing, throwing, and moving bottles. They also designed several tasks the model had never encountered, to test its visual, motion, and task generalization.
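
The article does not explain how retracing-rollout produces an uncertainty estimate, so the following is only one plausible reading, sketched on a toy problem. It assumes that end-effector actions are pose deltas that can be summed, so the same target pose can be predicted either in one step from the latest pose or by "retracing" to an earlier pose and applying the merged action. Disagreement among those predictions is then used as the uncertainty. The function names and both toy models are hypothetical.

```python
from statistics import pstdev
from typing import Callable, List, Sequence

# Hypothetical model interface: (pose, delta_action) -> predicted next pose.
Model = Callable[[float, float], float]


def rollout(model: Model, start: float, actions: Sequence[float]) -> List[float]:
    """Imagine a trajectory step by step; returns len(actions) + 1 poses."""
    poses = [start]
    for action in actions:
        poses.append(model(poses[-1], action))
    return poses


def retracing_uncertainty(model: Model, poses: Sequence[float],
                          actions: Sequence[float], max_retrace: int) -> float:
    """Spread of predictions for the pose reached after the last action.

    poses[i] is the pose before actions[i]. For each k, go back k steps and
    apply the sum of the remaining delta actions in a single model call.
    """
    t = len(actions) - 1
    predictions = []
    for k in range(min(max_retrace, t) + 1):
        merged_action = sum(actions[t - k:])  # assumes delta-pose actions add up
        predictions.append(model(poses[t - k], merged_action))
    return pstdev(predictions)


def exact_model(pose: float, delta: float) -> float:
    """Toy model that matches additive dynamics perfectly."""
    return pose + delta


def flawed_model(pose: float, delta: float) -> float:
    """Toy model with an error that grows with the size of the action."""
    return pose + delta - 0.1 * delta * delta


actions = [1.0, 1.0, 1.0]
u_exact = retracing_uncertainty(exact_model, rollout(exact_model, 0.0, actions), actions, 2)
u_flawed = retracing_uncertainty(flawed_model, rollout(flawed_model, 0.0, actions), actions, 2)
print(u_exact, u_flawed)
```

In this sketch the self-consistent model yields zero disagreement and the flawed model yields a positive value. Nothing here touches training, which is consistent with the plug-and-play description, but the actual procedure in the paper may differ.
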
The results show that WHALE-X has clear advantages in the real world. Compared with a model without behavior-conditioning, WHALE-X's consistency improves by 63%, indicating that the mechanism significantly enhances OOD generalization. WHALE-X pre-trained on 970K samples achieves higher consistency than a model trained from scratch, highlighting the benefit of large-scale Internet data pre-training. Increasing the number of model parameters also improves the world model's generalization: the WHALE-X-base (203M) dynamics model achieves a consistency rate three times higher than the 77M version on three unseen tasks. In addition, video generation quality is in line with the consistency results. Together, the behavior-conditioning strategy, the large-scale pre-training dataset, and the scaled-up model parameters significantly improve the model's OOD generalization, especially in generating high-quality videos. (New Society)

Editor: Yao Jue  Responsible editor: Xie Tunan

Source: People's Posts and Telecommunications News
