Embodied World Models for Decision Making

NeurIPS 2025 Workshop, San Diego

Dec 6 or 7 (Whole-Day Workshop)

Room: TBD




Overview

World models infer and predict real-world dynamics by modeling the external environment, and have become a cornerstone of embodied artificial intelligence. They have powered recent progress in decision-making and planning for interactive agents. This workshop aims to bring together researchers working at the intersection of generative modeling, reinforcement learning, computer vision, and robotics to explore the next generation of embodied world models, that is, learned models through which agents understand, predict, and interact with the world. By emphasizing embodiment and decision-making, we aim to advance world models beyond passive sequence prediction toward active, goal-directed interaction with both physical and simulated worlds.

Topics of Interest

We welcome contributions that advance theoretical foundations, algorithmic innovations, or real-world applications of world models. Topics of interest include (but are not limited to):
  • Model-based reinforcement learning and long-horizon planning. Investigating how world models can benefit model-based reinforcement learning with a focus on sample efficiency, performance, and scalability. Particular attention is given to long-horizon planning, which requires the agent to reason over extended sequences of actions, anticipate delayed outcomes, and maintain coherent strategies across temporally distant states and goals, often under uncertainty and limited feedback.
  • Aligning simulation and real-world physics for robot learning. Investigating how to bridge the gap between simulated and real-world physics to enhance robot learning. This includes using generative models to improve perception, planning, and control by capturing physical dynamics more accurately, modeling uncertainty and feedback effects, and learning diffusion-based policies that transfer robustly from simulation to the real world.
  • Interactive scene generation and downstream tasks. Building models that generate physically plausible and semantically coherent interactive video simulations. Focus areas include action-conditioned scene synthesis, controllable simulation of agent-environment dynamics, and the development of evaluation techniques and benchmarks that assess video fidelity, temporal consistency, and task-relevant controllability for downstream applications such as planning and policy learning.
  • Video-language-action (VLA) models and leveraging the world knowledge encoded in large language models (LLMs). Studying large-scale pretrained models that unify video, language, and action representations to support robust and generalizable policy learning. Core areas include curating diverse multi-modal datasets, improving cross-modal alignment, developing parameter-efficient fine-tuning methods, and enabling agents to follow complex, language-guided instructions in both simulated and real-world settings. We also explore how the structured and unstructured world knowledge embedded in large language models can be exploited to guide agents’ decision-making.
  • Applications in broader domains, such as open-world video games and autonomous driving. Extending world models to embodied agents in both real-world environments and high-fidelity simulators. Key topics include integrating perception with control, sim-to-real transfer, continual learning and adaptation, and deploying agents in open-ended tasks such as Minecraft, autonomous driving, and interactive real-world scenarios.

Call for Papers

We invite submissions of original research papers related to building physically plausible world models.

Submission Types:

  • Opinion Papers (max 4 pages) - For preliminary results, interesting applications, or novel ideas that did not pan out in practice.
  • Research Papers (4 to 9 pages) - For original research contributions.

Submission Guidelines:

  • Submit your paper via OpenReview.
  • Please follow the style guidelines of NeurIPS 2025.
  • Papers are non-archival - we welcome submissions that have been submitted to or accepted by other venues.
  • Papers already accepted to NeurIPS 2025 will undergo an expedited review process primarily evaluating their relevance to the workshop themes.
  • All accepted papers will be presented in a poster session.

Important Dates:

  • Submission Deadline: September 1, 2025 11:59PM UTC-0
  • Notification of Acceptance: September 15, 2025 UTC-0

Schedule

  • 09:00am - 09:15am Welcome Speech
  • 09:15am - 09:45am Invited Talk 1: (including 5 min Q&A)
  • 09:45am - 10:15am Invited Talk 2: (including 5 min Q&A)
  • 10:15am - 10:30am Invited Talk 3: (including 5 min Q&A)
  • 10:30am - 11:00am Oral Presentation
  • 11:00am - 11:30am Invited Talk 4: (including 5 min Q&A)
  • 11:30am - 12:00pm Invited Talk 5: (including 5 min Q&A)
  • 12:00pm - 01:00pm Lunch Break & Lunch Buddy
  • 01:00pm - 02:00pm Poster Session
  • 02:00pm - 02:30pm Invited Talk 6: (including 5 min Q&A)
  • 02:30pm - 03:00pm Invited Talk 7: (including 5 min Q&A)
  • 03:00pm - 03:15pm Coffee Break
  • 03:15pm - 03:45pm Invited Talk 8: (including 5 min Q&A)
  • 03:45pm - 04:15pm Invited Talk 9: (including 5 min Q&A)
  • 04:15pm - 05:15pm Panel Discussion
  • 05:15pm - 05:30pm Paper Award & Closing Remarks

Invited Speakers & Panelists

Chelsea Finn

Stanford University & Physical Intelligence

Yilun Du

Harvard

Xiaolong Wang

UC San Diego

David Hsu

National University of Singapore

Peter Stone

University of Texas at Austin

John Langford

Microsoft Research New York

Pablo Samuel Castro

DeepMind & Mila

Elias Bareinboim

Columbia University

Jiajun Wu

Stanford University

Lin Shao

National University of Singapore

Sanja Fidler

University of Toronto & NVIDIA

Organizers

Qi Wang

SJTU

Mengyue Yang

University of Bristol

Huazhe Xu

Tsinghua University

Xin Jin

Eastern Institute of Technology, Ningbo

Nedko Savov

INSAIT, Sofia University

Guozheng Ma

Nanyang Technological University

Bo Liu

Meta FAIR

Yongquan ‘Owen’ Hu

National University of Singapore

Jenny Zhang

University of British Columbia

Nicklas Hansen

UC San Diego

Luc Van Gool

INSAIT, Sofia University

Contact

For questions about the workshop, please contact us at:

worldmodels.workshop@gmail.com