#World Models

58 posts

[Paper Review] YoCausal: How Far is Video Generation from World Model? A Causality Perspective

[Paper Review] Causal Forcing++: Scalable Few-Step Autoregressive Diffusion Distillation for Real-Time Interactive Video Generation

[Paper Review] Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising

[Paper Review] Cortex 2.0: Grounding World Models in Real-World Industrial Deployment

[Paper Review] Neural Computers

[Paper Review] Omni-WorldBench: Towards a Comprehensive Interaction-Centric Evaluation for World Models

[Paper Review] FluidWorld: Reaction-Diffusion Dynamics as a Predictive Substrate for World Models

[Paper Review] Reward Prediction with Factorized World States

[Paper Review] WorldCache: Accelerating World Models for Free via Heterogeneous Token Caching

[Paper Review] Chain of World: World Model Thinking in Latent Motion

[Paper Review] Causal-JEPA: Learning World Models through Object-Level Latent Interventions

[Paper Review] RISE: Self-Improving Robot Policy with Compositional World Model

[Paper Review] WorldCompass: Reinforcement Learning for Long-Horizon World Models

[Paper Review] OdysseyArena: Benchmarking Large Language Models For Long-Horizon, Active and Inductive Interactions

[Paper Review] Research on World Models Is Not Merely Injecting World Knowledge into Specific Tasks

[Paper Review] Advancing Open-source World Models

[Paper Review] Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models

[Paper Review] SurgWorld: Learning Surgical Robot Policies from Videos via World Modeling

[Paper Review] Act2Goal: From World Model To General Goal-conditioned Policy

[Paper Review] The World is Your Canvas: Painting Promptable Events with Reference Images, Trajectories, and Text

[Paper Review] Visionary: The World Model Carrier Built on WebGPU-Powered Gaussian Splatting Platform

[Paper Review] MIND-V: Hierarchical Video Generation for Long-Horizon Robotic Manipulation with RL-based Physical Alignment

[Paper Review] UnityVideo: Unified Multi-Modal Multi-Task Learning for Enhancing World-Aware Video Generation

[Paper Review] TV2TV: A Unified Framework for Interleaved Language and Video Generation

[Paper Review] EgoLCD: Egocentric Video Generation with Long Context Diffusion

[Paper Review] Does Hearing Help Seeing? Investigating Audio-Video Joint Denoising for Video Generation

[Paper Review] Target-Bench: Can World Models Achieve Mapless Path Planning with Semantic Targets?

[Paper Review] SRPO: Self-Referential Policy Optimization for Vision-Language-Action Models

[Paper Review] 10 Open Challenges Steering the Future of Vision-Language-Action Models

[Paper Review] Scaling Agent Learning via Experience Synthesis

[Paper Review] How Far Are Surgeons from Surgical World Models? A Pilot Study on Zero-shot Surgical Video Generation with Expert Assessment

[Paper Review] LLMs as Scalable, General-Purpose Simulators For Evolving Digital Agent Training

[Paper Review] PhysMaster: Mastering Physical Representation for Video Generation via Reinforcement Learning

[Paper Review] CoIRL-AD: Collaborative-Competitive Imitation-Reinforcement Learning in Latent World Models for Autonomous Driving

[Paper Review] Dyna-Mind: Learning to Simulate from Experience for Better AI Agents

[Paper Review] PhysWorld: From Real Videos to World Models of Deformable Objects via Physics-Aware Demonstration Synthesis

[Paper Review] OmniNWM: Omniscient Driving Navigation World Models