# Diffusion Models

357 posts

[Paper Review] HiFi-Inpaint: Towards High-Fidelity Reference-Based Inpainting for Generating Detail-Preserving Human-Product Images

[Paper Review] Generated Reality: Human-centric World Simulation using Interactive Video Generation with Hand and Camera Control

[Paper Review] Causal Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactive Video Generation

[Paper Review] Talk2Move: Reinforcement Learning for Text-Instructed Object-Level Geometric Transformation in Scenes

[Paper Review] Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone

[Paper Review] ShowTable: Unlocking Creative Table Visualization with Collaborative Reflection and Refinement

[Paper Review] H2R-Grounder: A Paired-Data-Free Paradigm for Translating Human Interaction Videos into Physically Grounded Robot Videos

[Paper Review] UltraFlux: Data-Model Co-Design for High-quality Native 4K Text-to-Image Generation across Diverse Aspect Ratios

[Paper Review] SyncMV4D: Synchronized Multi-view Joint Diffusion of Appearance and Motion for Hand-Object Interaction Synthesis

[Paper Review] CVD-STORM: Cross-View Video Diffusion with Spatial-Temporal Reconstruction Model for Autonomous Driving

[Paper Review] Mind-the-Glitch: Visual Correspondence for Detecting Inconsistencies in Subject-Driven Generation

[Paper Review] LazyDrag: Enabling Stable Drag-Based Editing on Multi-Modal Diffusion Transformers via Explicit Correspondence

[Paper Review] MIDAS: Multimodal Interactive Digital-human Synthesis via Real-time Autoregressive Video Generation
