#Flow Matching

87 posts

[Paper Review] EVA01: Unified Native 3D Understanding and Generation via Mixture-of-Transformers

[Paper Review] SmartDirector: Keyframe-Conditioned Cinematic Video Generation with Narrative Pacing Control

[Paper Review] KVPO: ODE-Native GRPO for Autoregressive Video Alignment via KV Semantic Exploration

[Paper Review] PRISM: Prior Rectification and Uncertainty-Aware Structure Modeling for Diffusion-Based Text Image Super-Resolution

[Paper Review] Steering Visual Generation in Unified Multimodal Models with Understanding Supervision

[Paper Review] Trees to Flows and Back: Unifying Decision Trees and Diffusion Models

[Paper Review] ReImagine: Rethinking Controllable High-Quality Human Video Generation via Image-First Synthesis

[Paper Review] Cortex 2.0: Grounding World Models in Real-World Industrial Deployment

[Paper Review] LeapAlign: Post-Training Flow Matching Models at Any Generation Step by Building Two-Step Trajectories

[Paper Review] RewardFlow: Generate Images by Optimizing What You Reward

[Paper Review] Phantom: Physics-Infused Video Generation via Joint Modeling of Visual and Latent Physical Dynamics

[Paper Review] Woosh: A Sound Effects Foundation Model

[Paper Review] UniGRPO: Unified Policy Optimization for Reasoning-Driven Visual Generation

[Paper Review] WiT: Waypoint Diffusion Transformers via Trajectory Conflict Navigation

[Paper Review] Echoes Over Time: Unlocking Length Generalization in Video-to-Audio Generation Models

[Paper Review] Communication-Inspired Tokenization for Structured Image Representations

[Paper Review] SARAH: Spatially Aware Real-time Agentic Humans

[Paper Review] Xiaomi-Robotics-0: An Open-Sourced Vision-Language-Action Model with Real-Time Execution

[Paper Review] FLAC: Maximum Entropy RL via Kinetic Energy Regularized Bridge Matching

[Paper Review] Alleviating Sparse Rewards by Modeling Step-Wise and Long-Term Sampling Effects in Flow-Based GRPO

[Paper Review] Diversity-Preserved Distribution Matching Distillation for Fast Visual Synthesis

[Paper Review] DINO-SAE: DINO Spherical Autoencoder for High-Fidelity Image Reconstruction and Generation

[Paper Review] SAM Audio: Segment Anything in Audio

[Paper Review] Task adaptation of Vision-Language-Action model: 1st Place Solution for the 2025 BEHAVIOR Challenge

[Paper Review] SVG-T2I: Scaling Up Text-to-Image Latent Diffusion Model Without Variational Autoencoder

[Paper Review] TV2TV: A Unified Framework for Interleaved Language and Video Generation

[Paper Review] Generative Neural Video Compression via Video Diffusion Prior

[Paper Review] DiG-Flow: Discrepancy-Guided Flow Matching for Robust VLA Models

[Paper Review] TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models

[Paper Review] DeCo: Frequency-Decoupled Pixel Diffusion for End-to-End Image Generation

[Paper Review] Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation

[Paper Review] EVTAR: End-to-End Try on with Additional Unpaired Visual Reference

[Paper Review] UniLumos: Fast and Unified Image and Video Relighting with Physics-Plausible Feedback

[Paper Review] π_RL: Online RL Fine-tuning for Flow-based Vision-Language-Action Models

[Paper Review] CAR-Flow: Condition-Aware Reparameterization Aligns Source and Target for Better Flow Matching

[Paper Review] Latent Zoning Network: A Unified Principle for Generative Modeling, Representation Learning, and Classification

[Paper Review] From Editor to Dense Geometry Estimator

[Paper Review] EmbodiedOneVision: Interleaved Vision-Text-Action Pretraining for General Robot Control

[Paper Review] OneReward: Unified Mask-Guided Image Generation via Multi-Task Human Preference Learning

[Paper Review] TempFlow-GRPO: When Timing Matters for GRPO in Flow Models

[Paper Review] SonicMaster: Towards Controllable All-in-One Music Restoration and Mastering

[Paper Review] MIRO: MultI-Reward cOnditioned pretraining improves T2I quality and efficiency

[Paper Review] EnzyControl: Adding Functional and Substrate-Specific Control for Enzyme Backbone Generation

[Paper Review] The Principles of Diffusion Models

[Paper Review] Distilled Decoding 2: One-step Sampling of Image Auto-regressive Models with Conditional Score Distillation

[Paper Review] ACG: Action Coherence Guidance for Flow-based VLA models

[Paper Review] Equilibrium Matching: Generative Modeling with Implicit Energy-Based Models

[Paper Review] Deforming Videos to Masks: Flow Matching for Referring Video Segmentation

[Paper Review] Sample By Step, Optimize By Chunk: Chunk-Level GRPO For Text-to-Image Generation

[Paper Review] AlphaFlow: Understanding and Improving MeanFlow Models