#Image Editing

81 posts

[Paper Review] From Plans to Pixels: Learning to Plan and Orchestrate for Open-Ended Image Editing

[Paper Review] Edit-Compass & EditReward-Compass: A Unified Benchmark for Image Editing and Reward Modeling

[Paper Review] RewardFlow: Generate Images by Optimizing What You Reward

[Paper Review] FlowSlider: Training-Free Continuous Image Editing via Fidelity-Steering Decomposition

[Paper Review] ImagenWorld: Stress-Testing Image Generation Models with Explainable Human Evaluation on Open-ended Real-World Tasks

[Paper Review] GEditBench v2: A Human-Aligned Benchmark for General Image Editing

[Paper Review] Trust Your Critic: Robust Reward Modeling and Reinforcement Learning for Faithful Image Editing and Generation

[Paper Review] UniCom: Unified Multimodal Modeling via Compressed Continuous Semantic Representations

[Paper Review] InternVL-U: Democratizing Unified Multimodal Models for Understanding, Reasoning, Generation and Editing

[Paper Review] CARE-Edit: Condition-Aware Routing of Experts for Contextual Image Editing

[Paper Review] From Scale to Speed: Adaptive Test-Time Scaling for Image Editing

[Paper Review] DLEBench: Evaluating Small-scale Object Editing Ability for Instruction-based Image Editing Model

[Paper Review] Understanding vs. Generation: Navigating Optimization Dilemma in Multimodal Models

[Paper Review] FireRed-Image-Edit-1.0 Technical Report

[Paper Review] DeepGen 1.0: A Lightweight Unified Multimodal Model for Advancing Image Generation and Editing

[Paper Review] Rethinking Global Text Conditioning in Diffusion Transformers

[Paper Review] How Well Do Models Follow Visual Instructions? VIBE: A Systematic Benchmark for Visual Instruction-Driven Image Editing

[Paper Review] Rethinking Composed Image Retrieval Evaluation: A Fine-Grained Benchmark from Image Editing

[Paper Review] Re-Align: Structured Reasoning-guided Alignment for In-Context Image Generation and Editing

[Paper Review] ThinkRL-Edit: Thinking in Reinforcement Learning for Reasoning-Centric Image Editing

[Paper Review] VINO: A Unified Visual Generator with Interleaved OmniModal Context

[Paper Review] DreamOmni3: Scribble-based Editing and Generation

[Paper Review] SpotEdit: Selective Region Editing in Diffusion Transformers

[Paper Review] Both Semantics and Reconstruction Matter: Making Representation Encoders Ready for Text-to-Image Generation and Editing

[Paper Review] RePlan: Reasoning-guided Region Planning for Complex Instruction-based Image Editing

[Paper Review] Sparse-LaViDa: Sparse Multimodal Discrete Diffusion Language Models

[Paper Review] Exploring MLLM-Diffusion Information Transfer with MetaCanvas

[Paper Review] LongCat-Image Technical Report

[Paper Review] EditThinker: Unlocking Iterative Reasoning for Any Image Editor

[Paper Review] The Consistency Critic: Correcting Inconsistencies in Generated Images via Reference-Guided Attentive Alignment

[Paper Review] TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models

[Paper Review] MIRA: Multimodal Iterative Reasoning Agent for Image Editing

[Paper Review] DiffSeg30k: A Multi-Turn Diffusion Editing Benchmark for Localized AIGC Detection

[Paper Review] Controllable Layer Decomposition for Reversible Multi-Layer Image Generation

[Paper Review] UniREditBench: A Unified Reasoning-based Image Editing Benchmark

[Paper Review] OpenGPT-4o-Image: A Comprehensive Dataset for Advanced Image Generation and Editing

[Paper Review] EditScore: Unlocking Online RL for Image Editing via High-Fidelity Reward Modeling

[Paper Review] Seedream 4.0: Toward Next-generation Multimodal Image Generation

[Paper Review] Lavida-O: Elastic Large Masked Diffusion Models for Unified Multimodal Understanding and Generation

[Paper Review] Hyper-Bagel: A Unified Acceleration Framework for Multimodal Understanding and Generation

[Paper Review] LazyDrag: Enabling Stable Drag-Based Editing on Multi-Modal Diffusion Transformers via Explicit Correspondence

[Paper Review] Reconstruction Alignment Improves Unified Multimodal Models

[Paper Review] From Editor to Dense Geometry Estimator

[Paper Review] Discrete Noise Inversion for Next-scale Autoregressive Text-based Image Editing

[Paper Review] Follow-Your-Shape: Shape-Aware Image Editing via Trajectory-Guided Region Control

[Paper Review] Ming-Flash-Omni: A Sparse, Unified Architecture for Multimodal Perception and Generation

[Paper Review] Learning an Image Editing Model without Image Editing Pairs

[Paper Review] Factuality Matters: When Image Generation and Editing Meet Structured Visuals

[Paper Review] ChronoEdit: Towards Temporal Reasoning for Image Editing and World Simulation

[Paper Review] PICABench: How Far Are We from Physically Realistic Image Editing?

[Paper Review] ConsistEdit: Highly Consistent and Precise Training-free Visual Editing

[Paper Review] BLIP3o-NEXT: Next Frontier of Native Image Generation