#Text-Guided Image Editing

3개의 포스트

[논문리뷰] Talk2Move: Reinforcement Learning for Text-Instructed Object-Level Geometric Transformation in Scenes

본 논문은 기존 텍스트 기반 이미지 편집 모델이 객체 수준의 기하학적 변환(이동, 회전, 크기 조절)에 어려움을 겪는 문제를 해결하고자 합니다.

#Review #Reinforcement Learning #Text-Guided Image Editing #Object-Level Transformation #Geometric Transformation #Diffusion Models #GRPO #Scene Editing #Spatially Grounded Rewards

2026년 1월 5일

[논문리뷰] FlashEdit: Decoupling Speed, Structure, and Semantics for Precise Image Editing

이 논문은 확산 모델을 활용한 텍스트 기반 이미지 편집에서 발생하는 과도한 지연 시간, 배경 불안정성, 의미론적 얽힘 이라는 세 가지 주요 문제를 해결하는 것을 목표로 합니다. 연구의 궁극적인 목적은 속도와 품질 사이의 기존 트레이드오프를 극복하고 고품질의 실시간 이미지 편집 을 가능하게 하는 것입니다.

#Review #Text-Guided Image Editing #Diffusion Models #Real-Time Editing #One-Step Inversion #Attention Control #Background Preservation #Semantic Disentanglement

2025년 9월 29일

[논문리뷰] Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing

본 논문은 대규모, 고품질, 공개적으로 접근 가능한 텍스트 기반 이미지 편집 데이터셋의 부족으로 인해 제한되었던 연구 발전을 해소하는 것을 목표로 합니다. 실제 이미지를 기반으로 한 포괄적이고 다양한 데이터셋을 제공하여 차세대 텍스트 기반 이미지 편집 모델의 훈련 및 벤치마킹을 위한 견고한 기반을 구축하고자 합니다.

#Review #Text-Guided Image Editing #Large-Scale Dataset #Multimodal Models #Dataset Curation #Quality Control #Prompt Engineering #Preference Learning #Multi-Turn Editing

2025년 10월 23일