#Masked Image Modeling

4 posts

[Paper Review] TIPSv2: Advancing Vision-Language Pretraining with Enhanced Patch-Text Alignment

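Building on the insight that patch-level distillation substantially improves alignment, this paper proposes the TIPSv2 framework. The authors introduce iBOT++, which applies the loss to all patches rather than only the masked ones, pushing the student model to learn the teacher model's representations more strongly.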

#Review #Vision-Language Pretraining #Patch-Text Alignment #iBOT++ #Masked Image Modeling #Distillation #Head-only EMA

April 19, 2026
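
The TIPSv2 summary above contrasts a loss on masked patches only with a loss on all patches. The snippet below is a minimal sketch of that difference, not the authors' implementation: it assumes an iBOT-style cross-entropy between teacher and student patch distributions, and the function name `patch_distill_loss`, the temperature values, and the tensor shapes are all hypothetical choices made for illustration.

```python
import numpy as np


def log_softmax(x, axis=-1):
    # Numerically stable log-softmax.
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))


def patch_distill_loss(student_logits, teacher_logits, mask, all_patches,
                       student_temp=0.1, teacher_temp=0.04):
    """Cross-entropy between teacher and student patch distributions.

    student_logits, teacher_logits: arrays of shape (num_patches, dim).
    mask: boolean array of shape (num_patches,), True where the student
          input patch was masked.
    all_patches: if False, average over masked patches only (the
          masked-only variant); if True, average over every patch (the
          all-patch variant the summary attributes to iBOT++).
    Temperatures are assumed values, not taken from the paper.
    """
    teacher_probs = np.exp(log_softmax(teacher_logits / teacher_temp))
    student_logp = log_softmax(student_logits / student_temp)
    per_patch = -(teacher_probs * student_logp).sum(axis=-1)  # (num_patches,)
    if all_patches:
        return per_patch.mean()
    return per_patch[mask].mean()
```

The only difference between the two variants in this sketch is which patches enter the average. With `all_patches=True`, the visible patches also contribute to the loss, so the student is supervised at every patch position.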

[Paper Review] The Collapse of Patches

A detailed review of the paper 'The Collapse of Patches', posted to arXiv by Weidong Cai.

#Review #Patch Collapse #Image Generation #Image Classification #Masked Image Modeling #Vision Transformers #PageRank #Uncertainty Reduction #Computational Efficiency

November 30, 2025

[Paper Review] Universal Image Restoration Pre-training via Masked Degradation Classification

A detailed review of the paper 'Universal Image Restoration Pre-training via Masked Degradation Classification', posted to arXiv.

#Review #Universal Image Restoration #Pre-training #Masked Image Modeling #Degradation Classification #Deep Learning #Computer Vision #Self-supervised Learning #Low-level Vision

October 16, 2025

[Paper Review] Understand Before You Generate: Self-Guided Training for Autoregressive Image Generation

A detailed review of the paper 'Understand Before You Generate: Self-Guided Training for Autoregressive Image Generation', posted to arXiv by Xihui Liu.

#Review #Autoregressive Models #Image Generation #Self-Supervised Learning #Visual Understanding #Masked Image Modeling #Contrastive Learning #Next-Token Prediction #LlamaGen

September 19, 2025