#Image Understanding

11 posts

[Paper Review] Insight-V++: Towards Advanced Long-Chain Visual Reasoning with Multimodal Large Language Models

[Paper Review] UniCom: Unified Multimodal Modeling via Compressed Continuous Semantic Representations

[Paper Review] OpenVision 3: A Family of Unified Visual Encoder for Both Understanding and Generation

[Paper Review] UniX: Unifying Autoregression and Diffusion for Chest X-Ray Understanding and Generation

[Paper Review] OneThinker: All-in-one Reasoning Model for Image and Video

[Paper Review] Architecture Decoupling Is Not All You Need For Unified Multimodal Model

[Paper Review] Lavida-O: Elastic Large Masked Diffusion Models for Unified Multimodal Understanding and Generation

[Paper Review] SRUM: Fine-Grained Self-Rewarding for Unified Multimodal Models

[Paper Review] Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation

[Paper Review] Ming-UniVision: Joint Image Understanding and Generation with a Unified Continuous Tokenizer
