#Textual Embeddings

1 post

[Paper Review] Towards Mitigating Hallucinations in Large Vision-Language Models by Refining Textual Embeddings

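This paper aims to address hallucinations in large vision-language models (LVLMs), which arise when a model makes insufficient use of visual information and relies too heavily on textual priors. By doing so, it seeks to strengthen the model's visual grounding and promote more balanced multimodal reasoning.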

#Review #Hallucination Mitigation #Large Vision-Language Models #Textual Embeddings #Multimodal Reasoning #Attention Mechanism #Visual Grounding #Modality Imbalance

November 9, 2025