#Cross-Modal Alignment

4 posts

[Paper Review] AnalogRetriever: Learning Cross-Modal Representations for Analog Circuit Retrieval

This paper proposes AnalogRetriever to address the difficulty of retrieving across the heterogeneous representations (Netlist, Schematic, Description) that arise in analog circuit design.

#Review #Analog Circuit Retrieval #Cross-Modal Alignment #SPICE Netlists #Relational Graph Convolutional Network (RGCN) #Retrieval-Augmented Generation (RAG) #Curriculum Contrastive Learning

May 3, 2026

[Paper Review] Mario: Multimodal Graph Reasoning with Large Language Models

This work aims to address two main challenges that arise when large language models (LLMs) reason over multimodal graphs (MMGs): cross-modal inconsistency and heterogeneous modality preference.

#Review #Multimodal Graph #Large Language Models #Graph Reasoning #Cross-Modal Alignment #Modality Adaptation #Instruction Tuning #Vision-Language Model #Node Classification

March 8, 2026

[Paper Review] OmniAgent: Audio-Guided Active Perception Agent for Omnimodal Audio-Video Understanding

This work aims to address the limitations of existing omnimodal large language models (OmniLLMs) in fine-grained cross-modal understanding and multimodal alignment.

#Review #Omnimodal Understanding #Audio-Guided Perception #Active Learning Agents #Cross-Modal Alignment #Tool-Use #Video Understanding #Multimodal LLMs

December 29, 2025

[Paper Review] Symbolic Graphics Programming with Large Language Models

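This paper explores the ability of large language models (LLMs) to generate symbolic graphics programs (SGPs), specifically Scalable Vector Graphics (SVGs), that render accurate visual content from natural-language descriptions.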

#Review #Symbolic Graphics Programming #Large Language Models #Reinforcement Learning #SVG Generation #Text-to-Image Synthesis #Cross-Modal Alignment #Program Synthesis

September 8, 2025