[Paper Review] MMFace-DiT: A Dual-Stream Diffusion Transformer for High-Fidelity Multimodal Face Generation
A detailed review of the paper 'MMFace-DiT: A Dual-Stream Diffusion Transformer for High-Fidelity Multimodal Face Generation', posted to arXiv by Ajita Rattani.
#Review #Diffusion Transformer #Multimodal Face Generation #Cross-Modal Fusion #RoPE Attention #Controlled Generation
March 31, 2026

[Paper Review] MPJudge: Towards Perceptual Assessment of Music-Induced Paintings
A detailed review of the paper 'MPJudge: Towards Perceptual Assessment of Music-Induced Paintings', posted to arXiv.
#Review #Music-Painting Cross-Modal #Perceptual Assessment #Modality-Adaptive Normalization #Direct Preference Optimization #Cross-Modal Fusion #Dataset Annotation #Affective Computing
November 10, 2025

[Paper Review] IGGT: Instance-Grounded Geometry Transformer for Semantic 3D Reconstruction
A detailed review of the paper 'IGGT: Instance-Grounded Geometry Transformer for Semantic 3D Reconstruction', posted to arXiv by Fangzhou Hong.
#Review #Semantic 3D Reconstruction #Instance Grounding #Geometry Transformer #Multi-view Consistency #Scene Understanding #InsScene-15K #Vision-Language Models #Cross-Modal Fusion
October 28, 2025