#Neural Networks

8 posts

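[Paper Review] Compositional Generalization Requires Linear, Orthogonal Representations in Vision Embedding Models

This paper aims to identify which essential representational properties modern vision embedding models must have in order to generalize compositionally to concept combinations not seen during training.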

#Review #Compositional Generalization #Vision-Language Models #Linear Representations #Orthogonal Representations #Neural Networks #Embedding Geometry #CLIP

March 1, 2026

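[Paper Review] Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance

This work aims to address the low control precision, limited scalability, and impractical output quality of existing motion-controllable video generation models.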

#Review #Video Generation #Motion Control #Latent Trajectory Guidance #Image-to-Video #Diffusion Models #Neural Networks #MoveBench

December 9, 2025

[Paper Review] Learning Eigenstructures of Unstructured Data Manifolds

This paper proposes a new framework that learns a spectral basis directly from unstructured data, without operator selection, discretization, or an eigensolver.

#Review #Spectral Basis Learning #Unstructured Data #Manifold Learning #Laplacian Operator #Optimal Approximation Theory #Neural Networks #Eigenstructure #Point Cloud Processing

December 1, 2025

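[Paper Review] MIST: Mutual Information Via Supervised Training

This paper aims to address the performance degradation that existing mutual information (MI) estimators suffer in settings with high dimensionality, limited samples, complex distributions, and high MI.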

#Review #Mutual Information Estimation #Supervised Learning #Meta-Learning #Neural Networks #Uncertainty Quantification #SetTransformer #Quantile Regression

November 24, 2025

[Paper Review] ViSTA-SLAM: Visual SLAM with Symmetric Two-view Association

This work aims to address the main limitations of existing monocular dense SLAM systems: the need for camera intrinsics, high computational complexity, and drift accumulation over long sequences.

#Review #Monocular SLAM #Dense Reconstruction #Neural Networks #Pose Graph Optimization #Intrinsics-free #Real-time #Two-view Association

September 3, 2025

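[Paper Review] Learnable SMPLify: A Neural Solution for Optimization-Free Human Pose Inverse Kinematics

This paper aims to replace the iterative optimization process of SMPLify, which is widely used in 3D human pose and shape estimation but computationally expensive, with a data-driven neural network, solving the inverse kinematics (IK) problem quickly and without optimization.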

#Review #Inverse Kinematics #Human Pose Estimation #SMPL Model #Neural Networks #Optimization-Free #Residual Learning #Data-Driven

August 25, 2025

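[Paper Review] Trace Anything: Representing Any Video in 4D via Trajectory Fields

This paper addresses the problem of finding an effective spatio-temporal representation, which is essential for modeling and understanding dynamic scenes in video.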

#Review #4D Video Representation #Trajectory Fields #Neural Networks #Spatio-temporal Modeling #3D Point Tracking #Motion Forecasting #Computer Vision #B-splines

October 16, 2025

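[Paper Review] DaMo: Data Mixing Optimizer in Fine-tuning Multimodal LLMs for Mobile Phone Agents

This paper addresses the problem of finding the optimal data mixing strategy for multi-task supervised fine-tuning (SFT) of Multimodal Large Language Models (MLLMs) to maximize performance. In particular, it aims to improve the efficiency of MLLMs that handle the varied functions of a mobile phone agent (MPA) simultaneously.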

#Review #Multimodal LLMs #Fine-tuning #Data Mixing Optimization #Mobile Phone Agents #Downstream Task Prediction #Benchmark #Neural Networks

October 23, 2025