#Preference Optimization

23 posts

[Paper Review] WavAlign: Enhancing Intelligence and Expressiveness in Spoken Dialogue Models via Adaptive Hybrid Post-Training

[Paper Review] DLLM-Searcher: Adapting Diffusion Large Language Model for Search Agents

[Paper Review] YaPO: Learnable Sparse Activation Steering Vectors for Domain Adaptation

[Paper Review] Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation

[Paper Review] DZ-TDPO: Non-Destructive Temporal Alignment for Mutable State Tracking in Long-Context Dialogue

[Paper Review] From Proof to Program: Characterizing Tool-Induced Reasoning Hallucinations in Large Language Models

[Paper Review] Value Drifts: Tracing Value Alignment During LLM Post-Training

[Paper Review] OmniInsert: Mask-Free Video Insertion of Any Reference via Diffusion Transformer Models

[Paper Review] DuPO: Enabling Reliable LLM Self-Verification via Dual Preference Optimization

[Paper Review] FantasyTalking2: Timestep-Layer Adaptive Preference Optimization for Audio-Driven Portrait Animation

[Paper Review] TARS: MinMax Token-Adaptive Preference Strategy for Hallucination Reduction in MLLMs

[Paper Review] MUG-V 10B: High-efficiency Training Pipeline for Large Video Generation Models
