#Supervised Fine-Tuning (SFT)

23 posts

[Paper Review] Unlocking Data Value in Finance: A Study on Distillation and Difficulty-Aware Training

[Paper Review] Controllable Memory Usage: Balancing Anchoring and Innovation in Long-Term Human-Agent Interaction

[Paper Review] Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting

[Paper Review] Falcon-H1R: Pushing the Reasoning Frontiers with a Hybrid Model for Efficient Test-Time Scaling

[Paper Review] Robust-R1: Degradation-Aware Reasoning for Robust Visual Understanding

[Paper Review] Toward Ambulatory Vision: Learning Visually-Grounded Active View Selection

[Paper Review] Revisiting the Necessity of Lengthy Chain-of-Thought in Vision-centric Reasoning Generalization

[Paper Review] Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

[Paper Review] Value Drifts: Tracing Value Alignment During LLM Post-Training

[Paper Review] ScaleDiff: Scaling Difficult Problems for Advanced Mathematical Reasoning

[Paper Review] Logics-Parsing Technical Report

[Paper Review] Improving Context Fidelity via Native Retrieval-Augmented Reasoning

[Paper Review] Towards a Unified View of Large Language Model Post-Training

[Paper Review] Are Today's LLMs Ready to Explain Well-Being Concepts?

[Paper Review] Apriel-1.5-15b-Thinker

[Paper Review] UI-Ins: Enhancing GUI Grounding with Multi-Perspective Instruction-as-Reasoning

[Paper Review] Distractor Injection Attacks on Large Reasoning Models: Characterization and Defense

[Paper Review] Thinking Sparks!: Emergent Attention Heads in Reasoning Models During Post Training
