#Reasoning

86 posts

[Paper Review] ETCHR: Editing To Clarify and Harness Reasoning

[Paper Review] Darwin Family: MRI-Trust-Weighted Evolutionary Merging for Training-Free Scaling of Language-Model Reasoning

[Paper Review] Rethinking RL for LLM Reasoning: It's Sparse Policy Selection, Not Capability Learning

[Paper Review] Act Wisely: Cultivating Meta-Cognitive Tool Use in Agentic Multimodal Models

[Paper Review] ThinkTwice: Jointly Optimizing Large Language Models for Reasoning and Self-Refinement

[Paper Review] TriAttention: Efficient Long Reasoning with Trigonometric KV Compression

[Paper Review] Think Anywhere in Code Generation

[Paper Review] Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs

[Paper Review] DSDR: Dual-Scale Diversity Regularization for Exploration in LLM Reasoning

[Paper Review] Nanbeige4.1-3B: A Small General Model that Reasons, Aligns, and Acts

[Paper Review] dVoting: Fast Voting for dLLMs

[Paper Review] The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models

[Paper Review] Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge

[Paper Review] ThinkRL-Edit: Thinking in Reinforcement Learning for Reasoning-Centric Image Editing

[Paper Review] Falcon-H1R: Pushing the Reasoning Frontiers with a Hybrid Model for Efficient Test-Time Scaling

[Paper Review] ReViSE: Towards Reason-Informed Video Editing in Unified Models with Self-Reflective Learning

[Paper Review] SkillFactory: Self-Distillation For Learning Cognitive Behaviors

[Paper Review] PretrainZero: Reinforcement Active Pretraining

[Paper Review] Xmodel-2.5: 1.3B Data-Efficient Reasoning SLM

[Paper Review] MarsRL: Advancing Multi-Agent Reasoning System via Reinforcement Learning with Agentic Pipeline Parallelism

[Paper Review] Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

[Paper Review] VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation

[Paper Review] $\left|\,\circlearrowright\,\text{BUS}\,\right|$: A Large and Diverse Multimodal Benchmark for evaluating the ability of Vision-Language Models to understand Rebus Puzzles

[Paper Review] UME-R1: Exploring Reasoning-Driven Generative Multimodal Embeddings

[Paper Review] Variational Reasoning for Language Models

[Paper Review] R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning

[Paper Review] TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling

[Paper Review] InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

[Paper Review] On-Policy RL Meets Off-Policy Experts: Harmonizing Supervised Fine-Tuning and Reinforcement Learning via Dynamic Weighting

[Paper Review] MMAU-Pro: A Challenging and Comprehensive Benchmark for Holistic Evaluation of Audio General Intelligence

[Paper Review] HumanSense: From Multimodal Perception to Empathetic Context-Aware Responses through Reasoning MLLMs

[Paper Review] AMFT: Aligning LLM Reasoners by Meta-Learning the Optimal Imitation-Exploration Balance

[Paper Review] GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

[Paper Review] InfiAlign: A Scalable and Sample-Efficient Framework for Aligning LLMs to Enhance Reasoning Capabilities

[Paper Review] 3D-R1: Enhancing Reasoning in 3D VLMs for Unified Scene Understanding

[Paper Review] FAPO: Flawed-Aware Policy Optimization for Efficient and Reliable Reasoning

[Paper Review] PhysToolBench: Benchmarking Physical Tool Understanding for MLLMs

[Paper Review] MRMR: A Realistic and Expert-Level Multidisciplinary Benchmark for Reasoning-Intensive Multimodal Retrieval

[Paper Review] Dyna-Mind: Learning to Simulate from Experience for Better AI Agents

[Paper Review] First Try Matters: Revisiting the Role of Reflection in Reasoning Models

[Paper Review] VChain: Chain-of-Visual-Thought for Reasoning in Video Generation

[Paper Review] PRISMM-Bench: A Benchmark of Peer-Review Grounded Multimodal Inconsistencies

[Paper Review] Voice Evaluation of Reasoning Ability: Diagnosing the Modality-Induced Performance Gap

[Paper Review] More Thought, Less Accuracy? On the Dual Nature of Reasoning in Vision-Language Models
