본문으로 건너뛰기

#Review

3409개의 포스트

[논문리뷰] Watch Before You Answer: Learning from Visually Grounded Post-Training

댓글 수 로딩 중

[논문리뷰] Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding

댓글 수 로딩 중

[논문리뷰] Vanast: Virtual Try-On with Human Image Animation via Synthetic Triplet Supervision

댓글 수 로딩 중

[논문리뷰] ThinkTwice: Jointly Optimizing Large Language Models for Reasoning and Self-Refinement

댓글 수 로딩 중

[논문리뷰] Paper Circle: An Open-source Multi-agent Research Discovery and Analysis Framework

댓글 수 로딩 중

[논문리뷰] MegaTrain: Full Precision Training of 100B+ Parameter Large Language Models on a Single GPU

댓글 수 로딩 중

[논문리뷰] MedGemma 1.5 Technical Report

댓글 수 로딩 중

[논문리뷰] Experience Transfer for Multimodal LLM Agents in Minecraft Game

댓글 수 로딩 중

[논문리뷰] DARE: Diffusion Large Language Models Alignment and Reinforcement Executor

댓글 수 로딩 중

[논문리뷰] Can Natural Image Autoencoders Compactly Tokenize fMRI Volumes for Long-Range Dynamics Modeling?

댓글 수 로딩 중

[논문리뷰] Beyond Accuracy: Unveiling Inefficiency Patterns in Tool-Integrated Reasoning

댓글 수 로딩 중

[논문리뷰] Action Images: End-to-End Policy Learning via Multiview Video Generation

댓글 수 로딩 중

[논문리뷰] Vero: An Open RL Recipe for General Visual Reasoning

댓글 수 로딩 중

[논문리뷰] Unifying Group-Relative and Self-Distillation Policy Optimization via Sample Routing

댓글 수 로딩 중

[논문리뷰] The Geometric Alignment Tax: Tokenization vs. Continuous Geometry in Scientific Foundation Models

댓글 수 로딩 중

[논문리뷰] SpatialEdit: Benchmarking Fine-Grained Image Spatial Editing

댓글 수 로딩 중

[논문리뷰] SkillX: Automatically Constructing Skill Knowledge Bases for Agents

댓글 수 로딩 중

[논문리뷰] MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale

댓글 수 로딩 중

[논문리뷰] LightThinker++: From Reasoning Compression to Memory Management

댓글 수 로딩 중

[논문리뷰] LIBERO-Para: A Diagnostic Benchmark and Metrics for Paraphrase Robustness in VLA Models

댓글 수 로딩 중

[논문리뷰] ClawArena: Benchmarking AI Agents in Evolving Information Environments

댓글 수 로딩 중

[논문리뷰] Xpertbench: Expert Level Tasks with Rubrics-Based Evaluation

댓글 수 로딩 중

[논문리뷰] Token Warping Helps MLLMs Look from Nearby Viewpoints

댓글 수 로딩 중

[논문리뷰] AgentSocialBench: Evaluating Privacy Risks in Human-Centered Agentic Social Networks

댓글 수 로딩 중

[논문리뷰] AgentHazard: A Benchmark for Evaluating Harmful Behavior in Computer-Use Agents

댓글 수 로딩 중

[논문리뷰] A Simple Baseline for Streaming Video Understanding

댓글 수 로딩 중

[논문리뷰] VOID: Video Object and Interaction Deletion

댓글 수 로딩 중

[논문리뷰] Tex3D: Objects as Attack Surfaces via Adversarial 3D Textures for Vision-Language-Action Models

댓글 수 로딩 중

[논문리뷰] T5Gemma-TTS Technical Report

댓글 수 로딩 중

[논문리뷰] Steerable Visual Representations

댓글 수 로딩 중

[논문리뷰] SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization

댓글 수 로딩 중

[논문리뷰] Omni123: Exploring 3D Native Foundation Models with Limited 3D Data by Unifying Text to 2D and 3D Generation

댓글 수 로딩 중

[논문리뷰] LatentUM: Unleashing the Potential of Interleaved Cross-Modal Reasoning via a Latent-Space Unified Model

댓글 수 로딩 중

[논문리뷰] Gated Condition Injection without Multimodal Attention: Towards Controllable Linear-Attention Transformers

댓글 수 로딩 중

[논문리뷰] Friends and Grandmothers in Silico: Localizing Entity Cells in Language Models

댓글 수 로딩 중

[논문리뷰] DataFlex: A Unified Framework for Data-Centric Dynamic Training of Large Language Models

댓글 수 로딩 중

[논문리뷰] Automatic Image-Level Morphological Trait Annotation for Organismal Images

댓글 수 로딩 중

[논문리뷰] AIBench: Evaluating Visual-Logical Consistency in Academic Illustration Generation

댓글 수 로딩 중

[논문리뷰] ViGoR-Bench: How Far Are Visual Generative Models From Zero-Shot Visual Reasoners?

댓글 수 로딩 중

[논문리뷰] Think, Act, Build: An Agentic Framework with Vision Language Models for Zero-Shot 3D Visual Grounding

댓글 수 로딩 중

[논문리뷰] Terminal Agents Suffice for Enterprise Automation

댓글 수 로딩 중

[논문리뷰] Revision or Re-Solving? Decomposing Second-Pass Gains in Multi-LLM Pipelines

댓글 수 로딩 중

[논문리뷰] Reasoning Shift: How Context Silently Shortens LLM Reasoning

댓글 수 로딩 중

[논문리뷰] MemRerank: Preference Memory for Personalized Product Reranking

댓글 수 로딩 중

[논문리뷰] HippoCamp: Benchmarking Contextual Agents on Personal Computers

댓글 수 로딩 중

[논문리뷰] AI Generalisation Gap In Comorbid Sleep Disorder Staging

댓글 수 로딩 중

[논문리뷰] VISion On Request: Enhanced VLLM efficiency with sparse, dynamically selected, vision-language interactions

댓글 수 로딩 중

[논문리뷰] Attend Before Attention: Efficient and Scalable Video Understanding via Autoregressive Gazing

댓글 수 로딩 중

[논문리뷰] VideoDetective: Clue Hunting via both Extrinsic Query and Intrinsic Relevance for Long Video Understanding

댓글 수 로딩 중

[논문리뷰] RoboAlign: Learning Test-Time Reasoning for Language-Action Alignment in Vision-Language-Action Models

댓글 수 로딩 중

[논문리뷰] LongCat-Flash-Prover: Advancing Native Formal Reasoning via Agentic Tool-Integrated Reinforcement Learning

댓글 수 로딩 중

[논문리뷰] Astrolabe: Steering Forward-Process Reinforcement Learning for Distilled Autoregressive Video Models

댓글 수 로딩 중

[논문리뷰] Think While Watching: Online Streaming Segment-Level Memory for Multi-Turn Video Reasoning in Multimodal Large Language Models

댓글 수 로딩 중

[논문리뷰] HomeSafe-Bench: Evaluating Vision-Language Models on Unsafe Action Detection for Embodied Agents in Household Scenarios

댓글 수 로딩 중

[논문리뷰] Cheers: Decoupling Patch Details from Semantic Representations Enables Unified Multimodal Comprehension and Generation

댓글 수 로딩 중

[논문리뷰] MiniAppBench: Evaluating the Shift from Text to Interactive HTML Responses in LLM-Powered Assistants

댓글 수 로딩 중

[논문리뷰] Skip to the Good Part: Representation Structure & Inference-Time Layer Skipping in Diffusion vs. Autoregressive LLMs

댓글 수 로딩 중

[논문리뷰] STMI: Segmentation-Guided Token Modulation with Cross-Modal Hypergraph Interaction for Multi-Modal Object Re-Identification

댓글 수 로딩 중

[논문리뷰] HiFi-Inpaint: Towards High-Fidelity Reference-Based Inpainting for Generating Detail-Preserving Human-Product Images

댓글 수 로딩 중

[논문리뷰] BeamPERL: Parameter-Efficient RL with Verifiable Rewards Specializes Compact LLMs for Structured Beam Mechanics Reasoning

댓글 수 로딩 중

[논문리뷰] Reasoning Core: A Scalable Procedural Data Generation Suite for Symbolic Pre-training and Post-Training

댓글 수 로딩 중

[논문리뷰] AI Gamestore: Scalable, Open-Ended Evaluation of Machine General Intelligence with Human Games

댓글 수 로딩 중

[논문리뷰] NoLan: Mitigating Object Hallucinations in Large Vision-Language Models via Dynamic Suppression of Language Priors

댓글 수 로딩 중

[논문리뷰] Model Context Protocol (MCP) Tool Descriptions Are Smelly! Towards Improving AI Agent Efficiency with Augmented MCP Tool Descriptions

댓글 수 로딩 중

[논문리뷰] FlowPrefill: Decoupling Preemption from Prefill Scheduling Granularity to Mitigate Head-of-Line Blocking in LLM Serving

댓글 수 로딩 중

[논문리뷰] Conv-FinRe: A Conversational and Longitudinal Benchmark for Utility-Grounded Financial Recommendation

댓글 수 로딩 중

[논문리뷰] ManCAR: Manifold-Constrained Latent Reasoning with Adaptive Test-Time Computation for Sequential Recommendation

댓글 수 로딩 중

[논문리뷰] Generated Reality: Human-centric World Simulation using Interactive Video Generation with Hand and Camera Control

댓글 수 로딩 중

[논문리뷰] EgoPush: Learning End-to-End Egocentric Multi-Object Rearrangement for Mobile Robots

댓글 수 로딩 중

[논문리뷰] InnoEval: On Research Idea Evaluation as a Knowledge-Grounded, Multi-Perspective Reasoning Problem

댓글 수 로딩 중

[논문리뷰] Acoustivision Pro: An Open-Source Interactive Platform for Room Impulse Response Analysis and Acoustic Characterization

댓글 수 로딩 중

[논문리뷰] Recurrent-Depth VLA: Implicit Test-Time Compute Scaling of Vision-Language-Action Models via Latent Iterative Reasoning

댓글 수 로딩 중

[논문리뷰] Judging What We Cannot Solve: A Consequence-Based Approach for Oracle-Free Evaluation of Research-Level Math

댓글 수 로딩 중

[논문리뷰] Back to Basics: Revisiting Exploration in Reinforcement Learning for LLM Reasoning via Generative Probabilities

댓글 수 로딩 중

[논문리뷰] Length-Unbiased Sequence Policy Optimization: Revealing and Controlling Response Length Variation in RLVR

댓글 수 로딩 중

[논문리뷰] WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning

댓글 수 로딩 중

[논문리뷰] EgoActor: Grounding Task Planning into Spatial-aware Egocentric Actions for Humanoid Robots via Visual-Language Models

댓글 수 로딩 중

[논문리뷰] BatCoder: Self-Supervised Bidirectional Code-Documentation Learning via Back-Translation

댓글 수 로딩 중

[논문리뷰] AdaptMMBench: Benchmarking Adaptive Multimodal Reasoning for Mode Selection and Reasoning Process

댓글 수 로딩 중

[논문리뷰] How Well Do Models Follow Visual Instructions? VIBE: A Systematic Benchmark for Visual Instruction-Driven Image Editing

댓글 수 로딩 중

[논문리뷰] Causal Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactive Video Generation

댓글 수 로딩 중

[논문리뷰] FourierSampler: Unlocking Non-Autoregressive Potential in Diffusion Language Models via Frequency-Guided Generation

댓글 수 로딩 중

[논문리뷰] Scalable Power Sampling: Unlocking Efficient, Training-Free Reasoning for LLMs via Distribution Sharpening

댓글 수 로딩 중

[논문리뷰] MAD: Modality-Adaptive Decoding for Mitigating Cross-Modal Hallucinations in Multimodal Large Language Models

댓글 수 로딩 중

[논문리뷰] SketchDynamics: Exploring Free-Form Sketches for Dynamic Intent Expression in Animation Generation

댓글 수 로딩 중

[논문리뷰] HalluCitation Matters: Revealing the Impact of Hallucinated References with 300 Hallucinated Papers in ACL Conferences

댓글 수 로딩 중

[논문리뷰] Less Is More -- Until It Breaks: Security Pitfalls of Vision Token Compression in Large Vision-Language Models

댓글 수 로딩 중

[논문리뷰] TwinBrainVLA: Unleashing the Potential of Generalist VLMs for Embodied Tasks via Asymmetric Mixture-of-Transformers

댓글 수 로딩 중

[논문리뷰] Inference-Time Scaling of Verification: Self-Evolving Deep Research Agents via Test-Time Rubric-Guided Verification

댓글 수 로딩 중

[논문리뷰] Facilitating Proactive and Reactive Guidance for Decision Making on the Web: A Design Probe with WebSeek

댓글 수 로딩 중

[논문리뷰] A Hybrid Protocol for Large-Scale Semantic Dataset Generation in Low-Resource Languages: The Turkish Semantic Relations Corpus

댓글 수 로딩 중

[논문리뷰] A Safety Report on GPT-5.2, Gemini 3 Pro, Qwen3-VL, Doubao 1.8, Grok 4.1 Fast, Nano Banana Pro, and Seedream 4.5

댓글 수 로딩 중

[논문리뷰] SkinFlow: Efficient Information Transmission for Open Dermatological Diagnosis via Dynamic Visual Encoding and Staged RL

댓글 수 로딩 중

[논문리뷰] OpenVoxel: Training-Free Grouping and Captioning Voxels for Open-Vocabulary 3D Scene Understanding

댓글 수 로딩 중

[논문리뷰] Are LLMs Vulnerable to Preference-Undermining Attacks (PUA)? A Factorial Analysis Methodology for Diagnosing the Trade-off between Preference Alignment and Real-World Validity

댓글 수 로딩 중

[논문리뷰] Controllable Memory Usage: Balancing Anchoring and Innovation in Long-Term Human-Agent Interaction

댓글 수 로딩 중

[논문리뷰] Qwen3-VL-Embedding and Qwen3-VL-Reranker: A Unified Framework for State-of-the-Art Multimodal Retrieval and Ranking

댓글 수 로딩 중

[논문리뷰] RL-AWB: Deep Reinforcement Learning for Auto White Balance Correction in Low-Light Night-time Scenes

댓글 수 로딩 중

[논문리뷰] EpiQAL: Benchmarking Large Language Models in Epidemiological Question Answering for Enhanced Alignment and Reasoning

댓글 수 로딩 중

[논문리뷰] X-MuTeST: A Multilingual Benchmark for Explainable Hate Speech Detection and A Novel LLM-consulted Explanation Framework

댓글 수 로딩 중

[논문리뷰] CogFlow: Bridging Perception and Reasoning through Knowledge Internalization for Visual Mathematical Problem Solving

댓글 수 로딩 중

[논문리뷰] Talk2Move: Reinforcement Learning for Text-Instructed Object-Level Geometric Transformation in Scenes

댓글 수 로딩 중

[논문리뷰] Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem

댓글 수 로딩 중

[논문리뷰] Geometry-Aware Optimization for Respiratory Sound Classification: Enhancing Sensitivity with SAM-Optimized Audio Spectrogram Transformers

댓글 수 로딩 중

[논문리뷰] Fantastic Reasoning Behaviors and Where to Find Them: Unsupervised Discovery of the Reasoning Process

댓글 수 로딩 중

[논문리뷰] Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone

댓글 수 로딩 중

[논문리뷰] UniPercept: Towards Unified Perceptual-Level Image Understanding across Aesthetics, Quality, Structure, and Texture

댓글 수 로딩 중

[논문리뷰] Beyond Memorization: A Multi-Modal Ordinal Regression Benchmark to Expose Popularity Bias in Vision-Language Models

댓글 수 로딩 중

[논문리뷰] Simulstream: Open-Source Toolkit for Evaluation and Demonstration of Streaming Speech-to-Text Translation Systems

댓글 수 로딩 중

[논문리뷰] Multi-LLM Thematic Analysis with Dual Reliability Metrics: Combining Cohen's Kappa and Semantic Similarity for Qualitative Research Validation

댓글 수 로딩 중

[논문리뷰] Reasoning Palette: Modulating Reasoning via Latent Contextualization for Controllable Exploration for (V)LMs

댓글 수 로딩 중

[논문리뷰] MobileWorld: Benchmarking Autonomous Mobile Agents in Agent-User Interactive, and MCP-Augmented Environments

댓글 수 로딩 중

[논문리뷰] Can LLMs Estimate Student Struggles? Human-AI Difficulty Alignment with Proficiency Simulation for Item Difficulty Prediction

댓글 수 로딩 중

[논문리뷰] SWE-Bench++: A Framework for the Scalable Generation of Software Engineering Benchmarks from Open-Source Repositories

댓글 수 로딩 중

[논문리뷰] Both Semantics and Reconstruction Matter: Making Representation Encoders Ready for Text-to-Image Generation and Editing

댓글 수 로딩 중

[논문리뷰] ShowTable: Unlocking Creative Table Visualization with Collaborative Reflection and Refinement

댓글 수 로딩 중

[논문리뷰] H2R-Grounder: A Paired-Data-Free Paradigm for Translating Human Interaction Videos into Physically Grounded Robot Videos

댓글 수 로딩 중

[논문리뷰] InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models

댓글 수 로딩 중

[논문리뷰] IF-Bench: Benchmarking and Enhancing MLLMs for Infrared Images with Generative Visual Prompting

댓글 수 로딩 중

[논문리뷰] SUCCESS-GS: Survey of Compactness and Compression for Efficient Static and Dynamic Gaussian Splatting

댓글 수 로딩 중

[논문리뷰] VG-Refiner: Towards Tool-Refined Referring Grounded Reasoning via Agentic Reinforcement Learning

댓글 수 로딩 중

[논문리뷰] Decouple to Generalize: Context-First Self-Evolving Learning for Data-Scarce Vision-Language Reasoning

댓글 수 로딩 중

[논문리뷰] Beyond Token-level Supervision: Unlocking the Potential of Decoding-based Regression via Reinforcement Learning

댓글 수 로딩 중

[논문리뷰] From Imitation to Discrimination: Toward A Generalized Curriculum Advantage Mechanism Enhancing Cross-Domain Reasoning Tasks

댓글 수 로딩 중

[논문리뷰] SignRoundV2: Closing the Performance Gap in Extremely Low-Bit Post-Training Quantization for LLMs

댓글 수 로딩 중

[논문리뷰] Mitigating Catastrophic Forgetting in Target Language Adaptation of LLMs via Source-Shielded Updates

댓글 수 로딩 중

[논문리뷰] Script: Graph-Structured and Query-Conditioned Semantic Token Pruning for Multimodal Large Language Models

댓글 수 로딩 중

[논문리뷰] Infinity-RoPE: Action-Controllable Infinite Video Generation Emerges From Autoregressive Self-Rollout

댓글 수 로딩 중

[논문리뷰] UltraFlux: Data-Model Co-Design for High-quality Native 4K Text-to-Image Generation across Diverse Aspect Ratios

댓글 수 로딩 중

[논문리뷰] SyncMV4D: Synchronized Multi-view Joint Diffusion of Appearance and Motion for Hand-Object Interaction Synthesis

댓글 수 로딩 중

[논문리뷰] VLA-4D: Embedding 4D Awareness into Vision-Language-Action Models for SpatioTemporally Coherent Robotic Manipulation

댓글 수 로딩 중

[논문리뷰] Downscaling Intelligence: Exploring Perception and Reasoning Bottlenecks in Small Multimodal Models

댓글 수 로딩 중

[논문리뷰] REVISOR: Beyond Textual Reflection, Towards Multimodal Introspective Reasoning in Long-Form Video Understanding

댓글 수 로딩 중

[논문리뷰] AraLingBench A Human-Annotated Benchmark for Evaluating Arabic Linguistic Capabilities of Large Language Models

댓글 수 로딩 중

[논문리뷰] A Brain Wave Encodes a Thousand Tokens: Modeling Inter-Cortical Neural Interactions for Effective EEG-based Emotion Recognition

댓글 수 로딩 중

[논문리뷰] MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling

댓글 수 로딩 중

[논문리뷰] MicroVQA++: High-Quality Microscopy Reasoning Dataset with Weakly Supervised Graphs for Multimodal Large Language Model

댓글 수 로딩 중

[논문리뷰] CATS-V2V: A Real-World Vehicle-to-Vehicle Cooperative Perception Dataset with Complex Adverse Traffic Scenarios

댓글 수 로딩 중

[논문리뷰] MuSc-V2: Zero-Shot Multimodal Industrial Anomaly Classification and Segmentation with Mutual Scoring of Unlabeled Samples

댓글 수 로딩 중

[논문리뷰] MathSE: Improving Multimodal Mathematical Reasoning via Self-Evolving Iterative Reflection and Reward-Guided Fine-Tuning

댓글 수 로딩 중

[논문리뷰] Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

댓글 수 로딩 중

[논문리뷰] TimeSearch-R: Adaptive Temporal Search for Long-Form Video Understanding via Self-Verification Reinforcement Learning

댓글 수 로딩 중

[논문리뷰] Reasoning with Confidence: Efficient Verification of LLM Reasoning Steps via Uncertainty Heads

댓글 수 로딩 중

[논문리뷰] DRIVE: Data Curation Best Practices for Reinforcement Learning with Verifiable Reward in Competitive Code Generation

댓글 수 로딩 중

[논문리뷰] When Modalities Conflict: How Unimodal Reasoning Uncertainty Governs Preference Dynamics in MLLMs

댓글 수 로딩 중

[논문리뷰] ChartM^3: A Multi-Stage Code-Driven Pipeline for Constructing Multi-Dimensional and Multi-Step Visual Reasoning Data in Chart Comprehension

댓글 수 로딩 중

[논문리뷰] left|,circlearrowright,text{BUS},right|: A Large and Diverse Multimodal Benchmark for evaluating the ability of Vision-Language Models to understand Rebus Puzzles

댓글 수 로딩 중

[논문리뷰] How Far Are Surgeons from Surgical World Models? A Pilot Study on Zero-shot Surgical Video Generation with Expert Assessment

댓글 수 로딩 중

[논문리뷰] MisSynth: Improving MISSCI Logical Fallacies Classification with Synthetic Data

댓글 수 로딩 중

[논문리뷰] Mask-to-Height: A YOLOv11-Based Architecture for Joint Building Instance Segmentation and Height Classification from Satellite Imagery

댓글 수 로딩 중

[논문리뷰] Limits of Generalization in RLVR: Two Case Studies in Mathematical Reasoning

댓글 수 로딩 중

[논문리뷰] MedVLSynther: Synthesizing High-Quality Visual Question Answering from Medical Documents with Generator-Verifier LMMs

댓글 수 로딩 중

[논문리뷰] The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution

댓글 수 로딩 중

[논문리뷰] LightBagel: A Light-weighted, Double Fusion Framework for Unified Multimodal Understanding and Generation

댓글 수 로딩 중

[논문리뷰] ALICE-LRI: A General Method for Lossless Range Image Generation for Spinning LiDAR Sensors without Calibration Metadata

댓글 수 로딩 중

[논문리뷰] ComProScanner: A multi-agent based framework for composition-property structured data extraction from scientific literature

댓글 수 로딩 중

[논문리뷰] KORE: Enhancing Knowledge Injection for Large Multimodal Models via Knowledge-Oriented Augmentations and Constraints

댓글 수 로딩 중

[논문리뷰] BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping

댓글 수 로딩 중

[논문리뷰] AlphaOPT: Formulating Optimization Programs with Self-Improving LLM Experience Library

댓글 수 로딩 중

[논문리뷰] Unleashing Scientific Reasoning for Bio-experimental Protocol Generation via Structured Component-based Reward Mechanism

댓글 수 로딩 중

[논문리뷰] PokeeResearch: Effective Deep Research via Reinforcement Learning from AI Feedback and Robust Reasoning Scaffold

댓글 수 로딩 중

[논문리뷰] Balanced Multi-Task Attention for Satellite Image Classification: A Systematic Approach to Achieving 97.23% Accuracy on EuroSAT Without Pre-Training

댓글 수 로딩 중

[논문리뷰] Rewiring Experts on the Fly:Continuous Rerouting for Better Online Adaptation in Mixture-of-Expert models

댓글 수 로딩 중

[논문리뷰] ERGO: Entropy-guided Resetting for Generation Optimization in Multi-turn Language Models

댓글 수 로딩 중

[논문리뷰] Build Your Personalized Research Group: A Multiagent Framework for Continual and Interactive Science Automation

댓글 수 로딩 중

[논문리뷰] VLA^2: Empowering Vision-Language-Action Models with an Agentic Framework for Unseen Concept Manipulation

댓글 수 로딩 중

[논문리뷰] MoM: Mixtures of Scenario-Aware Document Memories for Retrieval-Augmented Generation Systems

댓글 수 로딩 중

[논문리뷰] Hierarchical Frequency Tagging Probe (HFTP): A Unified Approach to Investigate Syntactic Structure Representations in Large Language Models and the Human Brain

댓글 수 로딩 중

[논문리뷰] CVD-STORM: Cross-View Video Diffusion with Spatial-Temporal Reconstruction Model for Autonomous Driving

댓글 수 로딩 중

[논문리뷰] ReFIne: A Framework for Trustworthy Large Reasoning Models with Reliability, Faithfulness, and Interpretability

댓글 수 로딩 중

[논문리뷰] Hybrid-grained Feature Aggregation with Coarse-to-fine Language Guidance for Self-supervised Monocular Depth Estimation

댓글 수 로딩 중

[논문리뷰] Bridging Reasoning to Learning: Unmasking Illusions using Complexity Out of Distribution Generalization

댓글 수 로딩 중

[논문리뷰] A Goal Without a Plan Is Just a Wish: Efficient and Effective Global Planner Training for Long-Horizon Agent Tasks

댓글 수 로딩 중

[논문리뷰] Recycling Pretrained Checkpoints: Orthogonal Growth of Mixture-of-Experts for Efficient Large Language Model Pre-Training

댓글 수 로딩 중

[논문리뷰] MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization

댓글 수 로딩 중

[논문리뷰] Entropy Regularizing Activation: Boosting Continuous Control, Large Language Models, and Image Classification with Activation as Entropy Constraints

댓글 수 로딩 중

[논문리뷰] Patch-as-Decodable-Token: Towards Unified Multi-Modal Vision Tasks in MLLMs

댓글 수 로딩 중

[논문리뷰] D^3QE: Learning Discrete Distribution Discrepancy-aware Quantization Error for Autoregressive-Generated Image Detection

댓글 수 로딩 중

[논문리뷰] AlphaApollo: Orchestrating Foundation Models and Professional Tools into a Self-Evolving System for Deep Agentic Reasoning

댓글 수 로딩 중

[논문리뷰] Revisiting Modeling and Evaluation Approaches in Speech Emotion Recognition: Considering Subjectivity of Annotators and Ambiguity of Emotions

댓글 수 로딩 중

[논문리뷰] HalluGuard: Evidence-Grounded Small Reasoning Models to Mitigate Hallucinations in Retrieval-Augmented Generation

댓글 수 로딩 중

[논문리뷰] Distributional Semantics Tracing: A Framework for Explaining Hallucinations in Large Language Models

댓글 수 로딩 중

[논문리뷰] Benchmark It Yourself (BIY): Preparing a Dataset and Benchmarking AI Models for Scatterplot-Related Tasks

댓글 수 로딩 중

[논문리뷰] EvolProver: Advancing Automated Theorem Proving by Evolving Formalized Problems via Symmetry and Difficulty

댓글 수 로딩 중

[논문리뷰] AdvEvo-MARL: Shaping Internalized Safety through Adversarial Co-Evolution in Multi-Agent Reinforcement Learning

댓글 수 로딩 중

[논문리뷰] Training Vision-Language Process Reward Models for Test-Time Scaling in Multimodal Reasoning: Key Insights and Lessons Learned

댓글 수 로딩 중

[논문리뷰] DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

댓글 수 로딩 중

[논문리뷰] Beyond Log Likelihood: Probability-Based Objectives for Supervised Fine-Tuning across the Model Capability Continuum

댓글 수 로딩 중

[논문리뷰] Specialization after Generalization: Towards Understanding Test-Time Training in Foundation Models

댓글 수 로딩 중

[논문리뷰] A Cartography of Open Collaboration in Open Source AI: Mapping Practices, Motivations, and Governance in 14 Open Large Language Model Projects

댓글 수 로딩 중

[논문리뷰] Think-on-Graph 3.0: Efficient and Adaptive LLM Reasoning on Heterogeneous Graphs via Multi-Agent Dual-Evolving Context Retrieval

댓글 수 로딩 중

[논문리뷰] No Prompt Left Behind: Exploiting Zero-Variance Prompts in LLM Reinforcement Learning via Entropy-Guided Advantage Shaping

댓글 수 로딩 중

[논문리뷰] Mind-the-Glitch: Visual Correspondence for Detecting Inconsistencies in Subject-Driven Generation

댓글 수 로딩 중

[논문리뷰] Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning

댓글 수 로딩 중

[논문리뷰] CHURRO: Making History Readable with an Open-Weight Large Vision-Language Model for High-Accuracy, Low-Cost Historical Text Recognition

댓글 수 로딩 중

[논문리뷰] Recon-Act: A Self-Evolving Multi-Agent Browser-Use System via Web Reconnaissance, Tool Generation, and Task Execution

댓글 수 로딩 중

[논문리뷰] MI-Fuse: Label Fusion for Unsupervised Domain Adaptation with Closed-Source Large-Audio Language Model

댓글 수 로딩 중

[논문리뷰] Zero-Shot Multi-Spectral Learning: Reimagining a Generalist Multimodal Gemini 2.5 Model for Remote Sensing Applications

댓글 수 로딩 중

[논문리뷰] When Big Models Train Small Ones: Label-Free Model Parity Alignment for Efficient Visual Question Answering using Small VLMs

댓글 수 로딩 중

[논문리뷰] FlagEval Findings Report: A Preliminary Evaluation of Large Reasoning Models on Automatically Verifiable Textual and Visual Questions

댓글 수 로딩 중

[논문리뷰] DIWALI - Diversity and Inclusivity aWare cuLture specific Items for India: Dataset and Assessment of LLMs for Cultural Text Adaptation in Indian Context

댓글 수 로딩 중

[논문리뷰] Analyzing the Effects of Supervised Fine-Tuning on Model Knowledge from Token and Parameter Levels

댓글 수 로딩 중

[논문리뷰] Do You Hear What I Mean? Quantifying the Instruction-Perception Gap in Instruction-Guided Expressive Text-To-Speech Systems

댓글 수 로딩 중

[논문리뷰] Unleashing the Potential of Multimodal LLMs for Zero-Shot Spatio-Temporal Video Grounding

댓글 수 로딩 중

[논문리뷰] MARS2 2025 Challenge on Multimodal Reasoning: Datasets, Methods, Results, Discussion, and Outlook

댓글 수 로딩 중

[논문리뷰] Multiple Instance Learning Framework with Masked Hard Instance Mining for Gigapixel Histopathology Image Analysis

댓글 수 로딩 중

[논문리뷰] Multimodal Reasoning for Science: Technical Report and 1st Place Solution to the ICML 2025 SeePhys Challenge

댓글 수 로딩 중

[논문리뷰] LazyDrag: Enabling Stable Drag-Based Editing on Multi-Modal Diffusion Transformers via Explicit Correspondence

댓글 수 로딩 중

[논문리뷰] Dr.V: A Hierarchical Perception-Temporal-Cognition Framework to Diagnose Video Hallucination by Fine-grained Spatial-Temporal Grounding

댓글 수 로딩 중

[논문리뷰] The Choice of Divergence: A Neglected Key to Mitigating Diversity Collapse in Reinforcement Learning with Verifiable Reward

댓글 수 로딩 중

[논문리뷰] Kling-Avatar: Grounding Multimodal Instructions for Cascaded Long-Duration Avatar Animation Synthesis

댓글 수 로딩 중

[논문리뷰] AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning

댓글 수 로딩 중

[논문리뷰] MedVista3D: Vision-Language Modeling for Reducing Diagnostic Errors in 3D CT Disease Detection, Understanding and Reporting

댓글 수 로딩 중

[논문리뷰] LatticeWorld: A Multimodal Large Language Model-Empowered Framework for Interactive Complex World Generation

댓글 수 로딩 중

[논문리뷰] Towards More Diverse and Challenging Pre-training for Point Cloud Learning: Self-Supervised Cross Reconstruction with Decoupled Views

댓글 수 로딩 중

[논문리뷰] C-DiffDet+: Fusing Global Scene Context with Generative Denoising for High-Fidelity Object Detection

댓글 수 로딩 중

[논문리뷰] Attributes as Textual Genes: Leveraging LLMs as Genetic Algorithm Simulators for Conditional Synthetic Data Generation

댓글 수 로딩 중

[논문리뷰] AMBEDKAR-A Multi-level Bias Elimination through a Decoding Approach with Knowledge Augmentation for Robust Constitutional Alignment of Language Models

댓글 수 로딩 중

[논문리뷰] R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning

댓글 수 로딩 중

[논문리뷰] Persuasion Dynamics in LLMs: Investigating Robustness and Adaptability in Knowledge and Safety with DuET-PD

댓글 수 로딩 중

[논문리뷰] OnGoal: Tracking and Visualizing Conversational Goals in Multi-Turn Dialogue with Large Language Models

댓글 수 로딩 중

[논문리뷰] MotionFlux: Efficient Text-Guided Motion Generation through Rectified Flow Matching and Preference Alignment

댓글 수 로딩 중

[논문리뷰] MIDAS: Multimodal Interactive Digital-human Synthesis via Real-time Autoregressive Video Generation

댓글 수 로딩 중

[논문리뷰] CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning

댓글 수 로딩 중

[논문리뷰] TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling

댓글 수 로딩 중

[논문리뷰] Breaking the Exploration Bottleneck: Rubric-Scaffolded Reinforcement Learning for General LLM Reasoning

댓글 수 로딩 중

[논문리뷰] On-Policy RL Meets Off-Policy Experts: Harmonizing Supervised Fine-Tuning and Reinforcement Learning via Dynamic Weighting

댓글 수 로딩 중

[논문리뷰] CorrSteer: Steering Improves Task Performance and Safety in LLMs through Correlation-based Sparse Autoencoder Feature Selection

댓글 수 로딩 중

[논문리뷰] Copyright Protection for Large Language Models: A Survey of Methods, Challenges, and Trends

댓글 수 로딩 중

[논문리뷰] From Black Box to Transparency: Enhancing Automated Interpreting Assessment with Explainable AI in College Classrooms

댓글 수 로딩 중

[논문리뷰] When Explainability Meets Privacy: An Investigation at the Intersection of Post-hoc Explainability and Differential Privacy in the Context of Natural Language Processing

댓글 수 로딩 중

[논문리뷰] Can LLM-Generated Textual Explanations Enhance Model Classification Performance? An Empirical Study

댓글 수 로딩 중

[논문리뷰] WGAST: Weakly-Supervised Generative Network for Daily 10 m Land Surface Temperature Estimation via Spatio-Temporal Fusion

댓글 수 로딩 중

[논문리뷰] UNCAGE: Contrastive Attention Guidance for Masked Generative Transformers in Text-to-Image Generation

댓글 수 로딩 중

[논문리뷰] Bridging Theory and Practice in Quantum Game Theory: Optimized Implementation of the Battle of the Sexes with Error Mitigation on NISQ Hardware

댓글 수 로딩 중

[논문리뷰] VisR-Bench: An Empirical Study on Visual Retrieval-Augmented Generation for Multilingual Long Document Understanding

댓글 수 로딩 중

[논문리뷰] A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems

댓글 수 로딩 중

[논문리뷰] Visual Document Understanding and Question Answering: A Multi-Agent Collaboration Framework with Test-Time Scaling

댓글 수 로딩 중

[논문리뷰] InfiAlign: A Scalable and Sample-Efficient Framework for Aligning LLMs to Enhance Reasoning Capabilities

댓글 수 로딩 중

[논문리뷰] Can Large Multimodal Models Actively Recognize Faulty Inputs? A Systematic Evaluation Framework of Their Input Scrutiny Ability

댓글 수 로딩 중

[논문리뷰] RL-PLUS: Countering Capability Boundary Collapse of LLMs in Reinforcement Learning with Hybrid-policy Optimization

댓글 수 로딩 중

[논문리뷰] TRACEALIGN -- Tracing the Drift: Attributing Alignment Failures to Training-Time Belief Sources in LLMs

댓글 수 로딩 중

[논문리뷰] AlignGuard-LoRA: Alignment-Preserving Fine-Tuning via Fisher-Guided Decomposition and Riemannian-Geodesic Collision Regularization

댓글 수 로딩 중

[논문리뷰] Beyond Linear Bottlenecks: Spline-Based Knowledge Distillation for Culturally Diverse Art Style Classification

댓글 수 로딩 중