#Robustness

43 posts

[Paper Review] Measuring the Depth of LLM Unlearning via Activation Patching

[Paper Review] Recovering Policy-Induced Errors: Benchmarking and Trajectory Synthesis for Robust GUI Agents

[Paper Review] SpaceDG: Benchmarking Spatial Intelligence under Visual Degradation

[Paper Review] StableVLA: Towards Robust Vision-Language-Action Models without Extra Data

[Paper Review] Sparse Autoencoders as Plug-and-Play Firewalls for Adversarial Attack Detection in VLMs

[Paper Review] Code-Switching Information Retrieval: Benchmarks, Analysis, and the Limits of Current Retrievers

[Paper Review] RadAgent: A tool-using AI agent for stepwise interpretation of chest computed tomography

[Paper Review] VenusBench-Mobile: A Challenging and User-Centric Benchmark for Mobile GUI Agents with Capability Diagnostics

[Paper Review] CLIPO: Contrastive Learning in Policy Optimization Generalizes RLVR

[Paper Review] SAM 3D Body: Robust Full-Body Human Mesh Recovery

[Paper Review] VLA-JEPA: Enhancing Vision-Language-Action Model with Latent World Model

[Paper Review] Multi-Task GRPO: Reliable LLM Reasoning Across Tasks

[Paper Review] M-ErasureBench: A Comprehensive Multimodal Evaluation Benchmark for Concept Erasure in Diffusion Models

[Paper Review] TokSuite: Measuring the Impact of Tokenizer Choice on Language Model Behavior

[Paper Review] LLM Swiss Round: Aggregating Multi-Benchmark Performance via Competitive Swiss-System Dynamics

[Paper Review] Robust-R1: Degradation-Aware Reasoning for Robust Visual Understanding

[Paper Review] Hearing to Translate: The Effectiveness of Speech Modality Integration into LLMs

[Paper Review] Thinking with Programming Vision: Towards a Unified View for Thinking with Images

[Paper Review] DiG-Flow: Discrepancy-Guided Flow Matching for Robust VLA Models

[Paper Review] Towards Robust Mathematical Reasoning

[Paper Review] Inverse IFEval: Can LLMs Unlearn Stubborn Training Conventions to Follow Real Instructions?

[Paper Review] Persuasion Dynamics in LLMs: Investigating Robustness and Adaptability in Knowledge and Safety with DuET-PD

[Paper Review] Processing and acquisition traces in visual encoders: What does CLIP know about your camera?

[Paper Review] AWorld: Dynamic Multi-Agent System with Stable Maneuvering for Robust GAIA Problem Solving

[Paper Review] Attention Sinks in Diffusion Language Models

[Paper Review] Distractor Injection Attacks on Large Reasoning Models: Characterization and Defense

[Paper Review] MANI-Pure: Magnitude-Adaptive Noise Injection for Adversarial Purification
