#Red-Teaming

2 posts

[Paper Review] DecodingTrust-Agent Platform (DTap): A Controllable and Interactive Red-Teaming Platform for AI Agents

This paper addresses the lack of a standardized platform and benchmark for systematically evaluating the security threats facing AI agents that automate complex workflows.

#Review #AI Agents #Red-Teaming #Safety Evaluation #Agentic Systems #Security Risk Assessment

May 10, 2026

[Paper Review] T-MAP: Red-Teaming LLM Agents with Trajectory-aware Evolutionary Search

Prior LLM red-teaming research has focused mainly on eliciting harmful text outputs from models. That focus overlooks the new safety risks posed by LLM agents, which can perform multi-step tool execution through integration standards such as the Model Context Protocol (MCP).

#Review #LLM Agents #Red-Teaming #Vulnerability Discovery #Trajectory-aware Search #MAP-Elites #Tool Call Graph #Attack Realization Rate

March 25, 2026