[Paper Review] Claw-Eval: Toward Trustworthy Evaluation of Autonomous Agents
A detailed review of the paper 'Claw-Eval: Toward Trustworthy Evaluation of Autonomous Agents', posted on arXiv.
#Review #Autonomous Agents #Benchmark #Trajectory-aware Grading #Safety Evaluation #Robustness Testing #Multimodal Perception
April 7, 2026
[Paper Review] ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers
A detailed review of the paper 'ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers', posted on arXiv by Zejian Chen.
#Review #Autonomous Agents #OpenClaw #Security Framework #Watcher Architecture #Safety-Utility Tradeoff #Behavioral Scanning #Runtime Enforcement
April 1, 2026
[Paper Review] Frontier AI Risk Management Framework in Practice: A Risk Analysis Technical Report v1.5
A detailed review of the paper 'Frontier AI Risk Management Framework in Practice: A Risk Analysis Technical Report v1.5', posted on arXiv.
#Review #Frontier AI #AI Risk Management #Autonomous Agents #LLM Safety #Cybersecurity #Deception #Self-Replication #Mitigation Frameworks
February 19, 2026
[Paper Review] AgentLongBench: A Controllable Long Benchmark For Long-Contexts Agents via Environment Rollouts
A detailed review of the paper 'AgentLongBench: A Controllable Long Benchmark For Long-Contexts Agents via Environment Rollouts', posted on arXiv.
#Review #Long-Context LLMs #Autonomous Agents #Benchmark #Environment Rollouts #State Tracking #Tool Use #Memory Evaluation #Lateral Thinking Puzzles
January 29, 2026
[Paper Review] Advances and Frontiers of LLM-based Issue Resolution in Software Engineering: A Comprehensive Survey
A detailed review of the paper 'Advances and Frontiers of LLM-based Issue Resolution in Software Engineering: A Comprehensive Survey', posted on arXiv.
#Review #LLM-based Issue Resolution #Software Engineering #Autonomous Agents #Code Generation #Benchmarking #Reinforcement Learning #Supervised Fine-tuning #Multimodal LLMs
January 20, 2026
[Paper Review] Beyond Static Tools: Test-Time Tool Evolution for Scientific Reasoning
A detailed review of the paper 'Beyond Static Tools: Test-Time Tool Evolution for Scientific Reasoning', posted on arXiv.
#Review #Test-Time Tool Evolution #Scientific Reasoning #Large Language Models #Dynamic Tool Synthesis #Tool Adaptation #AI for Science #Autonomous Agents
January 15, 2026
[Paper Review] User-Oriented Multi-Turn Dialogue Generation with Tool Use at scale
A detailed review of the paper 'User-Oriented Multi-Turn Dialogue Generation with Tool Use at scale', posted on arXiv.
#Review #Multi-Turn Dialogue Generation #Tool Use #Autonomous Agents #Large Reasoning Models #User Simulation #Synthetic Data Generation #SQL-based Tools #Agentic Benchmarks
January 13, 2026
[Paper Review] AI Meets Brain: Memory Systems from Cognitive Neuroscience to Autonomous Agents
A detailed review of the paper 'AI Meets Brain: Memory Systems from Cognitive Neuroscience to Autonomous Agents', posted on arXiv by Shixin Jiang.
#Review #Autonomous Agents #Memory Systems #Cognitive Neuroscience #Large Language Models (LLMs) #Retrieval-Augmented Generation (RAG) #Memory Management #Multimodal Memory #Agent Skills
December 31, 2025
[Paper Review] SimWorld: An Open-ended Realistic Simulator for Autonomous Agents in Physical and Social Worlds
A detailed review of the paper 'SimWorld: An Open-ended Realistic Simulator for Autonomous Agents in Physical and Social Worlds', posted on arXiv by Xuhong He.
#Review #Autonomous Agents #Realistic Simulator #Unreal Engine 5 #LLM/VLM Agents #Procedural Generation #Multi-Agent Systems #Physical Simulation #Social Interaction
December 2, 2025
[Paper Review] DeepAgent: A General Reasoning Agent with Scalable Toolsets
A detailed review of the paper 'DeepAgent: A General Reasoning Agent with Scalable Toolsets', posted on arXiv by Jiajie Jin.
#Review #Autonomous Agents #Large Language Models #Tool Use #Reinforcement Learning #Memory Management #Tool Retrieval #Agentic Reasoning
October 27, 2025
[Paper Review] From Masks to Worlds: A Hitchhiker's Guide to World Models
A detailed review of the paper 'From Masks to Worlds: A Hitchhiker's Guide to World Models', posted on arXiv by Shufan Li.
#Review #World Models #Generative AI #Multimodal Learning #Masked Modeling #Interactive AI #Memory Systems #Autonomous Agents #AI Roadmap
October 24, 2025
[Paper Review] Foundation Models for Scientific Discovery: From Paradigm Enhancement to Paradigm Transition
A detailed review of the paper 'Foundation Models for Scientific Discovery: From Paradigm Enhancement to Paradigm Transition', posted on arXiv.
#Review #Foundation Models #Scientific Discovery #Paradigm Shift #Human-AI Collaboration #Autonomous Agents #Meta-Science #Experimental Design #Hypothesis Generation
October 20, 2025
[Paper Review] DeepTravel: An End-to-End Agentic Reinforcement Learning Framework for Autonomous Travel Planning Agents
A detailed review of the paper 'DeepTravel: An End-to-End Agentic Reinforcement Learning Framework for Autonomous Travel Planning Agents', posted on arXiv.
#Review #Agentic Reinforcement Learning #Travel Planning #Large Language Models #Sandbox Environment #Hierarchical Reward Modeling #Experience Replay #Autonomous Agents
October 9, 2025
[Paper Review] AInstein: Assessing the Feasibility of AI-Generated Approaches to Research Problems
A detailed review of the paper 'AInstein: Assessing the Feasibility of AI-Generated Approaches to Research Problems', posted on arXiv by Jose Dolz.
#Review #LLM #Scientific Problem Solving #AI Research #Iterative Refinement #Autonomous Agents #Generative AI #Evaluation Framework #Problem Extraction
October 8, 2025
[Paper Review] Training Long-Context, Multi-Turn Software Engineering Agents with Reinforcement Learning
A detailed review of the paper 'Training Long-Context, Multi-Turn Software Engineering Agents with Reinforcement Learning', posted on arXiv by Maksim Nekrashevich.
#Review #Reinforcement Learning #Large Language Models #Software Engineering #Multi-Turn Interaction #Long Context #DAPO #Autonomous Agents #SWE-BENCH
August 7, 2025