[Paper Review] GBQA: A Game Benchmark for Evaluating LLMs as Quality Assurance Engineers

A detailed review of the paper 'GBQA: A Game Benchmark for Evaluating LLMs as Quality Assurance Engineers', posted on arXiv by Zhiyang Chen.

#Review #Autonomous Bug Discovery #Large Language Models #Game Benchmark #Quality Assurance #Multi-agent System #Software Engineering

April 7, 2026