[Paper Review] A Safety Report on GPT-5.2, Gemini 3 Pro, Qwen3-VL, Doubao 1.8, Grok 4.1 Fast, Nano Banana Pro, and Seedream 4.5
A detailed review of the paper 'A Safety Report on GPT-5.2, Gemini 3 Pro, Qwen3-VL, Doubao 1.8, Grok 4.1 Fast, Nano Banana Pro, and Seedream 4.5', posted on arXiv by Yutao Wu.
#Review #AI Safety #Large Language Models #Multimodal LLMs #Benchmark Evaluation #Adversarial Robustness #Multilingual Evaluation #Regulatory Compliance #Image Generation Safety
January 15, 2026
[Paper Review] BhashaBench V1: A Comprehensive Benchmark for the Quadrant of Indic Domains
A detailed review of the paper 'BhashaBench V1: A Comprehensive Benchmark for the Quadrant of Indic Domains', posted on arXiv.
#Review #Large Language Models (LLMs) #Benchmark #Indic Languages #Multilingual Evaluation #Domain-Specific AI #India-centric Knowledge Systems #Zero-Shot Learning #Question Answering
October 30, 2025
[Paper Review] UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation
A detailed review of the paper 'UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation', posted on arXiv by Yujie Zhou.
#Review #Text-to-Image Generation #Semantic Evaluation #Benchmark #Multilingual Evaluation #Fine-grained Assessment #Large Language Models #Model Evaluation #Prompt Engineering
October 22, 2025