#LLM

180 posts

[Paper Review] Linear Ensembles Wash Away Watermarks: On the Fragility of Distributional Perturbations in LLMs

[Paper Review] OmniRetrieval: Unified Retrieval across Heterogeneous Knowledge Sources

[Paper Review] ScientistOne: Towards Human-Level Autonomous Research via Chain-of-Evidence

[Paper Review] Efficient and Scalable Provenance Tracking for LLM-Generated Code Snippets

[Paper Review] ETCHR: Editing To Clarify and Harness Reasoning

[Paper Review] F-GRPO: Factorized Group-Relative Policy Optimization for Unified Candidate Generation and Ranking

[Paper Review] DeonticBench: A Benchmark for Reasoning over Rules

[Paper Review] TriAttention: Efficient Long Reasoning with Trigonometric KV Compression

[Paper Review] Friends and Grandmothers in Silico: Localizing Entity Cells in Language Models

[Paper Review] CharacterFlywheel: Scaling Iterative Improvement of Engaging and Steerable LLMs in Production

[Paper Review] Vectorizing the Trie: Efficient Constrained Decoding for LLM-based Generative Retrieval on Accelerators

[Paper Review] veScale-FSDP: Flexible and High-Performance FSDP at Scale

[Paper Review] ARLArena: A Unified Framework for Stable Agentic Reinforcement Learning

[Paper Review] On Data Engineering for Scaling LLM Terminal Capabilities

[Paper Review] Sci-CoE: Co-evolving Scientific Reasoning LLMs via Geometric Consensus with Sparse Supervision

[Paper Review] ROCKET: Rapid Optimization via Calibration-guided Knapsack Enhanced Truncation for Efficient Model Compression

[Paper Review] F-GRPO: Don't Let Your Policy Learn the Obvious and Forget the Rare

[Paper Review] Semantic Routing: Exploring Multi-Layer LLM Feature Weighting for Diffusion Transformers

[Paper Review] Rethinking the Trust Region in LLM Reinforcement Learning

[Paper Review] SPARKLING: Balancing Signal Preservation and Symmetry Breaking for Width-Progressive Learning

[Paper Review] TAM-Eval: Evaluating LLMs for Automated Unit Test Maintenance

[Paper Review] A Hybrid Protocol for Large-Scale Semantic Dataset Generation in Low-Resource Languages: The Turkish Semantic Relations Corpus

[Paper Review] Distilling Feedback into Memory-as-a-Tool

[Paper Review] Web World Models

[Paper Review] Confucius Code Agent: An Open-sourced AI Software Engineer at Industrial Scale

[Paper Review] DeepCode: Open Agentic Coding

[Paper Review] Rubric-Based Benchmarking and Reinforcement Learning for Advancing LLM Instruction Following

[Paper Review] SofT-GRPO: Surpassing Discrete-Token LLM Reinforcement Learning via Gumbel-Reparameterized Soft-Thinking Policy Optimization

[Paper Review] Llama-Embed-Nemotron-8B: A Universal Text Embedding Model for Multilingual and Cross-Lingual Tasks

[Paper Review] RDMA Point-to-Point Communication for LLM Systems

[Paper Review] X-CoT: Explainable Text-to-Video Retrieval via LLM-based Chain-of-Thought Reasoning

[Paper Review] WebGen-Agent: Enhancing Interactive Website Generation with Multi-Level Feedback and Step-Level Reinforcement Learning

[Paper Review] EconProver: Towards More Economical Test-Time Scaling for Automated Theorem Proving

[Paper Review] HANRAG: Heuristic Accurate Noise-resistant Retrieval-Augmented Generation for Multi-hop Question Answering

[Paper Review] Saturation-Driven Dataset Generation for LLM Mathematical Reasoning in the TPTP Ecosystem

[Paper Review] Inverse-LLaVA: Eliminating Alignment Pre-training Through Text-to-Vision Mapping

[Paper Review] AWorld: Dynamic Multi-Agent System with Stable Maneuvering for Robust GAIA Problem Solving

[Paper Review] WideSearch: Benchmarking Agentic Broad Info-Seeking

[Paper Review] OmniEAR: Benchmarking Agent Reasoning in Embodied Tasks

[Paper Review] LLMs as Scalable, General-Purpose Simulators For Evolving Digital Agent Training

[Paper Review] Deflanderization for Game Dialogue: Balancing Character Authenticity with Task Execution in LLM-based NPCs

[Paper Review] AInstein: Assessing the Feasibility of AI-Generated Approaches to Research Problems

[Paper Review] Unleashing Scientific Reasoning for Bio-experimental Protocol Generation via Structured Component-based Reward Mechanism

[Paper Review] QueST: Incentivizing LLMs to Generate Difficult Problems

[Paper Review] Voice Evaluation of Reasoning Ability: Diagnosing the Modality-Induced Performance Gap

[Paper Review] Knowledge Homophily in Large Language Models