#Code Generation

86 posts

[Paper Review] From Runnable to Shippable: Multi-Agent Test-Driven Development for Generating Full-Stack Web Applications from Requirements

[Paper Review] Solvita: Enhancing Large Language Models for Competitive Programming via Agentic Evolution

[Paper Review] PlayCoder: Making LLM-Generated GUI Code Playable

[Paper Review] QuantCode-Bench: A Benchmark for Evaluating the Ability of Large Language Models to Generate Executable Algorithmic Trading Strategies

[Paper Review] Revision or Re-Solving? Decomposing Second-Pass Gains in Multi-LLM Pipelines

[Paper Review] Embarrassingly Simple Self-Distillation Improves Code Generation

[Paper Review] Think Anywhere in Code Generation

[Paper Review] Code-Space Response Oracles: Generating Interpretable Multi-Agent Policies with Large Language Models

[Paper Review] MiniAppBench: Evaluating the Shift from Text to Interactive HTML Responses in LLM-Powered Assistants

[Paper Review] MM-Zero: Self-Evolving Multi-Model Vision Language Models From Zero Data

[Paper Review] CoCo: Code as CoT for Text-to-Image Preview and Rare Concept Generation

[Paper Review] SWE-CI: Evaluating Agent Capabilities in Maintaining Codebases via Continuous Integration

[Paper Review] Qwen3-Coder-Next Technical Report

[Paper Review] LongCLI-Bench: A Preliminary Benchmark and Study for Long-horizon Agentic Programming in Command-Line Interfaces

[Paper Review] Nanbeige4.1-3B: A Small General Model that Reasons, Aligns, and Acts

[Paper Review] DICE: Diffusion Large Language Models Excel at Generating CUDA Kernels

[Paper Review] Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation

[Paper Review] Group-Evolving Agents: Open-Ended Self-Improvement via Experience Sharing

[Paper Review] Dr. Kernel: Reinforcement Learning Done Right for Triton Kernel Generations

[Paper Review] daVinci-Agency: Unlocking Long-Horizon Agency Data-Efficiently

[Paper Review] MARS: Modular Agent with Reflective Search for Automated AI Research

[Paper Review] TAM-Eval: Evaluating LLMs for Automated Unit Test Maintenance

[Paper Review] OCRVerse: Towards Holistic OCR in End-to-End Vision-Language Models

[Paper Review] Reinforcement Learning via Self-Distillation

[Paper Review] Advances and Frontiers of LLM-based Issue Resolution in Software Engineering: A Comprehensive Survey

[Paper Review] ABC-Bench: Benchmarking Agentic Backend Coding in Real-World Development

[Paper Review] Aligning Text, Code, and Vision: A Multi-Objective Reinforcement Learning Framework for Text-to-Visualization

[Paper Review] MDAgent2: Large Language Model for Code Generation and Knowledge Q&A in Molecular Dynamics

[Paper Review] UCoder: Unsupervised Code Generation by Internal Probing of Large Language Models

[Paper Review] SWE-Bench++: A Framework for the Scalable Generation of Software Engineering Benchmarks from Open-Source Repositories

[Paper Review] Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows

[Paper Review] DEER: Draft with Diffusion, Verify with Autoregressive Models

[Paper Review] DeepCode: Open Agentic Coding

[Paper Review] Thinking with Programming Vision: Towards a Unified View for Thinking with Images

[Paper Review] Live-SWE-agent: Can Software Engineering Agents Self-Evolve on the Fly?

[Paper Review] WebVIA: A Web-based Vision-Language Agentic Framework for Interactive and Verifiable UI-to-Code Generation

[Paper Review] Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

[Paper Review] DRIVE: Data Curation Best Practices for Reinforcement Learning with Verifiable Reward in Competitive Code Generation

[Paper Review] Jr. AI Scientist and Its Risk Report: Autonomous Scientific Exploration from a Baseline Paper

[Paper Review] VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation

[Paper Review] SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks?

[Paper Review] RPG: A Repository Planning Graph for Unified and Scalable Codebase Generation

[Paper Review] Universal Deep Research: Bring Your Own Model and Strategy

[Paper Review] VisCodex: Unified Multimodal Code Generation via Merging Vision and Coding Models

[Paper Review] Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization

[Paper Review] GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

[Paper Review] LaTCoder: Converting Webpage Design to Code with Layout-as-Thought

[Paper Review] Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference

[Paper Review] CellForge: Agentic Design of Virtual Cell Models

[Paper Review] JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence

[Paper Review] VisCoder2: Building Multi-Language Visualization Coding Agents

[Paper Review] BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

[Paper Review] FinSight: Towards Real-World Financial Deep Research

[Paper Review] Code2Video: A Code-centric Paradigm for Educational Video Generation
