[Paper Review] IQuest-Coder-V1 Technical Report

March 26, 2026 (updated: March 26, 2026)



Authors : Jian Yang, Wei Zhang, Shawn Guo, Zhengmao Ye, Lin Jing, Shark Liu, Yizhi Li, Jiajun Wu, Cening Liu, X. Ma, Yuyang Song, Siwei Wu, Yuwen Li, L. Liao, T. Zheng, Ziling Huang, Zelong Huang, Che Liu, Yan Xing, Renyuan Li, Qingsong Cai, Hanxu Yan, Siyue Wang, Shikai Li, Jason Klein Liu, An Huang, Yongsheng Kang, Jinxing Zhang, Chuan Hao, Haowen Wang, Weicheng Gu, Ran Tao, Mingjie Tang, Peihao Wu, Jianzhou Wang, Xianglong Liu, Weifeng Lv, Bryan Dai (as listed in Section 6, "Contributions and Acknowledgements").

Keywords :

IQuest-Coder-V1
Code Large Language Models (LLMs)
Code-Flow Multi-Stage Training Paradigm
Agentic Software Engineering
Competitive Programming
Complex Tool Use
Recurrent Mechanism (Loop variant)
Reinforcement Learning (RL)

Figures :

Figure 1 : "IQuest-Coder-V1 performance across different benchmarks." (2603.16733v1/x2.png)
Figure 2 : "Code-Flow Training pipeline of IQuest-Coder-V1." (2603.16733v1/x3.png)
Figure 3 : "IQuest-Coder-V1 performance across different benchmarks." (2603.16733v1/x4.png)
Table 3 : "Performance comparison on code generation tasks."



1. Key Terms & Definitions

IQuest-Coder-V1 : A series of code Large Language Models (LLMs) released in 7B, 14B, and 40B parameter sizes, plus a 40B-Loop variant.
Code-Flow Multi-stage Training Paradigm : A training paradigm that goes beyond static code representations and captures the dynamic evolution of software logic across the stages of the pipeline.
LoopCoder Architecture : A recurrent Transformer design that runs Transformer blocks with shared parameters for two fixed iterations, aiming to optimize the trade-off between model capacity and deployment footprint.
Fill-In-the-Middle (FIM) : A code-completion training technique in which the model predicts the middle content of a code document from its prefix and suffix.
Agentic Trajectories : Data that exposes the model to complete action-observation-revision cycles with environment feedback (commands, logs, errors, test results), helping it learn end-to-end task planning and error recovery.
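
To make the FIM entry above concrete, here is a minimal Python sketch that turns one code snippet into a prefix-suffix-middle training example. The sentinel strings and the function name `make_fim_example` are assumptions chosen for illustration; this summary does not state which special tokens or split strategy IQuest-Coder-V1 actually uses.

```python
import random

# Hypothetical sentinel tokens (an assumption, not taken from the report).
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"


def make_fim_example(code, rng):
    """Split `code` at two random cut points and return (prompt, target).

    The prompt shows the prefix and the suffix; the target is the middle
    span the model is trained to predict.
    """
    # Two distinct cut positions, sorted so that i < j.
    i, j = sorted(rng.sample(range(len(code) + 1), 2))
    prefix, middle, suffix = code[:i], code[i:j], code[j:]
    prompt = f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"
    return prompt, middle


if __name__ == "__main__":
    snippet = "def add(a, b):\n    return a + b\n"
    prompt, target = make_fim_example(snippet, random.Random(0))
    print(repr(prompt))
    print(repr(target))
```

Concatenating the prefix, the target, and the suffix always reproduces the original snippet, which is a quick sanity check for any FIM data pipeline.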

2. Motivation & Problem Statement

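Large Language Models (LLMs) have greatly advanced general intelligence through domain specialization, but in code intelligence a large gap remains between proprietary frontier models such as Claude 4.5 Sonnet and open-weight models. The gap is most pronounced in long-horizon reasoning and in navigating complex multi-file codebases. The limitation of prior work is that it stays at static code representations and does not adequately reflect the dynamic evolution of software logic. The IQuest-Coder-V1 team therefore proposes the Code-Flow Multi-stage Training Paradigm to close this gap, seeking to maximize intelligence density through a structured, multi-stage approach to logical evolution.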

3. Method & Key Results

IQuest-Coder-V1 is built on the Code-Flow pipeline [Figure 2], which consists of four core stages: Pre-training & High-Quality Annealing, Dual-Phase Mid-training, Bifurcated Post-training, and Efficient Architectures. Pre-training starts with a two-stage process that mixes general data with code data, followed by an annealing stage on a high-quality code corpus. Mid-training integrates reasoning and agentic-trajectory data at 32k context with repository-scale data at 128k context to build a deep logical foundation. Post-training splits into an Instruct Path for instruction tuning and a Thinking Path that uses reasoning-driven RL, yielding models optimized for specific use cases. In addition, the IQuest-Coder-V1-Loop variant introduces a recurrent mechanism to optimize the trade-off between model capacity and deployment footprint.
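
The recurrent mechanism of the Loop variant can be illustrated with a toy example. The sketch below is not the LoopCoder implementation: it replaces each Transformer block with a hypothetical residual layer `x + tanh(W x)` and only demonstrates the idea described in this review, namely that the same weights are run for a fixed number of iterations (two here), so the effective depth doubles while the parameter count stays the same. All function names are assumptions.

```python
import math
import random


def make_block(dim, seed=0):
    """Toy stand-in for one Transformer block: a single dim x dim weight matrix."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(dim)]


def apply_block(weights, x):
    """Residual update x + tanh(W x), computed element by element."""
    return [
        xi + math.tanh(sum(w * xj for w, xj in zip(row, x)))
        for xi, row in zip(x, weights)
    ]


def looped_forward(blocks, x, num_loops=2):
    """Run the whole stack of blocks `num_loops` times, reusing the same weights."""
    for _ in range(num_loops):
        for weights in blocks:
            x = apply_block(weights, x)
    return x


def count_params(blocks):
    """Number of stored weights; independent of how many loops are executed."""
    return sum(len(row) for weights in blocks for row in weights)


if __name__ == "__main__":
    blocks = [make_block(4, seed=i) for i in range(3)]
    x = [1.0, -0.5, 0.25, 0.0]
    print("stored parameters:", count_params(blocks))
    print("layer applications with 2 loops:", 2 * len(blocks))
    print("output:", looped_forward(blocks, x, num_loops=2))
```

With three blocks of size 4 x 4, the model stores 48 weights whether it loops once or twice; only the number of layer applications (3 versus 6) and the compute cost change. That is the capacity versus footprint trade-off in miniature.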

In the reported experiments, the IQuest-Coder-V1 series reaches state-of-the-art (SOTA) performance on a range of code-intelligence benchmarks. On CrossCodeEval, the IQuest-Coder-V1-40B model records an average of 57.8 EM and 85.7 ES across Python, Java, TypeScript, and C#, showing stronger cross-file code completion than competing models. On the code generation benchmark HumanEval+, the IQuest-Coder-V1-40B-Loop-Instruct model scores 91.5, and on SWE-Verified it scores 76.2, demonstrating its ability to solve real-world software engineering problems. On the code reasoning benchmark LiveCodeBench v6, the IQuest-Coder-V1-40B-Loop-Thinking model scores 81.1. Taken together, the results show strong performance across agentic software engineering, competitive programming, and complex tool use [Figure 1, Figure 3].
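
For readers unfamiliar with the two CrossCodeEval numbers, the sketch below assumes that EM stands for exact match and ES for edit similarity, which is the usual reading for code-completion benchmarks. The normalization used here (one minus Levenshtein distance divided by the longer string's length) is one common definition and an assumption; the benchmark's official scorer may differ in detail.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # deletion
                cur[j - 1] + 1,            # insertion
                prev[j - 1] + (ca != cb),  # substitution (free if equal)
            ))
        prev = cur
    return prev[-1]


def exact_match(pred, ref):
    """True when prediction and reference agree after trimming whitespace."""
    return pred.strip() == ref.strip()


def edit_similarity(pred, ref):
    """Similarity in [0, 1]: 1 minus normalized edit distance."""
    pred, ref = pred.strip(), ref.strip()
    longest = max(len(pred), len(ref))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(pred, ref) / longest


def score(pairs):
    """Return (EM, ES) as percentages averaged over (prediction, reference) pairs."""
    n = len(pairs)
    em = 100.0 * sum(exact_match(p, r) for p, r in pairs) / n
    es = 100.0 * sum(edit_similarity(p, r) for p, r in pairs) / n
    return em, es


if __name__ == "__main__":
    pairs = [("return a + b", "return a + b"), ("return a - b", "return a + b")]
    print(score(pairs))
```

ES gives partial credit for near-miss completions, which is why it is typically much higher than EM on the same predictions.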

4. Conclusion & Impact

This work introduces IQuest-Coder-V1, a family of code LLMs, and advances the SOTA in autonomous software engineering through the Code-Flow Pre-training Paradigm and multi-stage evolutionary training. By capturing dynamic repository transitions and integrating extensive reasoning trajectories with repository-scale context during mid-training, the models establish a solid logical foundation for complex code-intelligence tasks. The IQuest-Coder-V1-Loop variant in particular addresses practical deployment challenges by optimizing the capacity-efficiency trade-off through its recurrent architecture. By releasing the full training pipeline and model checkpoints, the work is expected to stimulate code-intelligence research and accelerate the development of production-ready agentic systems that can handle real-world software engineering tasks.

⚠️ Note: This review was written by AI.
