Latest Posts

[Paper Review] Innovator-VL: A Multimodal Large Language Model for Scientific Discovery

This paper presents Innovator-VL, a scientific multimodal large language model (MLLM) that advances multimodal understanding and reasoning across diverse scientific domains while maintaining strong performance on general vision tasks.

#Review #Multimodal LLM #Scientific AI #Data Efficiency #Reinforcement Learning #Vision-Language Model #Scientific Reasoning #Reproducible AI

January 28, 2026

[Paper Review] Harder Is Better: Boosting Mathematical Reasoning via Difficulty-Aware GRPO and Multi-Aspect Question Reformulation

This work aims to strengthen the mathematical reasoning of large language models (LLMs) by addressing a limitation of existing RLVR (Reinforcement Learning with Verifiable Rewards) methods: they do not adequately train on hard problems.

#Review #Reinforcement Learning #Mathematical Reasoning #Difficulty-Aware Optimization #Data Augmentation #Policy Optimization #LLMs #GRPO #MQR

January 28, 2026

[Paper Review] GDCNet: Generative Discrepancy Comparison Network for Multimodal Sarcasm Detection

This paper aims to overcome the limitations of existing methods in order to detect sarcasm effectively in image-text pairs.

#Review #Multimodal Sarcasm Detection #Large Language Models #Multimodal LLMs #Discrepancy Modeling #Image Captioning #Gated Fusion #Semantic Incongruity

January 28, 2026

[Paper Review] DeepSeek-OCR 2: Visual Causal Flow

This paper addresses the problem that existing Vision-Language Models (VLMs) process visual tokens in a fixed raster-scan order, which conflicts with the flexible way humans perceive visual scenes.

#Review #OCR #Vision-Language Model #Causal Reasoning #Transformer Architecture #Attention Mechanism #Document Understanding #DeepEncoder

January 28, 2026

[Paper Review] Advancing Open-source World Models

This paper aims to overcome the limitations of existing video generation models (data scarcity, lack of long-term consistency, difficulty of real-time interaction, and proprietary solutions) by developing LingBot-World, an open-source world model that learns the dynamics of virtual worlds and renders them in real time.

#Review #World Models #Open-source AI #Video Generation #Real-time Simulation #Long-term Memory #Action-Conditioned Learning #Generative Models #Embodied AI

January 28, 2026

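[uvloop] uvloop Performance Optimization: Improving Context Enter/Exit with the Python C API

An analysis of a change that calls the C API directly instead of Python's context.run(), reducing overhead and improving performance.

#uvloop #Python #Performance #Cython #C-API

January 28, 2026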
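
For readers unfamiliar with the Python-level call that the uvloop post refers to, the following minimal sketch shows what `Context.run()` does: it enters a context, runs a callable, and exits the context again. The variable `request_id` and the helper `make_context` are hypothetical names used only for illustration. CPython's C API exposes `PyContext_Enter` and `PyContext_Exit` for the same enter/exit steps; whether the analysed change uses exactly those functions is an assumption here, not something this listing states.

```python
import contextvars

# Hypothetical context variable, used only for this illustration.
request_id = contextvars.ContextVar("request_id", default="none")


def callback():
    # Reads the variable from whichever context is current when called.
    return request_id.get()


def make_context(value):
    # Copy the current context and set the variable inside the copy only.
    ctx = contextvars.copy_context()
    ctx.run(request_id.set, value)
    return ctx


ctx = make_context("req-42")

# Context.run() enters ctx, calls the function, then exits ctx.
result_inside = ctx.run(callback)

# Outside ctx, the variable still has its default value.
result_outside = request_id.get()
```

Each `ctx.run(...)` call goes through Python's normal call machinery, which is the kind of per-callback overhead an event loop might want to avoid by entering and exiting the context directly from C.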

[triton] Driver Support for NVIDIA TMA im2col Mode

An analysis of a PR that adds Python driver-level support for NVIDIA TMA's im2col mode. It covers the cuTensorMapEncodeIm2col API binding and the descriptor creation logic.

#Triton #NVIDIA #TMA #Im2col #Driver

January 28, 2026

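[Loki] Index Builder Size Estimation Optimization: 97% Performance Improvement by Eliminating Repeated Computation

An analysis of a PR to Grafana Loki's data object index builder that replaced a size calculation iterating over every tenant on each call with incremental tracking, achieving a 97% performance improvement.

#Grafana Loki #Performance #Go #Index Builder #Optimization

January 28, 2026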
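
The general idea behind the Loki change, replacing a full recomputation with a running total, can be sketched as follows. This is a Python illustration of the technique only, not Loki's Go code; the class names `RecomputingEstimator` and `IncrementalEstimator` and their methods are hypothetical.

```python
class RecomputingEstimator:
    """Walks every tenant on each size query (the slow pattern)."""

    def __init__(self):
        self.tenants = {}  # tenant name -> list of entry sizes in bytes

    def append(self, tenant, nbytes):
        self.tenants.setdefault(tenant, []).append(nbytes)

    def estimated_size(self):
        # Cost grows with the number of tenants and entries on every call.
        return sum(sum(sizes) for sizes in self.tenants.values())


class IncrementalEstimator:
    """Maintains a running total so a size query is O(1)."""

    def __init__(self):
        self.tenants = {}
        self._total = 0

    def append(self, tenant, nbytes):
        self.tenants.setdefault(tenant, []).append(nbytes)
        self._total += nbytes  # update the total at write time

    def estimated_size(self):
        return self._total


slow, fast = RecomputingEstimator(), IncrementalEstimator()
for tenant, nbytes in [("a", 100), ("b", 250), ("a", 50)]:
    slow.append(tenant, nbytes)
    fast.append(tenant, nbytes)
```

Both estimators report the same size; the incremental one simply moves the work from every query to each append, which pays off when the size is checked far more often than the data changes shape.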

[pydantic-ai] Parallel Tool Execution Support and an Execution Mode API in DBOSAgent

An analysis of how DBOSAgent introduces a parallel_ordered_events mode that enables parallel tool execution while still guaranteeing deterministic replay.

#pydantic-ai #DBOS #Parallel Execution #Durable Execution #API Design

January 28, 2026

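[Grafana Loki] Ensuring Memory Safety by Adding Concurrent Access Detection to the Allocator

An analysis of adding atomic CAS-based concurrent access detection to an arena-style memory allocator, so that contention between goroutines triggers an immediate panic and is easier to debug.

#Grafana Loki #Go #Memory Management #Concurrency #Atomic

January 28, 2026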

[CPython] Improving Efficiency with an Event-Based Implementation of subprocess.Popen.wait()

An analysis of using Linux pidfd_open and macOS kqueue to convert the busy loop in subprocess.Popen.wait() into an event-based wait.

#Python #CPython #subprocess #System Programming

January 28, 2026

[Paper Review] World Craft: Agentic Framework to Create Visualizable Worlds via Text

This paper aims to let non-experts without programming skills easily create executable, visualizable AI Town environments from text descriptions.

#Review #Generative Agents #AI Town #LLM #Environment Creation #Multi-agent System #Spatial Reasoning #Text-to-World #Reverse Synthesis

January 27, 2026

[Paper Review] Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models

This paper addresses the problem that existing AI systems, while strong in linguistic and abstract domains, lag behind humans in physical and spatial intelligence because they lack rich representations and prior knowledge, in particular explicit visual world modeling.

#Review #Multimodal AI #World Models #Visual Generation #Chain-of-Thought (CoT) #Multimodal Reasoning #Unified Multimodal Models #Spatial-Physical Reasoning

January 27, 2026

[Paper Review] TriPlay-RL: Tri-Role Self-Play Reinforcement Learning for LLM Safety Alignment

This paper tackles the pressing problem of safety alignment, which mitigates harmful content generation in large language models (LLMs). It aims to resolve the scalability limits of existing methods, entropy collapse in red-team training, overfitting of defense models, and the lack of adversarial diversity.

#Review #LLM Safety Alignment #Reinforcement Learning #Self-Play #Red Teaming #Adversarial Training #Multi-Role Framework #Reward Hacking Mitigation

January 27, 2026

[Paper Review] Selective Steering: Norm-Preserving Control Through Discriminative Layer Selection

This work addresses two problems: large language models (LLMs) remain vulnerable to harmful behavior despite alignment efforts, and existing activation steering techniques suffer from generation collapse caused by failure to preserve norms, the need for careful coefficient tuning, or binary-only control.

#Review #Activation Steering #Large Language Models (LLMs) #Norm Preservation #Discriminative Layer Selection #Behavior Control #Inference-time Intervention #Angular Steering

January 27, 2026

[Paper Review] Revisiting Parameter Server in LLM Post-Training

The goal is to resolve the workload imbalance that arises from high variance in sequence length during post-training of large language models (LLMs).

#Review #LLM Post-Training #Parameter Server #Distributed Training #FSDP #On-Demand Communication #Workload Imbalance #Communication Optimization #Deep Learning

January 27, 2026

[Paper Review] Post-LayerNorm Is Back: Stable, ExpressivE, and Deep

Scaling of large language models (LLMs) is currently hitting its limits. Depth scaling in particular offers superior expressivity in theory, but existing Transformer architectures are difficult to train stably at extreme depths.

#Review #Transformer Architecture #Layer Normalization #Depth Scaling #Training Stability #Large Language Models #Gradient Flow #Highway Networks #Post-LayerNorm

January 27, 2026

[Paper Review] HalluCitation Matters: Revealing the Impact of Hallucinated References with 300 Hallucinated Papers in ACL Conferences

This paper aims to systematically investigate the spread and impact of hallucinated citations (HalluCitation), which are increasing in academic papers, particularly in AI/ML.

#Review #Hallucinated Citations #NLP Conferences #Citation Detection #Academic Integrity #Peer Review #Large Language Models (LLMs) #Bibliometrics

January 27, 2026

[Paper Review] GPCR-Filter: a deep learning framework for efficient and precise GPCR modulator discovery

This work aims to address the complexity of discovering GPCR (G protein-coupled receptor) modulators and the limitations of existing screening methods, which are slow, costly, and unable to capture complex dynamic interactions.

#Review #GPCR #Drug Discovery #Deep Learning #Protein Language Model #Graph Neural Network #Attention Mechanism #Drug Target Interaction #Virtual Screening

January 27, 2026

[Paper Review] FABLE: Forest-Based Adaptive Bi-Path LLM-Enhanced Retrieval for Multi-Document Reasoning

This paper aims to resolve the 'lost-in-the-middle' phenomenon, high computational cost, and limited scalability of multi-document reasoning in long-context LLMs, and to overcome the semantic noise and limited structured cross-document synthesis of existing RAG systems.

#Review #RAG #LLM-Enhanced Retrieval #Multi-Document Reasoning #Hierarchical Indexing #Bi-Path Retrieval #Adaptive Retrieval #Knowledge Organization #Context Window Optimization

January 27, 2026