#GPU Kernel

3 posts

[sglang] SGLang MoE Routing Optimization: Using aiter.biased_grouped_topk on AMD GPUs

A kernel optimization for sigmoid scoring in MoE routing on AMD GPUs that improves throughput by 2.4%.

#SGLang #MoE #AMD GPU #Optimization #Performance #AIter #GPU Kernel

April 25, 2026

[Paper Review] Evaluation-driven Scaling for Scientific Discovery

This paper formalizes the scaling problem of LLM-driven trial-and-error loops in scientific discovery and sets out to address it systematically.

#Review #Test-Time Scaling #Scientific Discovery #Evaluation-driven Discovery #LLM #Optimization #Symbolic Laws #GPU Kernel

April 21, 2026

[SGLang] Triton Attention Kernels: Writing GPU Kernels in Python

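An analysis of SGLang's Triton Attention backend. It covers the advantages of Triton, which lets you write GPU kernels in Python, and walks through the kernel implementations for the Prefill, Decode, and Extend stages with code.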

#sglang #Triton #GPU Kernel #Attention Kernel

April 11, 2026