#Quantization-Aware Training

3 posts

[Paper Review] Gemma 4 Technical Report

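This paper proposes the Gemma 4 model family, which aims to deliver the strong multimodal understanding, complex reasoning ability, and compute efficiency that the current LLM ecosystem demands, all at once.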

#Review #Multimodal #Mixture-of-Experts #Reasoning Trace #Speculative Decoding #Quantization-Aware Training #Long-context #Encoder-free

July 7, 2026

[Paper Review] SLA2: Sparse-Linear Attention with Learnable Routing and QAT

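This paper aims to address two limitations of the existing Sparse-Linear Attention (SLA): its heuristic attention split based on attention-weight magnitude, and the mismatch between the sparse and linear attention outputs.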

#Review #Sparse-Linear Attention #Diffusion Models #Video Generation #Learnable Routing #Quantization-Aware Training #Attention Acceleration #Model Optimization

February 18, 2026

[Paper Review] Qwen3-VL-Embedding and Qwen3-VL-Reranker: A Unified Framework for State-of-the-Art Multimodal Retrieval and Ranking

This paper introduces the Qwen3-VL-Embedding and Qwen3-VL-Reranker model series, which unify data of diverse modalities (text, images, document images, and video) to perform high-precision multimodal retrieval.

#Review #Multimodal Retrieval #Multimodal Ranking #Foundation Models #Embedding Models #Reranking Models #Contrastive Learning #Knowledge Distillation #Matryoshka Representation Learning #Quantization-Aware Training

January 11, 2026