
#Pre-training

19 posts

[Paper Review] Timer-S1: A Billion-Scale Time Series Foundation Model with Serial Scaling


[Paper Review] SageBwd: A Trainable Low-bit Attention


[Paper Review] Xiaomi-Robotics-0: An Open-Sourced Vision-Language-Action Model with Real-Time Execution


[Paper Review] OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration


[Paper Review] STEP3-VL-10B Technical Report


[Paper Review] Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models


[Paper Review] TokSuite: Measuring the Impact of Tokenizer Choice on Language Model Behavior


[Paper Review] Openpi Comet: Competition Solution For 2025 BEHAVIOR Challenge


[Paper Review] On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models


[Paper Review] Diffusion Language Models are Super Data Learners


[Paper Review] Universal Image Restoration Pre-training via Masked Degradation Classification


[Paper Review] Memory Retrieval and Consolidation in Large Language Models through Function Tokens
