#Instruction Tuning

45 posts

[Paper Review] LatentOmni: Rethinking Omni-Modal Understanding via Unified Audio-Visual Latent Reasoning

[Paper Review] Uni-Edit: Intelligent Editing Is A General Task For Unified Model Tuning

[Paper Review] Diffutron: A Masked Diffusion Language Model for Turkish Language

[Paper Review] Mario: Multimodal Graph Reasoning with Large Language Models

[Paper Review] A Critical Look at Targeted Instruction Selection: Disentangling What Matters (and What Doesn't)

[Paper Review] Towards Universal Video MLLMs with Attribute-Structured and Quality-Verified Instructions

[Paper Review] Less is Enough: Synthesizing Diverse Data in Feature Space of LLMs

[Paper Review] Typhoon-S: Minimal Open Post-Training for Sovereign Large Language Models

[Paper Review] FutureOmni: Evaluating Future Forecasting from Omni-Modal Context for Multimodal LLMs

[Paper Review] Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models

[Paper Review] JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation

[Paper Review] Beyond English: Toward Inclusive and Scalable Multilingual Machine Translation with LLMs

[Paper Review] AyurParam: A State-of-the-Art Bilingual Language Model for Ayurveda

[Paper Review] Every Activation Boosted: Scaling General Reasoner to 1 Trillion Open Language Foundation

[Paper Review] EchoVLM: Dynamic Mixture-of-Experts Vision-Language Model for Universal Ultrasound Intelligence

[Paper Review] Hala Technical Report: Building Arabic-Centric Instruction & Translation Models at Scale

[Paper Review] Do What? Teaching Vision-Language-Action Models to Reject the Impossible

[Paper Review] VisCodex: Unified Multimodal Code Generation via Merging Vision and Coding Models

[Paper Review] L^2M^3OF: A Large Language Multimodal Model for Metal-Organic Frameworks

[Paper Review] EHR-R1: A Reasoning-Enhanced Foundational Language Model for Electronic Health Record Analysis

[Paper Review] VisCoder2: Building Multi-Language Visualization Coding Agents

[Paper Review] PixelRefer: A Unified Framework for Spatio-Temporal Object Referring with Arbitrary Granularity

[Paper Review] Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation

[Paper Review] Pushing on Multilingual Reasoning Models with Language-Mixed Chain-of-Thought

[Paper Review] Agentic Reinforcement Learning for Search is Unsafe
