#Prompt Engineering

48 posts

[Paper Review] MOCHA: Multi-Objective Chebyshev Annealing for Agent Skill Optimization

[Paper Review] Learning to Learn-at-Test-Time: Language Agents with Learnable Adaptation Policies

[Paper Review] How Controllable Are Large Language Models? A Unified Evaluation across Behavioral Granularities

[Paper Review] Model Context Protocol (MCP) Tool Descriptions Are Smelly! Towards Improving AI Agent Efficiency with Augmented MCP Tool Descriptions

[Paper Review] Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models

[Paper Review] Everything in Its Place: Benchmarking Spatial Intelligence of Text-to-Image Models

[Paper Review] Lost in the Prompt Order: Revealing the Limitations of Causal Attention in Language Models

[Paper Review] Are LLMs Vulnerable to Preference-Undermining Attacks (PUA)? A Factorial Analysis Methodology for Diagnosing the Trade-off between Preference Alignment and Real-World Validity

[Paper Review] COMPASS: A Framework for Evaluating Organization-Specific Policy Alignment in LLMs

[Paper Review] Multi-LLM Thematic Analysis with Dual Reliability Metrics: Combining Cohen's Kappa and Semantic Similarity for Qualitative Research Validation

[Paper Review] Structured Extraction from Business Process Diagrams Using Vision-Language Models

[Paper Review] PromptBridge: Cross-Model Prompt Transfer for Large Language Models

[Paper Review] Focused Chain-of-Thought: Efficient LLM Reasoning via Structured Input Information

[Paper Review] SAM 3: Segment Anything with Concepts

[Paper Review] Large Language Models Meet Extreme Multi-label Classification: Scaling and Multi-modal Framework

[Paper Review] Large Language Models for Scientific Idea Generation: A Creativity-Centered Survey

[Paper Review] Do LLMs Feel? Teaching Emotion Recognition with Prompts, Retrieval, and Curriculum Learning

[Paper Review] $\left|\,\circlearrowright\,\text{BUS}\,\right|$: A Large and Diverse Multimodal Benchmark for evaluating the ability of Vision-Language Models to understand Rebus Puzzles

[Paper Review] Vote-in-Context: Turning VLMs into Zero-Shot Rank Fusers

[Paper Review] Zero-Shot Multi-Spectral Learning: Reimagining a Generalist Multimodal Gemini 2.5 Model for Remote Sensing Applications

[Paper Review] Leveraging Large Language Models for Predictive Analysis of Human Misery

[Paper Review] A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems

[Paper Review] PatenTEB: A Comprehensive Benchmark and Model Family for Patent Text Embedding

[Paper Review] Deflanderization for Game Dialogue: Balancing Character Authenticity with Task Execution in LLM-based NPCs

[Paper Review] LLM Reasoning for Machine Translation: Synthetic Data Generation over Thinking Tokens

[Paper Review] Multimodal Prompt Optimization: Why Not Leverage Multiple Modalities for MLLMs

[Paper Review] Benchmark It Yourself (BIY): Preparing a Dataset and Benchmarking AI Models for Scatterplot-Related Tasks

[Paper Review] Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing

[Paper Review] UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation

[Paper Review] Emergent Misalignment via In-Context Learning: Narrow in-context examples can produce broadly misaligned LLMs

[Paper Review] ERGO: Entropy-guided Resetting for Generation Optimization in Multi-turn Language Models

[Paper Review] BiasFreeBench: a Benchmark for Mitigating Bias in Large Language Model Responses
