#Generative AI

83 posts

[Paper Review] Explainable Disentangled Representation Learning for Generalizable Authorship Attribution in the Era of Generative AI

[Paper Review] Seedance 2.0: Advancing Video Generation for World Complexity

[Paper Review] Generation Models Know Space: Unleashing Implicit 3D Priors for Scene Understanding

[Paper Review] Accelerating Diffusion via Hybrid Data-Pipeline Parallelism Based on Conditional Guidance Scheduling

[Paper Review] SeaCache: Spectral-Evolution-Aware Cache for Accelerating Diffusion Models

[Paper Review] AAVGen: Precision Engineering of Adeno-associated Viral Capsids for Renal Selective Targeting

[Paper Review] Understanding vs. Generation: Navigating Optimization Dilemma in Multimodal Models

[Paper Review] Geometry-Aware Rotary Position Embedding for Consistent Video World Model

[Paper Review] FireRed-Image-Edit-1.0 Technical Report

[Paper Review] QP-OneModel: A Unified Generative LLM for Multi-Task Query Understanding in Xiaohongshu Search

[Paper Review] Semantic Routing: Exploring Multi-Layer LLM Feature Weighting for Diffusion Transformers

[Paper Review] DRPG (Decompose, Retrieve, Plan, Generate): An Agentic Framework for Academic Rebuttal

[Paper Review] DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models

[Paper Review] SemanticGen: Video Generation in Semantic Space

[Paper Review] Turn-PPO: Turn-Level Advantage Estimation with PPO for Improved Multi-Turn RL in Agentic LLMs

[Paper Review] 3D-RE-GEN: 3D Reconstruction of Indoor Scenes with a Generative Framework

[Paper Review] Kling-Omni Technical Report

[Paper Review] NeuralRemaster: Phase-Preserving Diffusion for Structure-Aligned Generation

[Paper Review] Aligned but Stereotypical? The Hidden Influence of System Prompts on Social Bias in LVLM-Based Text-to-Image Models

[Paper Review] MRI Super-Resolution with Deep Learning: A Comprehensive Survey

[Paper Review] Agent0-VL: Exploring Self-Evolving Agent for Tool-Integrated Vision-Language Reasoning

[Paper Review] Controllable Layer Decomposition for Reversible Multi-Layer Image Generation

[Paper Review] Loomis Painter: Reconstructing the Painting Process

[Paper Review] A Style is Worth One Code: Unlocking Code-to-Style Image Generation with Discrete Style Space

[Paper Review] EVTAR: End-to-End Try on with Additional Unpaired Visual Reference

[Paper Review] Let Multimodal Embedders Learn When to Augment Query via Adaptive Query Augmentation

[Paper Review] RiddleBench: A New Generative Reasoning Benchmark for LLMs

[Paper Review] UME-R1: Exploring Reasoning-Driven Generative Multimodal Embeddings

[Paper Review] HiGS: History-Guided Sampling for Plug-and-Play Enhancement of Diffusion Models

[Paper Review] 3D Aware Region Prompted Vision Language Model

[Paper Review] InfGen: A Resolution-Agnostic Paradigm for Scalable Image Synthesis

[Paper Review] Jointly Reinforcing Diversity and Quality in Language Model Generations

[Paper Review] Dress&Dance: Dress up and Dance as You Like It - Technical Preview

[Paper Review] TempFlow-GRPO: When Timing Matters for GRPO in Flow Models

[Paper Review] OmniTry: Virtual Try-On Anything without Masks

[Paper Review] StyleMM: Stylized 3D Morphable Face Model via Text-Driven Aligned Image Translation

[Paper Review] ToonComposer: Streamlining Cartoon Production with Generative Post-Keyframing

[Paper Review] A Survey on Diffusion Language Models

[Paper Review] Voost: A Unified and Scalable Diffusion Transformer for Bidirectional Virtual Try-On and Try-Off

[Paper Review] Personalized Safety Alignment for Text-to-Image Diffusion Models

[Paper Review] The Principles of Diffusion Models

[Paper Review] Ponimator: Unfolding Interactive Pose for Versatile Human-human Interaction Animation

[Paper Review] Efficient Parallel Samplers for Recurrent-Depth Models and Their Connection to Diffusion Language Models

[Paper Review] Multimodal Prompt Optimization: Why Not Leverage Multiple Modalities for MLLMs

[Paper Review] D^3QE: Learning Discrete Distribution Discrepancy-aware Quantization Error for Autoregressive-Generated Image Detection

[Paper Review] AInstein: Assessing the Feasibility of AI-Generated Approaches to Research Problems

[Paper Review] How Confident are Video Models? Empowering Video Models to Express their Uncertainty

[Paper Review] Voice Evaluation of Reasoning Ability: Diagnosing the Modality-Induced Performance Gap
