#Partial Observability

6개의 포스트

[논문리뷰] Learning A Unified Risk Map for Autonomous Driving in Partially Observable Environments

본 논문은 자율주행 환경에서 시야가 차단된(partially observable) 환경에서의 인지 불확실성과 이로 인한 주행 전략 수립의 한계를 해결하고자 합니다.

#Review #Autonomous Driving #Partial Observability #Risk Map #Diffusion Model #Occlusion-Aware Prediction #Trajectory Planning

2026년 5월 28일

[논문리뷰] Learning POMDP World Models from Observations with Language-Model Priors

본 연구는 잠재 상태에 대한 정보(Ground-truth state)가 주어지지 않는 완전한 부분 관측 환경(Strict POMDP setting)에서 에이전트가 어떻게 효과적으로 세계 모델(World Model)을 학습할 수 있는지 탐구합니다.

#Review #POMDP #World Model #Large Language Models #Program Induction #Sample Efficiency #Partial Observability #Belief-based Filtering

2026년 5월 17일

[논문리뷰] IntentVLA: Short-Horizon Intent Modeling for Aliased Robot Manipulation

본 논문은 프레임 단위로만 조건을 부여하는 기존 VLA 모델들이 부분 관측성(Partial Observability) 하에서 발생하는 짧은 기간의 의도 모호성 문제를 해결하지 못한다는 점을 지적합니다.

#Review #Vision-Language-Action (VLA)#Robot Manipulation #AliasBench #Short-Horizon Intent #Imitation Learning #Inter-chunk Consistency #Partial Observability

2026년 5월 14일

[논문리뷰] Next Embedding Prediction Makes World Models Stronger

부분적으로 관측 가능하고 고차원적인 환경에서 모델 기반 강화 학습(MBRL) 에이전트의 장기적인 시간 종속성 포착 능력 을 개선하는 것이 목표입니다.

#Review #Model-Based Reinforcement Learning #World Models #Decoder-Free #Temporal Transformer #Next-Embedding Prediction #Latent Representation #Partial Observability #Barlow Twins

2026년 3월 3일

[논문리뷰] World Models for Policy Refinement in StarCraft II

본 논문은 StarCraft II (SC2) 와 같이 복잡하고 부분 관측 가능한(partially observable) 실시간 전략(RTS) 게임 환경에서 대규모 언어 모델(LLM) 기반 에이전트 의 정책 결정 능력을 개선하는 것을 목표로 합니다.

#Review #StarCraft II #World Model #Policy Refinement #Large Language Models #Reinforcement Learning #Partial Observability #Structured Text Representation #Game AI

2026년 2월 19일

[논문리뷰] Sotopia-RL: Reward Design for Social Intelligence

본 논문은 대규모 언어 모델(LLM)을 사회적으로 지능적인 에이전트로 훈련할 때 직면하는 부분적 관측성(Partial Observability) 과 다차원성(Multi-dimensionality) 이라는 핵심 과제를 해결하고자 합니다.

#Review #Social Intelligence #Reinforcement Learning #Reward Design #Large Language Models #Utterance-level Rewards #Multi-dimensional Rewards #Partial Observability #SOTOPIA

2025년 8월 7일