#Forward-Process RL

1 post

[Paper Review] Astrolabe: Steering Forward-Process Reinforcement Learning for Distilled Autoregressive Video Models

Distilled autoregressive (AR) video models enable efficient streaming generation, but they are often misaligned with human visual preferences, producing artifacts or unnatural motion dynamics.

#Review #Video Generation #Distilled Autoregressive Models #Reinforcement Learning (RL) #Human Preferences #Streaming Generation #Forward-Process RL #Reward Hacking #Temporal Consistency

March 22, 2026