关于流动匹配模式的 GRPO 逐步信用分配
Stepwise Credit Assignment for GRPO on Flow-Matching Models
作者
Authors
暂无作者信息
期刊
Journal
暂无期刊信息
年份
Year
-
分类
Category
国家
Country
-
📝 摘要
Abstract
Flow-GRPO successfully applies reinforcement learning to flow models, but uses uniform credit assignment across all steps. This ignores the temporal structure of diffusion generation: early steps determine composition and content (low-frequency structure), while late steps resolve details and textures (high-frequency details). Moreover, assigning uniform credit based solely on the final image can inadvertently reward suboptimal intermediate steps, especially when errors are corrected later in th
📊 文章统计
Article Statistics
基础数据
Basic Stats
6
浏览
Views
0
下载
Downloads
0
引用
Citations
引用趋势
Citation Trend
阅读国家分布
Country Distribution
阅读机构分布
Institution Distribution
月度浏览趋势
Monthly Views