LAPO

Technology

A joint optimization algorithm that applies the environment reward to both latent reasoning and action generation simultaneously.

3 mentions, 1 connection. First seen: 2026-05-11. Last seen: 2026-05-21.

Relationship graph

Relations (1)

Uses technology (1)

Related articles (3)