Literature Search - Ningbo Creative Industries Featured Resource Database

Search condition: subject term = "long episode"

1 record found.

Reward Function Design Method for Long Episode Pursuit Tasks Under Polar Coordinate in Multi-Agent Reinforcement Learning

Journal: Journal of Shanghai Jiaotong University (Science), 2024, Vol. 29, No. 4, pp. 646-655

Authors: DONG Yubo, CUI Tao, ZHOU Yufan, SONG Xun, ZHU Yue, DONG Peng

Affiliations: School of Aeronautics and Astronautics, Shanghai Jiao Tong University, Shanghai 200240, China; Beijing Institute of Electronic System Engineering, Beijing 100854, China

Abstract: Multi-agent reinforcement learning has recently been applied to solve pursuit ***, it suffers from a large number of time steps per training episode, thus always struggling to converge effectively, resulting in low rewards and an inability for agents to learn *** paper proposes a deep reinforcement learning (DRL) training method that employs an ensemble segmented multi-reward function design approach to address the convergence problem mentioned *** ensemble reward function combines the advantages of two reward functions, which enhances the training effect of agents in long ***, we eliminate the non-monotonic behavior in reward function introduced by the trigonometric functions in the traditional 2D polar coordinates observation *** results demonstrate that this method outperforms the traditional single reward function mechanism in the pursuit scenario by enhancing agents' policy scores of the *** ideas offer a solution to the convergence challenges faced by DRL models in long episode pursuit problems, leading to an improved model training performance.
