In the PPO code, if I add the action to the critic network's input vector by directly appending it as a single extra dimension, training performs very poorly; normalizing the four state values afterwards still did not help. How should the input be processed after adding the action to the critic's input so that training gives good results?
Thanks to the author for providing the code.
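For reference, a minimal sketch of the setup the question describes, assuming a CartPole-style environment (4-dimensional state, 2 discrete actions) and PyTorch; the class name and dimensions are illustrative, not from the repository. One common reason that appending the raw action index as a single dimension hurts is that it imposes a false ordinal scale on discrete actions, so a one-hot encoding is the usual alternative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ActionInCritic(nn.Module):
    """Hypothetical critic that takes (state, action); the action is
    one-hot encoded instead of being appended as a single raw index."""
    def __init__(self, state_dim=4, n_actions=2, hidden_dim=64):
        super().__init__()
        self.n_actions = n_actions
        self.net = nn.Sequential(
            nn.Linear(state_dim + n_actions, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, state, action):
        # state: (batch, state_dim); action: (batch,) integer indices.
        # One-hot encoding avoids imposing a fake ordering (0 < 1) on
        # discrete actions, which a single raw scalar input would do.
        a = F.one_hot(action.long(), num_classes=self.n_actions).float()
        return self.net(torch.cat([state, a], dim=-1))
```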
Isn't PPO's critic meant to estimate the state value V(s)? Why add the action? That would turn it into DDPG, wouldn't it?
johnjim0816
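To illustrate the distinction in the reply: in standard PPO the critic estimates the state value V(s) from the state alone, while a critic that takes (state, action) and estimates Q(s, a) is the DDPG-style design. A minimal sketch, with illustrative names and dimensions:

```python
import torch
import torch.nn as nn

class PPOCritic(nn.Module):
    """Standard PPO critic: estimates V(s) from the state only.
    The action's effect enters through the advantage estimate,
    e.g. A(s, a) = r + gamma * V(s') - V(s), which uses the sampled
    reward and next state produced by the taken action."""
    def __init__(self, state_dim=4, hidden_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, state):
        return self.net(state)

class DDPGCritic(nn.Module):
    """DDPG-style critic: estimates Q(s, a), so the (continuous)
    action vector is concatenated into the input."""
    def __init__(self, state_dim, action_dim, hidden_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))
```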