In PPO_model.py, forward is empty, so why does everything still work through the evaluate function? I really don't get it. In that case, how are the actor and critic inside the HGNNScheduler network trained? The evaluate function uses the actor and critic, so what does the actor network actually represent? Its initialized output is only 1-dimensional, so how can it output a distribution over actions? Is that done through the act and get_action_prob functions? And at test time, is the result still obtained through those functions rather than through neural network inference? (Confused beginner here.)
Although the PPO_model file defines no forward, evaluate() calls "self.get_machines[i]", "self.actor", and "self.critic", all of which are subclasses of nn.Module. The actor outputs actions, approximating the state-action function; the critic computes the value function, which is used to compute the advantage function.
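To illustrate the pattern being described: in PyTorch, forward() on the top-level module is optional as long as you never call the module itself. Custom methods like act(), get_action_prob(), and evaluate() can invoke the sub-modules (self.actor, self.critic) directly, and gradients still flow through them during training. The sketch below is a minimal, hypothetical reconstruction of that pattern, not the actual HGNNScheduler code; the actor scores each candidate action with a single scalar (hence the 1-dimensional output), and a softmax over those scores yields the action distribution.

```python
import torch
import torch.nn as nn
from torch.distributions import Categorical

def mlp(in_dim, out_dim):
    # small helper network; dimensions are illustrative assumptions
    return nn.Sequential(nn.Linear(in_dim, 64), nn.Tanh(),
                         nn.Linear(64, out_dim))

class Scheduler(nn.Module):
    # No forward() is defined on this top-level module; the custom
    # methods below call the sub-modules directly, which is enough
    # for both training (gradients flow through actor/critic) and
    # inference (act/evaluate run the networks).
    def __init__(self, state_dim):
        super().__init__()
        self.actor = mlp(state_dim, 1)    # one scalar score per candidate
        self.critic = mlp(state_dim, 1)   # scalar state value

    def get_action_prob(self, states):
        # states: (n_candidates, state_dim), one embedding per candidate
        scores = self.actor(states).squeeze(-1)   # (n_candidates,)
        return torch.softmax(scores, dim=-1)      # distribution over actions

    def act(self, states):
        # used when collecting trajectories: sample an action
        dist = Categorical(self.get_action_prob(states))
        action = dist.sample()
        return action, dist.log_prob(action)

    def evaluate(self, states, action):
        # used during PPO updates: log-prob, value, and entropy
        dist = Categorical(self.get_action_prob(states))
        value = self.critic(states.mean(dim=0))   # pooled state -> value
        return dist.log_prob(action), value, dist.entropy()
```

So at test time the result still comes from neural network inference: act() and evaluate() are ordinary Python methods, but every tensor they produce passes through the actor and critic networks.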