Project author: CSKrishna
Project description:
We use policy gradient methods to help agents learn optimal bidding policies in a competitive multi-agent contextual bandit setting.
Language: Jupyter Notebook
Project URL: git://github.com/CSKrishna/Optimal-bidding-policy-using-Policy-Gradient-in-a-Multi-agent-Contextual-Bandit-setting.git