Report - SBEED:Convergent Reinforcement Learning with Nonlinear ... · it uses stochastic gradient descent to optimize the ob-jective, thus is very efficient and scalable. Furthermore, the

Please pass captcha verification before submit form