3-14-1 Hiyoshi, Kohoku-ku, Yokohama 223-8522, Japan
abstract:
I briefly review the mathematical basics of reinforcement learning, including the setup, the Markov decision process, and three training schemes: value-based algorithms, policy-based algorithms, and a hybrid of the two, the so-called actor-critic algorithm.