Unpacking the Main Arguments in Reinforcement Learning
Authors:
(1) Jongmin Lee, Department of Mathematical Sciences, Seoul National University;
(2) Ernest K. Ryu, Department of Mathematical Sciences, Seoul National University and Interdisciplinary Program in Artificial Intelligence, Seoul National University.
Abstract and 1 Introduction
1.1 Notations and preliminaries
1.2 Prior works
2 Anchored Value Iteration
2.1 Accelerated rate for the Bellman consistency operator
2.2 Accelerated rate for the Bellman optimality operator
3 Convergence when γ = 1
4 Complexity lower bound
5 Approximate Anchored Value Iteration
6 Gauss–Seidel Anchored Value Iteration
7 Conclusion, acknowledgments, funding disclosure, and references
A Preliminaries
B Omitted proofs in Section 2
C Omitted proofs in Section 3
D Omitted proofs in Section 4
E Omitted proofs in Section 5
F Omitted proofs in Section 6
G Broader impacts
H Limitations
C Omitted proofs in Section 3
First, we introduce the following lemma.
where the second inequality follows from the nonexpansiveness of T.
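For context, the nonexpansiveness invoked in the step above is the standard sup-norm property of a Bellman operator T with discount factor γ ≤ 1; the displayed derivation it refers to is omitted here. The block below is only a restatement of that well-known fact under our own notation (U and V denote arbitrary value functions and are not taken from the omitted display).

```latex
% Standard sup-norm property of a Bellman operator T with discount factor
% 0 < \gamma \le 1 (a restatement of a well-known fact; U, V denote arbitrary
% value functions and are our own notation for this illustration).
\[
  \| T U - T V \|_{\infty}
  \;\le\; \gamma \, \| U - V \|_{\infty}
  \;\le\; \| U - V \|_{\infty},
  \qquad 0 < \gamma \le 1 .
\]
% Hence T is nonexpansive in the sup-norm, and it is a \gamma-contraction
% whenever \gamma < 1.
```

In the setting of Section 3, where γ = 1, the strict contraction bound is unavailable, so the argument relies on nonexpansiveness alone.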
Now, we present the proof of Theorem 3.
Next, we prove Theorem 4.
This paper is available on arXiv under a CC BY 4.0 DEED license.