Basic Info

<aside> ✨

Q-Learning is an RL (reinforcement learning) technique used to find the optimal policy in an MDP (Markov Decision Process).

</aside>

How does Q-Learning work (cricket example)

<aside> ✨

➡️ Iteratively updating the Q-values until they converge to the optimal Q-function is called value iteration.
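
The iterative update behind value iteration is commonly written as below. The symbols are assumptions, not taken from these notes: $\alpha$ is the learning rate, $\gamma$ the discount factor, $r$ the reward received after taking action $a$ in state $s$, and $s'$ the resulting next state.

$$
Q(s, a) \leftarrow (1 - \alpha)\, Q(s, a) + \alpha \left( r + \gamma \max_{a'} Q(s', a') \right)
$$

With $\alpha = 1$ the old estimate is overwritten entirely; smaller values blend the new target into the existing Q-value.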

Example

[Image: image.png]

➡️ INITIALLY

➡️ Q-Table
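
A minimal sketch of how a Q-table might start out and get updated. The environment here is hypothetical and is not the one in the image: a 5-cell corridor where the agent starts in cell 0 and the crickets sit in cell 4. The reward values, `ALPHA`, `GAMMA`, `EPSILON`, and the helper names `step` and `train` are all illustrative assumptions.

```python
import random

N_STATES = 5            # hypothetical corridor: cells 0..4, crickets in cell 4
ACTIONS = (0, 1)        # 0 = move left, 1 = move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2   # assumed learning rate, discount, exploration rate


def step(state, action):
    """Hypothetical environment: move one cell; reaching the last cell ends the episode."""
    nxt = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    done = nxt == N_STATES - 1
    reward = 10.0 if done else -1.0     # assumed rewards: +10 for crickets, -1 per step
    return nxt, reward, done


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    # INITIALLY: every Q-value is zero, since the agent knows nothing yet.
    q_table = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: explore with probability EPSILON, otherwise exploit.
            if rng.random() < EPSILON:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q_table[state][a])
            nxt, reward, done = step(state, action)
            # A terminal state has no future value.
            best_next = 0.0 if done else max(q_table[nxt])
            target = reward + GAMMA * best_next
            # Move the old estimate a fraction ALPHA of the way toward the target.
            q_table[state][action] += ALPHA * (target - q_table[state][action])
            state = nxt
    return q_table


q_table = train()
# Greedy policy: the highest-valued action in each non-terminal cell.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(greedy_policy)
```

Each row of `q_table` is a state and each column an action, so reading off the largest entry per row gives the learned policy. In this toy corridor that policy should be "move right" in every cell.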

Fin

<aside> ✨

Notes by: Mehul (mehul.xyz)

</aside>