In the last RL article, we learned about different terms associated with RL, such as action, environment, state, etc.
Today we will learn how an agent observes the environment and takes an action, and how the environment responds to that action with a positive or negative reward.
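To make this loop concrete, here is a minimal sketch in Python. The `CoinFlipEnv` class, its `reset` and `step` methods, and the `run_interaction` helper are hypothetical names invented for this illustration, not part of any library, and the agent simply guesses at random.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment: the agent guesses a coin flip."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.steps = 0

    def reset(self):
        # The state in this toy example is just a step counter.
        self.steps = 0
        return self.steps

    def step(self, action):
        # The environment rewards a correct guess with +1, a wrong one with -1.
        coin = self.rng.choice(["heads", "tails"])
        reward = 1 if action == coin else -1
        self.steps += 1
        return self.steps, reward


def run_interaction(num_steps=5, seed=0):
    env = CoinFlipEnv(seed)
    agent_rng = random.Random(seed + 1)
    state = env.reset()                                # agent observes the environment
    history = []
    for _ in range(num_steps):
        action = agent_rng.choice(["heads", "tails"])  # agent takes an action
        state, reward = env.step(action)               # environment returns a reward
        history.append((action, reward, state))
    return history


if __name__ == "__main__":
    for action, reward, state in run_interaction():
        print(f"action={action:5s} reward={reward:+d} state={state}")
```

A learning agent would use the rewards in `history` to improve its choice of action; this sketch only shows the observe, act, reward cycle.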
Supervised vs Unsupervised vs Reinforcement learning
In supervised learning, a model is trained on labelled data. In unsupervised learning, there is no labelled data, and the model instead groups the data into clusters or segments.
In reinforcement learning, there is no labelled data either. The model learns from its own experience, that is, from the actions it takes and the rewards it receives. The objective in RL is to maximize the cumulative reward collected over a sequence of actions.
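As a small illustration of "cumulative reward", the sketch below adds up the rewards from two made-up action sequences. The function name `cumulative_reward` and the reward values are assumptions chosen only for this example.

```python
def cumulative_reward(rewards):
    """Return the sum of the rewards collected over a sequence of actions."""
    total = 0
    for reward in rewards:
        total += reward
    return total


# Hypothetical reward sequences for two different sequences of actions.
patient_rewards = [-1, -1, 5]   # accepts small losses first, large reward later
greedy_rewards = [1, 1, 0]      # takes small rewards immediately

if __name__ == "__main__":
    print(cumulative_reward(patient_rewards))  # 3
    print(cumulative_reward(greedy_rewards))   # 2
```

The first sequence starts with negative rewards but ends with a higher total, which is why the objective looks at the whole sequence rather than at any single action.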
There are two types of tasks in RL:
- Continuous (also called continuing): Tasks that do not have a definite end (e.g. learning to walk, driving a car).
- Episodic: Tasks that have a definite end (e.g. games such as chess or Ludo). Here the episode ends with an outcome such as a win or a loss.
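The difference between the two task types can be sketched with two tiny loops. Both functions below are hypothetical toy examples: the episodic one stops when a terminal state is reached, while the continuous one has no terminal state and stops only because we cap the number of steps ourselves.

```python
def run_episode(goal_position=3):
    """Episodic task: move right one step at a time until the goal is reached."""
    position = 0
    steps = 0
    done = False
    while not done:
        position += 1
        steps += 1
        done = position >= goal_position  # reaching the goal ends the episode
    return steps


def run_continuing(step_budget=10, track_length=4):
    """Continuous task: walk around a circular track that never ends.

    There is no terminal state; the loop stops only because of step_budget.
    """
    position = 0
    for _ in range(step_budget):
        position = (position + 1) % track_length
    return position


if __name__ == "__main__":
    print(run_episode())     # 3 steps to reach the goal
    print(run_continuing())  # position 2 on the track after 10 steps
```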