Reinforcement learning is a machine learning method in which an agent learns by trial and error, guided by positive and negative rewards. If an action leads to a desirable outcome, the agent receives a positive reward; if the outcome is not as expected, it receives a negative reward.
Let's understand this with an example: when humans learn to ride a bicycle, pedaling moves the bicycle forward, which is a positive experience, but losing balance leads to a fall, which is a negative experience. Learning from such experiences is known as "reinforcement learning."
Terms associated with reinforcement learning
- Agent : The entity that explores the environment and acts on it
- Environment : The surroundings in which the agent operates, which are generally stochastic (random)
- Action : A move taken by the agent within the environment
- State : The situation returned by the environment after the agent takes an action
- Reward : The feedback received from the environment after the agent takes an action
- Policy : The strategy the agent uses to choose its next action based on the current state
- Value : The expected long-term return of the agent from a state, with future rewards weighted by a discount factor
- Q-Value : Similar to value, but it also takes the current action into account
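To make these terms concrete, here is a minimal sketch that uses tabular Q-learning, one common reinforcement learning algorithm chosen here purely for illustration. Everything in it is an assumption rather than part of the definitions above: the five-cell corridor environment, the reward numbers (+1 for reaching the goal, -0.1 for every other step), and the hyperparameters `alpha`, `gamma` (the discount factor), and `epsilon`.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the goal (hypothetical toy corridor)
ACTIONS = [0, 1]      # actions: 0 = move left, 1 = move right


def step(state, action):
    """Environment: takes the agent's action, returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    if next_state == N_STATES - 1:
        return next_state, 1.0, True    # positive reward for reaching the goal
    return next_state, -0.1, False      # small negative reward for every other step


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Agent: learns a table of Q-values, one per (state, action) pair."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            # Policy: epsilon-greedy, mostly exploit the best known action,
            # sometimes explore a random one.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # Q-value update: reward now plus discounted best value of the next state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train()
# Greedy policy: the action with the highest Q-value in each non-goal state.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
# Value of each non-goal state: the best Q-value available from it.
values = [max(q_table[s]) for s in range(N_STATES - 1)]
print(policy)
print(values)
```

In this sketch the Q-table plays the role of the Q-value, the maximum over actions in a state plays the role of the value, and the epsilon-greedy rule is the policy. States closer to the goal should end up with higher values, because fewer discounted steps separate them from the reward.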