
Typical algorithms for solving reinforcement learning (RL) problems, are built on an assumption of a stationary environment (modeled as a stationary MDP), meaning the agent is learning how to act in an environment in which the action chosen in each state is not time dependent. However, one can think of many everyday life problems that occur in non-stationary environments, which change over time. Such problems were discussed in former articles...
Categories:
Machine Learning