Using Smart Devices for System-Level Management and Control in the Smart Grid: A Reinforcement Learning Framework


This paper presents a stochastic modeling framework for adaptive control strategies that provide short-term ancillary services to the power grid using a population of heterogeneous thermostatically controlled loads. The problem is cast as a classical Markov Decision Process (MDP) so that existing reinforcement learning tools can be applied. Initial considerations and possible reductions of the state and action spaces are described. A Q-learning approach is implemented in simulation to demonstrate the performance of the adaptive control framework in a reference-tracking scenario.
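
To make the setting concrete, the following is a minimal sketch of tabular Q-learning on a toy reference-tracking problem. It is not the implementation evaluated in this paper. Every modeling choice is an assumption made for illustration: the first-order temperature dynamics, the 19 to 21 degree deadband, the identical (rather than heterogeneous) load parameters, the aggregate state defined as the binned fraction of loads that are ON, the three setpoint offsets used as actions, and the names `step_population`, `to_state`, `q_update`, and `train`.

```python
import random

ACTIONS = (-0.5, 0.0, 0.5)   # hypothetical setpoint offsets in deg C
N_BINS = 5                   # bins for the fraction of loads that are ON

def step_population(temps, on, offset, rng):
    """Advance a toy population of cooling loads by one time step."""
    low, high = 19.0 + offset, 21.0 + offset  # shifted deadband (assumed)
    next_temps, next_on = [], []
    for temp, is_on in zip(temps, on):
        # Assumed first-order dynamics: cool while ON, warm while OFF.
        temp += (-0.5 if is_on else 0.3) + rng.uniform(-0.05, 0.05)
        if temp >= high:
            is_on = True
        elif temp <= low:
            is_on = False
        next_temps.append(temp)
        next_on.append(is_on)
    return next_temps, next_on

def to_state(on):
    """Discretize the ON fraction into one of N_BINS aggregate states."""
    frac = sum(on) / len(on)
    return min(int(frac * N_BINS), N_BINS - 1)

def q_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update, applied in place."""
    target = r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])

def train(reference=0.4, n_loads=50, steps=2000, epsilon=0.1, seed=0):
    """Learn a Q-table that steers the ON fraction toward `reference`."""
    rng = random.Random(seed)
    temps = [rng.uniform(19.0, 21.0) for _ in range(n_loads)]
    on = [rng.random() < 0.5 for _ in range(n_loads)]
    q = [[0.0] * len(ACTIONS) for _ in range(N_BINS)]
    s = to_state(on)
    for _ in range(steps):
        # Epsilon-greedy choice over the setpoint offsets.
        if rng.random() < epsilon:
            a = rng.randrange(len(ACTIONS))
        else:
            a = max(range(len(ACTIONS)), key=lambda i: q[s][i])
        temps, on = step_population(temps, on, ACTIONS[a], rng)
        # Reward penalizes deviation of the ON fraction from the reference.
        r = -abs(sum(on) / len(on) - reference)
        s_next = to_state(on)
        q_update(q, s, a, r, s_next)
        s = s_next
    return q
```

Calling `train()` returns an `N_BINS` by `len(ACTIONS)` table of action values. Collapsing the population into a single binned ON fraction is one simple example of a state-space reduction; it discards the individual temperatures, so the resulting process is only approximately Markov.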