no code implementations • 29 Nov 2015 • Frederik Ruelens, Bert Claessens, Salman Quaiyum, Bart De Schutter, Robert Babuska, Ronnie Belmans
A well-known batch reinforcement learning technique, fitted Q-iteration, is used to find a control policy, given this feature representation.
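For readers unfamiliar with fitted Q-iteration, the following is a minimal sketch of the general technique, not the authors' implementation. It assumes small integer state and action indices and uses a tabular average as the regression step; the paper's learned feature representation and regressor are not reproduced here. The function name `fitted_q_iteration` and all parameters are hypothetical.

```python
import numpy as np

def fitted_q_iteration(transitions, n_states, n_actions, gamma=0.9, n_iters=50):
    """Sketch of fitted Q-iteration on a fixed batch of transitions.

    transitions: list of (s, a, r, s_next) tuples with integer s, a, s_next.
    Assumption: a tabular average stands in for the regression step.
    """
    s, a, r, s_next = (np.array(col) for col in zip(*transitions))
    q = np.zeros((n_states, n_actions))
    for _ in range(n_iters):
        # Regression targets from the current Q estimate (Bellman backup).
        targets = r + gamma * q[s_next].max(axis=1)
        # "Fit" step: average the targets observed for each (s, a) pair.
        sums = np.zeros_like(q)
        counts = np.zeros_like(q)
        np.add.at(sums, (s, a), targets)
        np.add.at(counts, (s, a), 1)
        q = np.divide(sums, counts, out=np.zeros_like(q), where=counts > 0)
    return q
```

A greedy control policy can then be read off as `q.argmax(axis=1)`. With continuous features, the averaging step would be replaced by a supervised regressor trained on the same targets.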
no code implementations • 8 Apr 2015 • Frederik Ruelens, Bert Claessens, Stijn Vandael, Bart De Schutter, Robert Babuska, Ronnie Belmans
We propose a model-free Monte-Carlo estimator method that uses a metric to construct artificial trajectories, and we illustrate this method by finding the day-ahead schedule of a heat-pump thermostat.
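The idea of building artificial trajectories from a batch of one-step transitions can be sketched as follows. This is an illustrative sketch only, not the paper's method: it assumes scalar states and actions, an L1 metric on the (state, action) pair, and that each stored transition is used at most once. The name `mfmc_estimate` and its parameters are hypothetical.

```python
import numpy as np

def mfmc_estimate(transitions, policy, x0, horizon, n_traj):
    """Estimate the return of `policy` from x0 using artificial trajectories.

    transitions: list of (x, u, r, x_next) with scalar x, u (assumption).
    policy: callable (t, x) -> u.
    """
    if n_traj * horizon > len(transitions):
        raise ValueError("not enough transitions for the requested trajectories")
    xu = np.array([[t[0], t[1]] for t in transitions], dtype=float)
    rewards = np.array([t[2] for t in transitions], dtype=float)
    x_next = np.array([t[3] for t in transitions], dtype=float)
    unused = np.ones(len(transitions), dtype=bool)
    returns = []
    for _ in range(n_traj):
        x, total = float(x0), 0.0
        for t in range(horizon):
            query = np.array([x, policy(t, x)])
            # L1 distance from the query (state, action) to each stored sample.
            dist = np.abs(xu - query).sum(axis=1)
            dist[~unused] = np.inf  # each transition is used at most once
            i = int(np.argmin(dist))
            unused[i] = False
            total += rewards[i]
            x = float(x_next[i])  # continue from the sample's successor state
        returns.append(total)
    return float(np.mean(returns))
```

Evaluating several candidate schedules with such an estimator and keeping the best one is one plausible way to obtain a day-ahead plan; how the paper performs that search is not stated in the excerpt above.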