Adaptive Execution: Exploration and Learning of Price Impact

26 Jul 2012 · Beomsoo Park, Benjamin Van Roy

We consider a model in which a trader aims to maximize expected risk-adjusted profit while trading a single security. In our model, each price change is a linear combination of observed factors, impact resulting from the trader's current and prior activity, and unpredictable random effects. The trader must learn the coefficients of a price impact model while trading. We propose a new method for simultaneous execution and learning, the confidence-triggered regularized adaptive certainty equivalent (CTRACE) policy, and establish a poly-logarithmic finite-time expected regret bound. This bound implies that CTRACE is efficient in the sense that the (ε, δ)-convergence time is bounded by a polynomial function of 1/ε and log(1/δ) with high probability. In addition, we demonstrate via Monte Carlo simulation that CTRACE outperforms the certainty equivalent policy and a recently proposed reinforcement learning algorithm designed to explore efficiently in linear-quadratic control problems.
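To make the setup concrete, below is a minimal, self-contained Python sketch of one plausible instantiation of this kind of linear price impact model together with a confidence-triggered, regularized least-squares coefficient update. The specific dynamics, the parameter values (`true_phi`, `true_lam`, `decay`, `conf_threshold`), and the toy trading rule are illustrative assumptions, not the paper's exact CTRACE policy or its regret-optimal specification; only the overall structure (price change = factors + impact from current and prior trades + noise, estimates refreshed when accumulated information crosses a confidence threshold) follows the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Hypothetical parameterization (illustrative, not the paper's exact model) ---
# Price change:  dp_t = phi . f_t  +  lam * u_t  +  g * x_t  +  eps_t
#   f_t : observed factor vector
#   u_t : trade executed at time t (impact of current activity)
#   x_t : exponentially decaying state summarizing prior trades
#   eps_t ~ N(0, sigma^2) : unpredictable random effects
true_phi = np.array([0.4, -0.2])   # factor loadings (unknown to the trader)
true_lam = 0.05                    # instantaneous impact coefficient
true_g   = 0.02                    # coefficient on decayed past-trade state
decay    = 0.9                     # decay rate of the impact state (assumed known)
sigma    = 0.1

T = 2000
reg = 1.0                          # ridge (regularization) weight
conf_threshold = 50.0              # trigger: growth in min eigenvalue of the Gram matrix

d = 4                              # regressor dimension: 2 factors + u_t + x_t
A = reg * np.eye(d)                # regularized Gram matrix
b = np.zeros(d)
theta_hat = np.zeros(d)            # current coefficient estimate
last_trigger = 0.0

x = 0.0                            # decayed past-trade state
for t in range(T):
    f = rng.normal(size=2)         # observed factors

    # Certainty-equivalent style trade using the current estimate (toy rule):
    # act on the estimated factor signal plus a small exploratory perturbation.
    signal = theta_hat[:2] @ f
    u = signal + 0.1 * rng.normal()

    # True price change generated by the (unknown) linear model
    dp = true_phi @ f + true_lam * u + true_g * x + sigma * rng.normal()

    # Accumulate regularized least-squares statistics
    z = np.concatenate([f, [u, x]])
    A += np.outer(z, z)
    b += z * dp

    # Confidence-triggered re-estimation: only refit the coefficients when the
    # smallest eigenvalue of the Gram matrix has grown enough since the last refit.
    min_eig = np.linalg.eigvalsh(A).min()
    if min_eig >= last_trigger + conf_threshold:
        theta_hat = np.linalg.solve(A, b)
        last_trigger = min_eig

    x = decay * x + u              # update the impact state with the latest trade

print("estimated [phi_1, phi_2, lam, g]:", np.round(theta_hat, 3))
print("true      [phi_1, phi_2, lam, g]:", [0.4, -0.2, 0.05, 0.02])
```

The trigger in this sketch captures the "confidence-triggered" idea at a high level: the estimate is refreshed only when the accumulated information (here, the smallest eigenvalue of the regularized Gram matrix) has grown sufficiently, rather than after every observation.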
