Multi-stage Bayesian Reinforcement Learning Robust Portfolio Selection Model
Estimating the uncertainty sets in traditional multi-stage distributionally robust portfolio selection models is a challenging problem. This paper applies Bayesian reinforcement learning to dynamically update the first- and second-order moments that define the uncertainty sets of a multi-stage distributionally robust model. We study the mean-worst-case robust CVaR model in the Bayesian reinforcement learning framework and propose a two-level decomposition solution framework that combines dynamic programming with the progressive hedging algorithm. The lower level finds the optimal policies of the sub-models for given model parameters by solving a series of second-order cone programming problems, while the upper level uses Bayes' law to find an implementable policy that satisfies the non-anticipation constraints. Numerical results for the US stock market illustrate the superior out-of-sample investment performance of the multi-stage Bayesian reinforcement learning robust portfolio selection model.
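
To illustrate what updating the first two moments by Bayes' law can look like, the following minimal sketch keeps a posterior weight on each of a finite set of candidate moment models and reweights them after an observed return vector. It is an illustration under stated assumptions, not the paper's algorithm: the finite candidate set, the Gaussian likelihood, and all names (`bayes_update`, `mixture_moments`, the sample numbers) are hypothetical choices made here.

```python
import numpy as np


def gaussian_logpdf(x, mean, cov):
    """Log-density of a multivariate normal (assumed likelihood, not from the paper)."""
    d = x.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)


def bayes_update(prior, x, means, covs):
    """Posterior model weights after observing return vector x (Bayes' law)."""
    log_lik = np.array([gaussian_logpdf(x, m, c) for m, c in zip(means, covs)])
    log_post = np.log(prior) + log_lik
    log_post -= log_post.max()          # stabilise before exponentiating
    post = np.exp(log_post)
    return post / post.sum()


def mixture_moments(weights, means, covs):
    """Mean and covariance of the weighted mixture of candidate models."""
    mean = sum(w * m for w, m in zip(weights, means))
    cov = sum(
        w * (c + np.outer(m - mean, m - mean))
        for w, m, c in zip(weights, means, covs)
    )
    return mean, cov


# Hypothetical example: two candidate models for two assets.
means = [np.array([0.01, 0.01]), np.array([-0.02, -0.02])]
covs = [0.001 * np.eye(2), 0.001 * np.eye(2)]
prior = np.array([0.5, 0.5])

observed = np.array([0.012, 0.009])     # made-up one-period returns
posterior = bayes_update(prior, observed, means, covs)
mu_hat, sigma_hat = mixture_moments(posterior, means, covs)
```

In this sketch the observed return lies closer to the first candidate mean, so the posterior shifts weight toward the first model; the resulting `mu_hat` and `sigma_hat` are the kind of updated moments that could parameterise an uncertainty set at the next stage.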