Few dynamic pricing algorithms for uncertain demand account for consumers' strategic behavior. In this paper, the retailer's pricing decision is formulated as a multi-armed bandit (MAB) problem, and a non-parametric Bayesian algorithm is proposed. The algorithm combines Gaussian process regression with Thompson sampling, incorporates the purchasing decisions of strategic consumers, and helps retailers set prices. Simulation results show that the proposed algorithm effectively improves retailers' revenue and converges faster. The presence of strategic consumers can improve the performance of the demand learning algorithm and reduce the revenue that retailers lose to demand uncertainty.
Key words
Gaussian process regression/Dynamic pricing/Strategic consumer/Machine learning
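The combination of Gaussian process regression and Thompson sampling described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the RBF kernel, the hyperparameters, the discrete price grid, the Gaussian observation noise, and the names `rbf_kernel`, `gp_posterior`, and `run_gp_thompson` are all assumptions made for illustration, and the strategic-consumer purchasing model is left out because the abstract does not specify it. Each price on the grid plays the role of one arm; in every round a demand curve is sampled from the GP posterior and the price that maximizes sampled revenue is posted.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=2.0, variance=1.0):
    """Squared-exponential kernel between two 1-D arrays of prices."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)


def gp_posterior(x_obs, y_obs, x_grid, noise_var=0.01):
    """Zero-mean GP posterior (mean, covariance) of demand on x_grid."""
    k_ss = rbf_kernel(x_grid, x_grid)
    if len(x_obs) == 0:
        # No data yet: the posterior is just the prior.
        return np.zeros(len(x_grid)), k_ss
    k = rbf_kernel(x_obs, x_obs) + noise_var * np.eye(len(x_obs))
    k_s = rbf_kernel(x_grid, x_obs)
    chol = np.linalg.cholesky(k)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_obs))
    v = np.linalg.solve(chol, k_s.T)
    return k_s @ alpha, k_ss - v.T @ v


def run_gp_thompson(prices, true_demand, n_rounds=60, obs_noise=0.05, seed=0):
    """Post one price per round; return the list of (price, observed demand)."""
    rng = np.random.default_rng(seed)
    x_obs, y_obs, history = [], [], []
    for _ in range(n_rounds):
        mean, cov = gp_posterior(np.array(x_obs), np.array(y_obs), prices)
        # Symmetrize and add jitter so the covariance is numerically valid.
        cov = 0.5 * (cov + cov.T) + 1e-8 * np.eye(len(prices))
        # Thompson sampling: draw one plausible demand curve from the posterior.
        sampled_demand = rng.multivariate_normal(mean, cov)
        # Choose the arm (price) with the highest sampled revenue.
        idx = int(np.argmax(prices * sampled_demand))
        price = float(prices[idx])
        demand = true_demand(price) + rng.normal(0.0, obs_noise)
        x_obs.append(price)
        y_obs.append(demand)
        history.append((price, demand))
    return history


if __name__ == "__main__":
    grid = np.linspace(1.0, 9.0, 9)
    # Hypothetical linear demand curve, unknown to the learner.
    hist = run_gp_thompson(grid, lambda p: max(0.0, 1.0 - 0.1 * p))
    print("last five posted prices:", [p for p, _ in hist[-5:]])
```

Sampling a whole demand curve rather than one value per price lets nearby prices share information through the kernel, which is the main reason to pair a GP with Thompson sampling instead of treating the arms as independent.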