A Data-Free Distillation Framework for Adaptive Bitrate Algorithms
Adaptive bitrate (ABR) algorithms are a key technique in video streaming. An ABR algorithm selects an appropriate bitrate for the next video chunk based on the current network conditions and playback status to ensure a high quality of experience (QoE). Learning-based ABR algorithms, which bypass traditional modeling and learn policies from scratch, have achieved better performance and have gradually replaced heuristic ABR algorithms that require careful tuning, becoming a research hotspot in the field. However, these algorithms rely on neural network inference, which entails a large number of model parameters and high computational overhead, making them difficult to deploy in real-world scenarios. Previous works therefore proposed decision tree distillation schemes, which use lightweight decision trees to distill expert policies from learning-based ABR algorithms and deploy them online. However, the experiments in this paper show that the previous distillation framework overlooked the influence of training environments on the distilled policies, resulting in poor generalization. This paper proposes NIA (data-free Network-environmental Imitation-based rate Adaptation framework), a novel data-free distillation framework for generating decision tree ABR algorithms that generalize better. NIA generates and selects suitable artificial network environments for distilling decision tree policies. Specifically, NIA uses a network environment generation module to construct multiple artificial network environments and, before each training iteration, uses an environment selection module to choose the network environments in which the student model distills the teacher policy. This selection process is modeled as a "no-regret online learning" problem in reinforcement learning. In detail, the module combines the Upper Confidence Bound (UCB) algorithm with a Top-K approach to select network environments from the pool produced by the network environment generation module, based on the current performance indicators of the student model and the environment. The selection process ensures performance improvement while maintaining lower bounds. NIA then interacts with the selected environments to complete decision tree distillation using a student-driven imitation learning algorithm. Based on the currently observed state, the student model selects a bitrate, while the same state is fed to the teacher model to obtain the expert policy. The module then distills the policy by reducing the distance between the student model's policy and the expert policy. The paper designs a comprehensive evaluation platform to test the performance of NIA. The results demonstrate that NIA achieves good QoE and generalization on various bandwidth datasets: (1) compared to heuristic algorithms, it improves QoE metrics by 1% to 46%; (2) compared to previous decision tree distillation schemes, it performs equally well in low-bandwidth scenarios but achieves nearly twice the improvement in high-bandwidth scenarios; (3) its overall performance is close to, or even surpasses, that of learning-based algorithms (i.e., the expert policies). Finally, the paper carefully analyzes and compares the important parameter settings within the three modules of NIA, including the impact of student model and network environment parameters on NIA's performance, the number of generated environments, the Top-K ratio, the selection algorithm, and the exploration index in the algorithm. Through extensive experiments, each parameter of NIA is set within a suitable range to achieve better results.
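The environment selection step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the UCB1-style score, the names `ucb_top_k` and `update`, the exploration coefficient `c`, the rule `k = ceil(ratio * pool_size)`, and the random placeholder reward are all assumptions made for illustration. The abstract states only that UCB is combined with a Top-K approach over the generated environment pool, using performance indicators of the student model and the environment.

```python
import math
import random


def ucb_top_k(mean_scores, counts, total_rounds, k, c=1.0):
    """Return the indices of the k environments with the highest UCB score.

    mean_scores[i]: running mean of the (hypothetical) reward of environment i.
    counts[i]:      how many times environment i has been selected so far.
    c:              exploration coefficient (assumed form of the exploration index).
    """
    log_t = math.log(max(total_rounds, 1))
    scored = []
    for i, (mean, n) in enumerate(zip(mean_scores, counts)):
        if n == 0:
            ucb = math.inf  # environments never tried are selected first
        else:
            ucb = mean + c * math.sqrt(log_t / n)
        scored.append((ucb, i))
    # Highest score first; ties are broken by the lower index.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [i for _, i in scored[:k]]


def update(mean_scores, counts, env, reward):
    """Fold one observed reward into the running mean of environment env."""
    counts[env] += 1
    mean_scores[env] += (reward - mean_scores[env]) / counts[env]


if __name__ == "__main__":
    random.seed(0)
    pool_size, top_k_ratio = 10, 0.3          # hypothetical settings
    k = math.ceil(top_k_ratio * pool_size)
    means = [0.0] * pool_size
    counts = [0] * pool_size
    for t in range(1, 21):
        chosen = ucb_top_k(means, counts, total_rounds=t, k=k, c=1.0)
        for env in chosen:
            # Placeholder reward; in a real system this could be, for example,
            # the teacher-student QoE gap measured in that environment.
            update(means, counts, env, random.random())
    print("selection counts per environment:", counts)
```

With `c = 0` the rule reduces to greedy Top-K on the running means, while larger values of `c` favor environments that have been selected less often.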
Keywords: video streaming; adaptive bitrate algorithm; data-free distillation