Q-learning-based Algorithm for Orchestrating Security Service Function Chain
As technology develops, the Internet is becoming an indispensable part of human life, and network security is becoming particularly important. Orchestrating dynamic security service function chains is an important research direction for ensuring network security. However, current research on network resource mapping and orchestration algorithms for dynamic security service function chains mainly focuses on a single type of network resource, with the main goals of optimizing that resource and reducing network service latency, and overlooks the balance of overall resource allocation in the network. We construct a physical network model and a security service function chain model. Considering both the computing resources of physical network nodes and the bandwidth resources of links, the goal is to achieve the best-balanced allocation of network resources while meeting user needs. Based on the Q-learning reinforcement learning algorithm, a new reward method for link orchestration is proposed, and a greedy strategy is introduced to avoid falling into local optima. A typical physical network model and different numbers of security service function chains to be orchestrated are selected, and the optimal orchestration path of each security service function chain is obtained through multiple iterations. The simulation results show that, compared with the simulated annealing algorithm, the proposed method reduces the orchestration response time by 38.5% and improves the resource allocation balance by 2.1%. Compared with a genetic algorithm, it reduces the orchestration response time by 96.5% and improves the resource allocation balance by 2.9%.
network security; security service function chain; Q-learning; greedy strategy; resource allocation
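
The approach summarized in the abstract (Q-learning with a balance-oriented reward and a greedy exploration strategy) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the four-node topology `LINKS`, the capacities `CPU`, the reward (negative variance of node utilisation), the use of epsilon-greedy as the greedy strategy, and the hyperparameters. Link bandwidth and capacity limits are left out for brevity, and the state is simplified to (chain position, current node).

```python
import random

# Hypothetical toy topology: node -> neighbours, plus per-node CPU capacity.
LINKS = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2]}
CPU = {0: 4.0, 1: 4.0, 2: 4.0, 3: 4.0}

def balance_reward(load):
    """Negative variance of node utilisation: closer to 0 is better balanced."""
    util = [load[n] / CPU[n] for n in CPU]
    mean = sum(util) / len(util)
    return -sum((u - mean) ** 2 for u in util) / len(util)

def choose(q, state, actions, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q.get((state, a), 0.0))

def train(demands, start=0, episodes=500, alpha=0.1, gamma=0.9,
          epsilon=0.1, seed=0):
    """Learn Q-values for placing each function of one chain on a node."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        node = start
        load = {n: 0.0 for n in CPU}
        for step, demand in enumerate(demands):
            state = (step, node)
            nxt = choose(q, state, LINKS[node], epsilon, rng)
            load[nxt] += demand  # place this function on the chosen node
            reward = balance_reward(load)
            if step == len(demands) - 1:
                future = 0.0  # terminal step: no future value
            else:
                future = max(q.get(((step + 1, nxt), a), 0.0)
                             for a in LINKS[nxt])
            old = q.get((state, nxt), 0.0)
            # Standard Q-learning update.
            q[(state, nxt)] = old + alpha * (reward + gamma * future - old)
            node = nxt
    return q

def greedy_path(q, chain_len, start=0):
    """Read out the learned placement by always taking the best action."""
    node, path = start, []
    for step in range(chain_len):
        node = max(LINKS[node], key=lambda a: q.get(((step, node), a), 0.0))
        path.append(node)
    return path

if __name__ == "__main__":
    q_table = train([1.0, 1.0, 1.0])
    print(greedy_path(q_table, 3))
```

Calling `train` with a list of per-function CPU demands and then `greedy_path` returns one node per function in the chain. A fuller model would add the load to the state and reject actions that exceed node or link capacity.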