Automatic and safe operations method for large-scale servers
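Wang Yida (王一达)¹
Author information
- 1. School of Computer Science, Peking University, Beijing 100091; Cloud Intelligence Basic Products Division, Alibaba Group, Beijing 100102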
Abstract
To maintain the availability and stability of applications while operations such as application upgrades and system configuration changes are executed, this paper proposes Decider, a protocol for interaction between the operations platform and applications. With the Decider protocol, the impact of an operation is precisely defined and passed to the application, and the application decides whether the operation may be executed based on the received impact value and its own running state. The operations process thereby becomes highly automated, safe, and reliable. Simulation results indicate that the Decider protocol effectively raises the automation level of operations tasks and preserves application stability during operations execution.
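The interaction described in the abstract can be illustrated with a minimal sketch. The abstract does not give the message format or the decision rule, so everything below is an assumption made for illustration: the names `OperationRequest`, `Application`, `decide`, and `run_operations` are hypothetical, the impact value is assumed to be the number of instances an operation takes offline, and the application is assumed to approve an operation only if enough healthy instances would remain.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    APPROVE = "approve"
    REJECT = "reject"


@dataclass
class OperationRequest:
    """Hypothetical message sent by the operations platform."""
    operation: str            # illustrative operation name
    impacted_instances: int   # assumed form of the impact value


@dataclass
class Application:
    """Hypothetical application that tracks its own running state."""
    healthy_instances: int
    min_required: int
    pending_offline: int = 0  # instances already approved to go offline

    def decide(self, request: OperationRequest) -> Decision:
        # Assumed rule: approve only if capacity stays at or above the minimum.
        remaining = (self.healthy_instances
                     - self.pending_offline
                     - request.impacted_instances)
        if remaining >= self.min_required:
            self.pending_offline += request.impacted_instances
            return Decision.APPROVE
        return Decision.REJECT

    def complete(self, request: OperationRequest) -> None:
        # The operation finished and its instances are back in service.
        self.pending_offline -= request.impacted_instances


def run_operations(app, requests):
    """Platform side: run approved operations, defer rejected ones for retry."""
    executed, deferred = [], []
    for request in requests:
        if app.decide(request) is Decision.APPROVE:
            executed.append(request.operation)
        else:
            deferred.append(request.operation)
    return executed, deferred
```

In this sketch the platform never needs to know the application's internals: it reports the impact, and a rejected operation is simply deferred until the application has capacity again.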
Key words
large-scale data centers/automatic operations/operations platform/protocol/availability/stability/safe operations
Publication year
2024