Building a Resilient System Architecture Based on a Chaos Engineering Platform
Lu Haibo (卢海波) 1
Author information
- 1. Mango TV Product Technology Center, Changsha, Hunan 410000, China
Abstract
As business and technical complexity keeps growing, systems face ever higher uncontrollable risk. The timing and scope of online failures cannot be predicted, and their impact on the system is difficult to assess. These factors severely constrain the stability of online services and the availability of the business. Even when a system cannot be guaranteed to run in an error-free environment, it should still preserve a good user experience under a wide range of abnormal conditions. Chaos engineering actively introduces destabilizing factors to verify and improve the system's ability to recover from failures under uncontrolled conditions, ultimately achieving a resilient architecture [1].
Keywords
chaos engineering / resilient system architecture / failure recovery
Publication year
2024