A Survey of Network Attack Investigation Based on Provenance Graph
Investigating network attacks is crucial for implementing proactive defenses and formulating tracing countermeasures. With the rise of sophisticated and stealthy network threats, developing efficient and automated investigation methods has become pivotal to advancing intelligent network attack and defense capabilities. Existing studies have focused on modeling system audit logs as provenance graphs that represent the causal dependencies among attack events. By leveraging the powerful associative analysis and semantic representation capabilities of provenance graphs, complex and stealthy network attacks can be investigated effectively, yielding better results than conventional methods. This paper offers a systematic review of the literature on provenance-graph-based attack investigation, categorizing the diverse methodologies into three principal groups: causality analysis, deep representation learning, and anomaly detection. For each category, the paper succinctly presents the workflows and the core frameworks that underpin these methodologies. It also examines optimization techniques for provenance graphs and chronicles the evolution of these technologies from theoretical constructs to their application in industrial settings. The study further collects and reviews the datasets commonly used in attack investigation research and offers a comprehensive comparative analysis of representative provenance-graph-based techniques and their performance metrics. Finally, it outlines prospective directions for future research and development in this field, providing a structured roadmap for advancing the domain's academic and practical applications.
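To make the idea of modeling audit logs as a provenance graph concrete, the following is a minimal illustrative sketch, not a method taken from any surveyed system. It assumes a hypothetical, simplified event format `(timestamp, subject, operation, object)` and an assumed mapping from operations to information-flow direction; the event data, node names, and the helper names `build_provenance_graph` and `backward_trace` are all invented for illustration. It builds the graph and runs a time-respecting backward trace from a suspicious node, which is the basic step behind causality analysis.

```python
from collections import defaultdict

# Hypothetical audit events: (timestamp, subject, operation, object).
EVENTS = [
    (1, "proc:firefox", "recv",  "sock:203.0.113.5:443"),
    (2, "proc:firefox", "write", "file:/tmp/dropper"),
    (3, "proc:dropper", "read",  "file:/tmp/dropper"),
    (4, "proc:dropper", "read",  "file:/etc/passwd"),
    (5, "proc:dropper", "send",  "sock:198.51.100.7:80"),
    (6, "proc:vim",     "write", "file:/etc/passwd"),  # happens too late to matter
]

# Assumed flow direction: these operations move data from object to subject;
# every other operation moves data from subject to object.
INBOUND_OPS = {"read", "recv"}


def build_provenance_graph(events):
    """Return a map: node -> list of (timestamp, source) incoming edges."""
    incoming = defaultdict(list)
    for ts, subject, op, obj in events:
        src, dst = (obj, subject) if op in INBOUND_OPS else (subject, obj)
        incoming[dst].append((ts, src))
    return incoming


def backward_trace(incoming, start):
    """Collect every node that could have causally influenced `start`.

    An edge is followed only if it occurred no later than the latest time
    at which its destination is known to have influenced `start`.
    """
    latest = {start: float("inf")}  # node -> latest relevant timestamp
    stack = [start]
    while stack:
        node = stack.pop()
        bound = latest[node]
        for ts, src in incoming.get(node, []):
            if ts <= bound and ts > latest.get(src, float("-inf")):
                latest[src] = ts
                stack.append(src)
    return set(latest)


graph = build_provenance_graph(EVENTS)
ancestors = backward_trace(graph, "sock:198.51.100.7:80")
print(sorted(ancestors))
```

In this sketch the trace from the outbound socket reaches the dropper process, the files it read, the browser that wrote the dropper, and the original inbound connection. The `proc:vim` write is excluded because it occurred after the dropper read `/etc/passwd`, which shows why the temporal check matters for pruning false dependencies.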