Causal Discovery Algorithm Based on Non-stationary Additive Noise Model

Abstract: Causal discovery aims to uncover causal relationships among variables from observational data. Most existing methods assume that the data-generating process is stationary, but this assumption often fails in practice, which makes the results unreliable. In some scenarios, non-stationary disturbances are highly correlated with temporal information. Building on the additive noise model, this work therefore models the non-stationary disturbance as a function of temporal information, formulates a non-stationary additive noise model, gives its identifiability conditions, and proposes a two-stage causal structure learning algorithm. In the first stage, regression yields the residual of each variable, and the independence between the residual and the regression feature set is tested to select a leaf node; iterating this step until all variables have been placed gives the causal order of the observed variables. In the second stage, regression and independence tests are performed again to remove the redundant causal relationships left by the first stage, which yields the causal structure of the observed variables. Experimental results show that, compared with Constraint-based causal Discovery from heterogeneous/NOn-stationary Data (CD-NOD), LPCMCI, and TiMINo, the proposed algorithm achieves the best performance on synthetic datasets, with an average F1 score of 0.85. On datasets with real-world causal structures, its F1 score improves by 41.12% on average, indicating that it can recover more causal structure information from non-stationary data.
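The two-stage procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything beyond the abstract is an assumption made for illustration: the model is taken to have the form x_i = f_i(parents of x_i) + g_i(t) + noise_i, regression is a per-column cubic polynomial fit with the time index t as an extra regressor, independence is measured with a Gaussian-kernel HSIC statistic and a permutation p-value, and the function names (`causal_order`, `prune_edges`) and the threshold `alpha` are hypothetical.

```python
import numpy as np

def residual(y, Z, degree=3):
    """Residual of a least-squares polynomial fit of y on the columns of Z (no cross terms)."""
    F = np.column_stack([np.ones(len(Z))] + [Z ** d for d in range(1, degree + 1)])
    coef, *_ = np.linalg.lstsq(F, y, rcond=None)
    return y - F @ coef

def _gram(A):
    """Gaussian kernel matrix on standardized columns, bandwidth from the median heuristic."""
    A = A.reshape(len(A), -1)
    A = (A - A.mean(0)) / (A.std(0) + 1e-12)
    sq = np.sum((A[:, None, :] - A[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / np.median(sq[sq > 0]))

def hsic_test(a, B, n_perm=0, rng=None):
    """Biased HSIC statistic between a and B, plus a permutation p-value if n_perm > 0."""
    n = len(a)
    H = np.eye(n) - 1.0 / n                      # centering matrix I - 11^T / n
    Kc, L = H @ _gram(a) @ H, _gram(B)
    stat = np.sum(Kc * L) / n ** 2
    if n_perm == 0:
        return stat, None
    if rng is None:
        rng = np.random.default_rng(0)
    hits = 0
    for _ in range(n_perm):
        idx = rng.permutation(n)                 # shuffle a relative to B
        hits += int(np.sum(Kc[np.ix_(idx, idx)] * L) / n ** 2 >= stat)
    return stat, (1 + hits) / (1 + n_perm)

def causal_order(X, t):
    """Stage 1: repeatedly pick as leaf the variable whose residual looks most independent."""
    remaining, order = list(range(X.shape[1])), []
    while len(remaining) > 1:
        scores = {}
        for j in remaining:
            others = [k for k in remaining if k != j]
            Z = np.column_stack([X[:, others], t])   # regressors: other variables and time
            scores[j], _ = hsic_test(residual(X[:, j], Z), Z)
        leaf = min(scores, key=scores.get)
        order.insert(0, leaf)
        remaining.remove(leaf)
    return remaining + order

def prune_edges(X, t, order, alpha=0.05, seed=0):
    """Stage 2: drop a candidate parent if the residual stays independent without it."""
    rng = np.random.default_rng(seed)
    edges = []
    for pos, child in enumerate(order):
        parents = list(order[:pos])
        for p in list(parents):
            reduced = [k for k in parents if k != p]
            r = residual(X[:, child], np.column_stack([X[:, reduced], t]))
            Z_all = np.column_stack([X[:, parents], t])
            _, pval = hsic_test(r, Z_all, n_perm=100, rng=rng)
            if pval > alpha:                     # no detectable dependence: p is redundant
                parents = reduced
        edges += [(p, child) for p in parents]
    return edges

# Toy data from the assumed model form (illustrative only, not from the paper).
rng = np.random.default_rng(0)
n = 150
t = np.linspace(0.0, 1.0, n)
x0 = np.sin(2 * np.pi * t) + rng.uniform(-1, 1, n)
x1 = x0 ** 2 + 2.0 * t + 0.3 * rng.uniform(-1, 1, n)
x2 = np.tanh(x1) + np.cos(2 * np.pi * t) + 0.3 * rng.uniform(-1, 1, n)
X = np.column_stack([x0, x1, x2])

order = causal_order(X, t)          # estimated causal order, roots first
edges = prune_edges(X, t, order)    # (parent, child) pairs kept after pruning
print(order, edges)
```

Including t among the regressors is what lets the fit absorb a time-dependent disturbance before the residual is tested. Whether the recovered order and edges match the generating structure depends on the sample size, the regression class, and the test, so the output of this sketch should not be read as a reproduction of the reported results.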

Keywords:

causal discovery; causal structure; non-stationary disturbances; additive noise model; functional causal model

Authors:

Hao Zhifeng (郝志峰), Ding Kaipei (丁凯培), Cai Ruichu (蔡瑞初), Chen Wei (陈薇)

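Affiliations:

School of Computer Science, Guangdong University of Technology (广东工业大学计算机学院), Guangzhou, Guangdong 510006

College of Science, Shantou University (汕头大学理学院), Shantou, Guangdong 515063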

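Funding:

National Natural Science Foundation of China (61876043, 61976052, 62206064); Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2021ZD0111501); National Science Fund for Excellent Young Scholars (62122022)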

Publication year:

2024

DOI:

10.19678/j.issn.1000-3428.0066901

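Computer Engineering (计算机工程)

East China Institute of Computing Technology (华东计算技术研究所); Shanghai Computer Society (上海市计算机学会)

Indexed in: CSTPCD, Peking University Core Journals (北大核心)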

Impact factor: 0.581

ISSN: 1000-3428

Year, volume (issue): 2024, 50(4)

Number of references: 26