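Summary
To address the low efficiency, high error rate, and low interpretability of causality detection algorithms for time series data, this paper proposes a novel causality detection model for time series data. The model integrates the functional greedy equivalence search (F-GES) model with the Granger causality model to extract and infer causal relationships, and it is accompanied by a causality visual analysis method for interactively analyzing the causal relationships among variables in time series data. The visual analysis method comprises a parameter view that improves the efficiency of causality exploration, a causality tree diagram that displays the causal relationships among variables intuitively and effectively, a time view for comparing the original time series, a stacked streamgraph that helps users explore the hierarchical evolution of the time series data, and a parallel coordinates plot for correlation analysis. A prototype system built on real-world data interactively verifies and summarizes the causal relationships in time series data, so that the causal patterns among variables can be mined and understood more efficiently to support decision-making.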
Abstract
As data storage technology improves, the correlations among variables in time series data are becoming more complex. It is difficult to infer causalities manually from accumulated experience in a way that supports the exploration of deeper relationships. Using machine algorithms to detect causality in multivariate time series data, and thereby realize the potential value of the data, is of practical significance for big data applications in marketing and health care. To address the low efficiency, high error rate, and low interpretability of causality models for time series data, this paper combines the functional greedy equivalence search (F-GES) model with the Granger causality model for causal inference and proposes an interactive causality visual analysis approach. The approach includes a parameter view to improve the efficiency of causality exploration, a causality tree to display the causalities visually, a time view to compare the original time series data, a streamgraph view for users to explore the hierarchical evolution of the raw dataset, and a parallel coordinates plot to analyze correlations among variables. The system supports interactive visual manipulation, verification, and summarization of causal relationships in time series data. Mining the causalities between variables in this way can help users make decisions.
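The Granger component named in the abstract can be illustrated with a minimal, standard-library-only sketch of a pairwise Granger F-test. This is not the authors' implementation and does not include the F-GES step. The function names `ols_rss` and `granger_f_stat`, the lag of 1, and the synthetic series are all assumptions made for illustration.

```python
import random


def ols_rss(rows, targets):
    """Residual sum of squares of a least-squares fit via the normal equations."""
    k = len(rows[0])
    # Augmented system [X^T X | X^T y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, targets))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(a[idx][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return sum((t - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, t in zip(rows, targets))


def granger_f_stat(cause, effect, lag=1):
    """F statistic for 'cause Granger-causes effect' (hypothetical helper).

    Restricted model: effect[t] regressed on its own past values.
    Unrestricted model: the same, plus past values of cause.
    A large F means the past of cause improves the prediction of effect.
    """
    n = len(effect)
    steps = range(lag, n)
    targets = [effect[t] for t in steps]
    restricted = [[1.0] + [effect[t - j] for j in range(1, lag + 1)]
                  for t in steps]
    unrestricted = [row + [cause[t - j] for j in range(1, lag + 1)]
                    for row, t in zip(restricted, steps)]
    rss_r = ols_rss(restricted, targets)
    rss_u = ols_rss(unrestricted, targets)
    dof = len(targets) - (2 * lag + 1)  # observations minus fitted parameters
    return ((rss_r - rss_u) / lag) / (rss_u / dof)


# Synthetic data (an assumption): y follows x with a one-step delay.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(300)]
y = [0.0] + [0.8 * x[t - 1] + rng.gauss(0, 0.1) for t in range(1, 300)]

print("x -> y:", granger_f_stat(x, y))
print("y -> x:", granger_f_stat(y, x))
```

On data constructed this way, the statistic for x → y should be far larger than the one for y → x. In practice the statistic would be compared against an F distribution to obtain a p-value, and the lag would be chosen by a model selection criterion rather than fixed.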
Funding
Humanities and Social Sciences Planning Project of the Ministry of Education (22YJA840004)