Optimal BN Structure Learning Algorithm Based on Double Constraints
陈艺薇 1邸若海 1王鹏 1张新兰 2张欢 2许文2
扫码查看
点击上方二维码区域,可以放大扫码查看
作者信息
1. 西安工业大学电子信息工程学院,陕西西安 710021
2. 航天材料及工艺研究所,北京 100076
折叠
摘要
针对现有基于动态规划的贝叶斯网络结构学习算法复杂度高、无法在合理时间内学习大规模网络的问题,提出基于双重约束的最优贝叶斯网络(Bayesian Network,BN)结构学习算法.首先,利用最大信息系数和马尔科夫毯限制条件独立性(Conditional Independence,CI)测试的候选节点集合和约束集,得到邻居节点集合;其次,利用邻居节点集合约束父节点图的搜索过程,得到候选父节点集合,从候选父节点集合中取出每个节点的最优父集构造初始有向图;再次,利用Tarjan算法计算初始有向图中的强连通分量,得到节点块序;最后,利用节点块序约束节点序图的搜索过程,获得最优的BN结构.实验表明,相比于现有的5种基于动态规划的结构学习算法,本文提出的算法在精度稍微降低的前提下,极大幅度提高了算法的学习效率,如Sachs网络,本文提出的算法相对DPCMB(Dynamic Programming Constrained with Markov Blanket)算法降低了40.3%的时耗,算法精度下降了12.1%.
Abstract
Aiming at the problem that existing Bayesian network (BN) structure learning algorithms based on Dynam-ic programming are too complex to learn large-scale networks within a reasonable time,a Bayesian network structure learn-ing algorithm based on double constraints is proposed. Firstly,the set of neighbor nodes is obtained by using the set of can-didate nodes and constraint set for conditional independence (CI) tests based on the maximum information coefficient and Markov blanket. Secondly,the neighbor node set is used to constrain the search process of the parent node graph,so as to obtain the candidate parent node set. On this basis,the optimal parent set of each node is extracted from the candidate parent node set to construct the initial directed graph. Thirdly,the strongly connected components of the initial digraph are calculat-ed using the Tarjan algorithm to get the node block order. Finally,the optimal BN structure is obtained by using node block order to constrain the search process of node order graph. Experiments show that,compared with the existing five structural learning algorithms based on dynamic programming,the algorithm proposed in this paper greatly improves the learning effi-ciency of the algorithm under the premise of slightly reduced accuracy. For Sachs network,the proposed algorithm reduces the time consumption by 40.3% and the accuracy by 12.1% compared with DPCMB (Dynamic Programming Constrained with Markov Blanket) algorithm.
关键词
贝叶斯网络/最大信息系数/条件独立性测试/马尔科夫毯
Key words
Bayesian network/maximum information coefficient/conditional independence test/Markov blanket