首页|基于优化BWT索引技术的序列比对算法研究

基于优化BWT索引技术的序列比对算法研究

扫码查看
生物信息学中,大规模的生物基因序列比对是最重要的基础问题.针对主流的BWT(burrows-wheeler transform)索引技术的研究,提出一种新的多阶混合BWT索引方法MD-BWT(multi difference cover mod3 burrows-wheeler transform),根据待比对序列的长度,动态选取适合的多位索引查找.实验结果表明,改进后的方法可以有效减少序列比对算法中的比对次数和计算次数,降低序列比对算法中索引算法的时间复杂度,明显提高序列比对的效率.在构造BWT(S)字符串过程中,通过DC3(difference cover mod 3)算法来构造后缀数组,实验表明DC3算法构造后缀数组比倍增算法的时间复杂度更低,时间性能更优.
Research on Sequence Alignment Algorithm Based on Optimized BWT Indexing Technique
Large-scale gene sequence alignment is the most important fundamental problem in bioinformatics.Based on the mainstream research of BWT index technology,the paper proposes a new multi-order mixed BWT index method,which dynamically selects the appropiate multi-bit indexing according to the length of the sequence to be compared.The experimental results show that the improved method can effectively reduce the number of comparison and calculation times,reduce the time complexity of the index algorithm,and significantly improve the efficiency of sequence comparison.In the process of constructing BWT(S)string,this paper uses DC3(difference cover mod 3)algorithm to construct the suffix array.Experiments show that DC3 algorithm has better time performance compared to Binary Lifting.

sequence alignmentBWT indexdifference cover mod 3suffix array

胡春玲、赵俊杰、姚梦媛、高欢欢、朱艺杭、汪少鸿

展开 >

合肥大学人工智能与大数据学院,安徽 合肥 230031

郑州大学计算机与人工智能学院河南郑州 450001

长序列比对 BWT索引 DC3 后缀数组

2024

南京师范大学学报(工程技术版)
南京师范大学

南京师范大学学报(工程技术版)

影响因子:0.313
ISSN:1672-1292
年,卷(期):2024.24(4)