Multiple Sequence Alignment Method for Biological Genes Based on Spark Cloud Computing

In multiple sequence alignment of biological genes, earlier methods computed only a single Spark cluster parameter, which led to poor parallel performance. To address this, a multiple sequence alignment method for biological genes based on Spark cloud computing is designed. The acquired genetic sequence data are first optimized, and dynamic programming is then applied to the multiple sequence alignment task by computing the matching degree between different sequences. Spark cloud computing technology is used to build Spark clusters, and the parameters of multiple Spark clusters are computed. The similarities and differences among the gene sequences are used to select the optimal matching path. On this basis, a parallel computing model for aligning multiple gene sequences is established and solved, yielding the corresponding parallel multiple sequence alignment algorithm. Experimental results show that the method achieves better parallelism and effectively improves the performance of multiple sequence alignment.

Spark cloud computing; biological genes; bioinformatics; multiple gene sequences; parallel algorithms
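
The abstract describes computing a matching degree between sequences and applying dynamic programming to the alignment task, but gives no formulas or code. As a rough illustration only, the sketch below scores every pair of sequences with a standard Needleman-Wunsch global alignment. The scoring values (`MATCH`, `MISMATCH`, `GAP`), the function names `alignment_score` and `pairwise_scores`, and the toy sequences are all assumptions made for this example; they are not taken from the paper and do not reproduce its algorithm or its Spark cluster parameter computation.

```python
from itertools import combinations

# Assumed scoring scheme (not from the paper): match +1, mismatch -1, gap -2.
MATCH, MISMATCH, GAP = 1, -1, -2


def alignment_score(a: str, b: str) -> int:
    """Global alignment score of a and b (Needleman-Wunsch, score only)."""
    # prev[j] is the best score for aligning the processed prefix of a with b[:j].
    prev = [j * GAP for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * GAP]  # aligning a[:i] against an empty prefix of b
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (MATCH if ca == cb else MISMATCH)
            curr.append(max(diag, prev[j] + GAP, curr[j - 1] + GAP))
        prev = curr
    return prev[-1]


def pairwise_scores(seqs: dict) -> dict:
    """Score every unordered pair of named sequences independently."""
    return {
        (x, y): alignment_score(seqs[x], seqs[y])
        for x, y in combinations(sorted(seqs), 2)
    }


if __name__ == "__main__":
    # Hypothetical toy input, for illustration only.
    toy = {"s1": "GATTACA", "s2": "GATTTACA", "s3": "GCATGCT"}
    for pair, score in pairwise_scores(toy).items():
        print(pair, score)
```

Each pair is scored independently of the others, which is what makes this step a natural candidate for distribution. In a PySpark setting the list of pairs could plausibly be spread across workers with `SparkContext.parallelize` followed by `map`, though how the paper actually partitions the work is not stated in the abstract.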

Yang Bo, Chen Yangguang, Xu Shengchao

School of Data Science, Guangzhou Huashang College, Guangzhou 511300

School of Accounting, Guangzhou Huashang College, Guangzhou 511300

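Supported by the National Natural Science Foundation of China, General Program (61972444), and the Guangzhou Huashang College Intramural Research Mentorship Program (2023HSDS34)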

2024

Computer Measurement & Control
China Computer Automated Measurement and Control Technology Association

CSTPCD
Impact factor: 0.546
ISSN: 1671-4598
Year, volume (issue): 2024, 32(7)