Study on Collaborative Data Persistence in NewSQL Databases

To provide high availability, modern NewSQL databases typically keep several replicas of each data item so that the data can still be read from another replica when one becomes unavailable. With multiple replicas, consistency between them must also be considered: different clients reading the same data at the same moment should obtain the same result. Transaction processing mechanisms are therefore introduced. In an interactive transaction containing multiple write operations, each write must be applied to all primary and backup replicas. Because these replicas usually reside on different machines, every write incurs the latency of writing to remote replicas, which ultimately increases the processing latency of the whole transaction. To address this problem, this paper proposes a collaborative data persistence scheme in which the client caches the transaction's write logs locally. When the transaction is finally committed, the client first persists the transaction's write logs and sends them to the transaction's coordinator node, which then distributes the log data; the client and the coordinator thus persist the transaction data collaboratively. Experimental results show that, compared with the synchronous persistence scheme, the collaborative persistence scheme not only reduces the latency of interactive transaction processing but also improves the system's peak throughput by roughly 38%.
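The commit path described in the abstract can be illustrated with a small in-process sketch. This is only a reading of the abstract, not the authors' implementation: the names `Replica`, `Coordinator`, `Client` and `run_demo` are hypothetical, the "network" is a plain method call, and the abstract does not specify log formats, failure handling, concurrency control, or whether the client's persistence overlaps with sending the log.

```python
import json
import os
import tempfile
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical replica that applies write-log entries to an in-memory store."""
    name: str
    store: dict = field(default_factory=dict)

    def apply(self, log):
        for key, value in log:
            self.store[key] = value


@dataclass
class Coordinator:
    """Hypothetical coordinator that distributes a committed log to every replica."""
    replicas: list

    def distribute(self, log):
        for replica in self.replicas:
            replica.apply(log)


class Client:
    """Hypothetical client that buffers writes and persists them only at commit."""

    def __init__(self, coordinator, log_path):
        self.coordinator = coordinator
        self.log_path = log_path
        self.write_log = []

    def write(self, key, value):
        # Cache the write locally; no remote replica is contacted per write.
        self.write_log.append((key, value))

    def commit(self):
        # Step 1: persist the transaction's write log on the client side.
        with open(self.log_path, "a", encoding="utf-8") as f:
            f.write(json.dumps(self.write_log) + "\n")
            f.flush()
            os.fsync(f.fileno())
        # Step 2: send the log to the coordinator, which distributes it.
        self.coordinator.distribute(self.write_log)
        self.write_log = []


def run_demo():
    replicas = [Replica("primary"), Replica("backup-1"), Replica("backup-2")]
    coordinator = Coordinator(replicas)
    with tempfile.TemporaryDirectory() as tmp:
        client = Client(coordinator, os.path.join(tmp, "client.log"))
        client.write("x", 1)
        client.write("y", 2)
        # Snapshot replica state before commit: nothing has been replicated yet.
        before = [dict(r.store) for r in replicas]
        client.commit()
    return before, [r.store for r in replicas]
```

In this sketch the replicas see no data until `commit()` runs, which mirrors the idea of removing per-write remote round trips from an interactive transaction; a synchronous scheme would instead contact every replica inside `write()`.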

Distributed database; Concurrency control; Data persistence; Data consistency; High-contention workload

Zuo Shun (左顺), Li Yongkun (李永坤), Xu Yinlong (许胤龙)

School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, China

Anhui Province Key Laboratory of High Performance Computing, Hefei 230026, China

2025

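Computer Science (计算机科学)
Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)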

Peking University Core Journal (北大核心)
Impact factor: 0.944
ISSN:1002-137X
Year, volume (issue): 2025, 52(1)