Efficient File Random Access Method Based on Primer Index Matrix in DNA Storage Scenarios
DNA molecules offer high storage density and long-term stability, making them a promising medium for next-generation mass data storage, and they have received widespread attention in recent years. Currently, primers serve as unique identifiers for files, and random retrieval of files stored in a DNA pool is achieved through polymerase chain reaction (PCR) amplification. However, the allocation and mapping relationship between primers and files has not been thoroughly studied, and primers are still assigned to files at random. Random assignment reduces the efficiency of searching for the target primer sequence, and storing the primer-file mapping table introduces substantial data redundancy. To provide an efficient bridge between silicon-based computing devices and carbon-based storage systems, and to reduce the data redundancy caused by storing primer-file mappings, this paper proposes a random retrieval method for DNA storage based on a primer index matrix. The method constructs a primer index matrix by partitioning the stored file set according to different file attributes, and converts the primers in the primer library into an ordered primer library according to conversion rules. Finally, the mapping relationship between primers and files is optimized to achieve efficient, multi-dimensional random file retrieval. Experimental results show that, for file sets of different sizes, establishing the corresponding primer index matrix with the proposed algorithm reduces the time complexity of primer retrieval to a constant, and the growth of the extra storage space required to store the primer-file mapping is reduced from linear to logarithmic.
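The idea of replacing a stored primer-file table with a computed index can be illustrated with a minimal sketch. The abstract does not specify the conversion rules or the matrix layout, so everything below is an assumption made for illustration: the names `ordered_primer` and `PrimerIndexMatrix` are hypothetical, the "conversion rule" is a plain base-4 encoding over A, C, G, T, and the matrix is addressed in row-major order. Real primer design constraints (GC content, melting temperature, minimum distance between primers) are ignored.

```python
from itertools import product

BASES = "ACGT"


def ordered_primer(index: int, length: int) -> str:
    """Assumed conversion rule: write the index as a base-4 string over ACGT."""
    if not 0 <= index < 4 ** length:
        raise ValueError("index does not fit in a primer of this length")
    digits = []
    for _ in range(length):
        index, remainder = divmod(index, 4)
        digits.append(BASES[remainder])
    return "".join(reversed(digits))


class PrimerIndexMatrix:
    """Hypothetical index matrix: one axis per file attribute."""

    def __init__(self, attributes, primer_length=20):
        # attributes maps an attribute name to its ordered list of values.
        self.names = list(attributes)
        self.positions = {
            name: {value: i for i, value in enumerate(values)}
            for name, values in attributes.items()
        }
        self.dims = [len(attributes[name]) for name in self.names]
        self.primer_length = primer_length

    def primer_index(self, **coords) -> int:
        # Row-major offset of the cell; the cost depends on the number of
        # attributes, not on the number of stored files.
        index = 0
        for name, dim in zip(self.names, self.dims):
            index = index * dim + self.positions[name][coords[name]]
        return index

    def primer_for(self, **coords) -> str:
        return ordered_primer(self.primer_index(**coords), self.primer_length)

    def primers_matching(self, **fixed):
        # Multi-dimensional retrieval: fix some attributes, enumerate the rest.
        axes = [
            [fixed[name]] if name in fixed else list(self.positions[name])
            for name in self.names
        ]
        return [
            self.primer_for(**dict(zip(self.names, combo)))
            for combo in product(*axes)
        ]


matrix = PrimerIndexMatrix(
    {"type": ["image", "text", "video"], "year": ["2022", "2023"]},
    primer_length=4,
)
print(matrix.primer_for(type="text", year="2023"))  # AAAT
print(matrix.primers_matching(type="text"))         # ['AAAG', 'AAAT']
```

In this sketch only the attribute value lists are stored, and the primer for a file is computed from its coordinates instead of being looked up in a per-file table. Whether this matches the storage and time bounds reported in the paper depends on the actual conversion rules, which are not given here.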