The success of the Human Genome Project has significantly deepened our understanding of genomics and catalyzed a growing focus on proteomics,as researchers aim to decipher the complex relationship between genes and proteins.Given the central role of proteins in regulating physiological processes—including DNA replication,metabolic reactions,signal transduction,pH balance,and cellular structure—developing advanced protein sequencing technologies is critical.Proteins are fundamental to nearly all biological activities,making their detailed study essential for understanding cellular functions and disease mechanisms.The Edman degradation method,developed in the 1950s,was a breakthrough in sequencing short peptides.However,its limitations in read length(fewer than 50 amino acids)and slow cycle time fall short of modern demands.Mass spectrometry has since emerged as the gold standard in protein sequencing due to its high accuracy,throughput,and reproducibility.The method is enhanced by a robust sample preparation workflow and advances in mass spectrometry technology.Despite these strengths,mass spectrometry faces limitations in dynamic range,sensitivity,read length,and sequence coverage,hindering complete de novo protein sequencing.These technological gaps underscore the need for innovative methods to provide more detailed and accurate protein sequence data.In the past decade,new protein sequencing methods,including tunneling current,fluorescence fingerprinting,and real-time dynamic fluorescence,have shown significant developmental potential.However,these methods are not yet ready for widespread application,as each still faces technical hurdles.Meanwhile,advances in nanopore DNA sequencing have sparked interest in applying nanopore technology to protein sequencing,particularly owing to its speed,convenience,and cost-effectiveness.Unlike DNA sequencing,protein sequencing presents greater challenges due to proteins'complex three-dimensional structures,heterogeneous electrical charges,difficulties in directional movement,and diverse amino acid compositions,further complicated by post-translational modifications.Researchers have made significant strides in addressing these challenges,such as using unfolding enzymes,high temperatures,high voltage,and deformers to unravel protein structures,and employing charged sequences and electroosmotic flow to control peptide translocation.The latest strategies for nanopore protein sequencing can be broadly categorized into three approaches:strand sequencing,enzyme-assisted nanopore sequencing,and nanopore fingerprinting.In strand sequencing,dragging a protein-oligonucleotide conjugate through a nanopore with the aid of protein motors generates stepped current signals produced by the peptide strand.In enzyme-assisted nanopore sequencing,20 proteinogenic amino acids and various post-translational modifications have been distinguished using nanopores,and sequencing of short peptides has also been demonstrated.In nanopore fingerprinting,polypeptide fragments resulting from protease digestion of a protein can be identified through nanopore sensing.Despite these advances,further improvements in protein engineering,data processing,identification accuracy,and read length are needed to make these strategies practically useful.This review provides an overview of the current major approaches to nanopore protein sequencing,emphasizing the strategies,recent advances,breakthroughs and challenges in nanopore protein sequencing.As nanopore technology continues to evolve,it is expected to offer more efficient and accurate sequencing solutions in proteomics,potentially leading to new technological breakthroughs in biochemistry and biomedicine.
protein sequencingnanoporestochastic sensingsingle molecule analysis