Identification of SO₄²⁻ and PO₄³⁻ Binding Sites by the U-RF Algorithm Integrating Single-Residue Information
SO₄²⁻ and PO₄³⁻ play crucial roles in binding with proteins, making accurate prediction of protein-anion binding residues essential. Previous research on anion binding sites has focused primarily on the fragment level and neglected the single-residue level, which can lead to information loss. In this study, features were therefore extracted at both the fragment and single-residue levels. At the fragment level, amino acid, secondary structure, relative solvent accessibility, and hydrophilic-hydrophobic features drawn from previous studies served as foundational features. At the single-residue level, propensity factors for amino acid, amino acid acid-base polarity, energy, and hydrophilic-hydrophobic properties were introduced and combined with neighboring residue pair information and 9 orthogonal factors as new features. An algorithm (U-RF) combining undersampling and random forest was applied, and promising prediction results were verified by five-fold cross-validation and independent testing.
binding residue; single-residue level; propensity factor; acid-base polarity; neighboring residue pair information
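The two data-preparation ideas named in the abstract, fragment-level windows around each residue and undersampling of the majority class before training, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names `sliding_fragments` and `undersample`, the window length, the padding character `X`, and the 1:1 class ratio are all assumptions made for illustration; the abstract does not specify them. The balanced set would then be passed to a random forest classifier, which is omitted here.

```python
import random
from typing import List, Sequence, Tuple


def sliding_fragments(sequence: str, window: int, pad: str = "X") -> List[str]:
    """Return one fragment of length `window` centered on each residue.

    The padding character and window length are illustrative assumptions.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so each fragment has a center residue")
    half = window // 2
    padded = pad * half + sequence + pad * half
    return [padded[i:i + window] for i in range(len(sequence))]


def undersample(
    samples: Sequence[str], labels: Sequence[int], seed: int = 0
) -> Tuple[List[str], List[int]]:
    """Randomly drop majority-class samples until both classes are equal in size.

    Labels are 1 (binding residue) or 0 (non-binding residue). The 1:1 ratio
    is an assumption, not a value taken from the paper.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    # Keep every minority sample and an equally sized random subset of the majority.
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [samples[i] for i in kept], [labels[i] for i in kept]


# Hypothetical toy input: a five-residue sequence with one binding residue.
fragments = sliding_fragments("MKRSG", window=3)
balanced_x, balanced_y = undersample(fragments, [0, 0, 1, 0, 0])
```

Fixing the random seed keeps the retained subset reproducible. In a cross-validation setting, undersampling would normally be applied to the training folds only, so that the held-out fold keeps its natural class imbalance.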