Screening of pathogenic factors of Streptococcus pneumoniae in children based on whole-genome sequencing technology
OBJECTIVE To characterize the molecular features of Streptococcus pneumoniae infection and colonization strains by whole-genome sequencing, so as to identify pathogenesis-related markers.
METHODS Common clinical specimens (blood, cerebrospinal fluid, pleural fluid, sputum, paranasal sinus aspirate and otic secretions) were collected from children treated in Liuzhou Maternity and Children Healthcare Hospital from Dec 2015 to Jan 2021; these children formed the infection group. Seven kindergartens in Foshan were selected by random cluster sampling from Oct 2020 to Dec 2020, and nasopharyngeal swab specimens were collected from healthy children in these kindergartens, who formed the colonization group. Infection strains were the S. pneumoniae strains that caused clinical infection in the children; colonization strains were the S. pneumoniae strains colonizing the nasopharynx of healthy children in the community. The drug resistance genes and virulence genes of the S. pneumoniae strains were detected by whole-genome sequencing. Pathogenesis-related markers were screened by univariate analysis (chi-square test) and a machine learning method (random forest).
RESULTS The carriage rates of the virulence genes pce, pitA, pitB, sipA, rrgA, rrgB, rrgC, srtG1, srtG2, srtC1, srtC2, srtC3, nanA, nanB, cps4A, cps4B, cps4D and zmpC were higher in the infection strains than in the colonization strains (P<0.05). The carriage rates of cbipG, pfbA, lytA and lytB were lower in the infection strains than in the colonization strains (P<0.05). The carriage rates of the drug resistance genes ermC, mefA and msrD were higher in the infection strains than in the colonization strains (P<0.05), whereas the carriage rate of the cat(pC194) gene was lower in the infection strains (P<0.05). In total, six pathogenicity-related molecular features (the virulence genes pce, rrgA, lytA, lytB and zmpC, and the drug resistance gene msrD) were screened out by the random forest model; the cross-validation accuracy was 70.11%, and the area under the curve (AUC) was 0.73.
CONCLUSION The random forest model may effectively screen out pathogenesis-related markers of S. pneumoniae strains and may provide genetic evidence for tracing highly pathogenic sources of infection and for carrying out precisely targeted interventions.
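The univariate screening step described in METHODS can be illustrated with a minimal sketch. It assumes a Pearson chi-square test without continuity correction on a 2x2 carriage table (gene present or absent, by infection or colonization group); the abstract does not state which variant was used. The function names, the gene labels `gene_x` and `gene_y`, and all counts are hypothetical and are not data from the study.

```python
import math


def chi_square_2x2(carriers_a, total_a, carriers_b, total_b):
    """Pearson chi-square test (no continuity correction) on a 2x2 table.

    Rows are the two groups, columns are gene present / gene absent.
    Returns (statistic, p_value) for 1 degree of freedom.
    """
    observed = [
        [carriers_a, total_a - carriers_a],
        [carriers_b, total_b - carriers_b],
    ]
    row_sums = [sum(row) for row in observed]
    col_sums = [sum(col) for col in zip(*observed)]
    n = sum(row_sums)
    # A gene carried by all or by no strains cannot separate the groups.
    if 0 in row_sums or 0 in col_sums:
        return 0.0, 1.0
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def screen_genes(counts, alpha=0.05):
    """Return genes whose carriage rate differs between groups at level alpha.

    counts maps gene -> (infection carriers, infection total,
                         colonization carriers, colonization total).
    """
    selected = {}
    for gene, (inf_k, inf_n, col_k, col_n) in counts.items():
        stat, p = chi_square_2x2(inf_k, inf_n, col_k, col_n)
        if p < alpha:
            direction = "higher" if inf_k / inf_n > col_k / col_n else "lower"
            selected[gene] = (stat, p, direction)
    return selected


# Hypothetical toy counts, for illustration only.
toy_counts = {
    "gene_x": (40, 50, 20, 50),
    "gene_y": (25, 50, 24, 50),
}
screened = screen_genes(toy_counts)
for gene, (stat, p, direction) in screened.items():
    print(f"{gene}: chi2={stat:.2f}, p={p:.2g}, {direction} in infection strains")
```

In a workflow like the one described, genes passing such a univariate filter would be candidate inputs to the random forest model; that step is not sketched here.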
Streptococcus pneumoniae; Child in the hospital; Child in the kindergarten; Molecular characteristic; Infection; Colonization