Abstract
Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) caused the novel coronavirus-19 (nCoV-19) pandemic, resulting in millions of fatalities globally. Recent research has demonstrated that Protein-Protein Interactions (PPIs) between SARS-CoV-2 and human proteins are responsible for viral pathogenesis. However, many of these PPIs are poorly understood and unexplored, necessitating a more in-depth investigation to find latent yet critical interactions. This article examines host-viral PPIs through the lens of Machine Learning (ML) and validates their biological significance using web-based tools. ML classifiers are built on comprehensive datasets with five sequence-based features of human proteins, namely Amino Acid Composition, Pseudo Amino Acid Composition, Conjoint Triad, Dipeptide Composition, and Normalized Auto Correlation. A majority-voting ensemble method composed of the Random Forest Model (RFM), AdaBoost, and Bagging is proposed, and it delivers encouraging statistical performance compared with the other models employed in this work. The proposed ensemble model predicted a total of 111 possible SARS-CoV-2 human target proteins with a high likelihood factor (≥ 70%), which were validated using Gene Ontology (GO) and KEGG pathway enrichment analysis. Consequently, this research can aid in a deeper understanding of the molecular mechanisms underlying viral pathogenesis and provide clues for developing more efficient anti-COVID medications.
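To make two of the named sequence-based features and the voting rule concrete, the following is a minimal sketch and not the authors' implementation. It assumes the common definitions of Amino Acid Composition (the fraction of each of the 20 standard residues) and Dipeptide Composition (the fraction of each of the 400 ordered residue pairs), and a hard majority vote over class labels. The function names and the example sequence are hypothetical, and the three labels passed to the vote stand in for the outputs of trained Random Forest, AdaBoost, and Bagging classifiers.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids, in a fixed order that defines feature positions.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(seq):
    """Return 20 values: the fraction of each standard residue in seq."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return 400 values: the fraction of each ordered pair of adjacent residues."""
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    # Skip pairs that contain a non-standard symbol (assumption of this sketch).
    valid = [p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS]
    if not valid:
        raise ValueError("sequence contains no standard dipeptides")
    counts = Counter(valid)
    return [counts[a + b] / len(valid) for a, b in product(AMINO_ACIDS, repeat=2)]


def majority_vote(labels):
    """Hard voting: return the label predicted by most base classifiers."""
    return Counter(labels).most_common(1)[0][0]


# Hypothetical usage: features for a toy sequence, then a vote over three
# made-up base-classifier outputs (1 = predicted target, 0 = not a target).
features = amino_acid_composition("ACCA") + dipeptide_composition("ACCA")
decision = majority_vote([1, 0, 1])
```

With three binary base classifiers a tie cannot occur, which is one practical reason to use an odd number of voters in a hard-voting ensemble.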