Better understanding and prediction of antiviral peptides through primary and secondary structure feature importance


The emergence of viral epidemics throughout the world is of concern due to the scarcity of available effective antiviral therapeutics. The discovery of new antiviral therapies is imperative to address this challenge, and antiviral peptides (AVPs) represent a valuable resource for the development of novel therapies to combat viral infection. We present a new machine learning model to distinguish AVPs from non-AVPs using the most informative features derived from the physicochemical and structural properties of their amino acid sequences. To focus on those features that are most likely to contribute to antiviral performance, we filter potential features based on their importance for classification. These feature selection analyses suggest that secondary structure is the most important peptide sequence feature for predicting AVPs. Our Feature-Informed Reduced Machine Learning for Antiviral Peptide Prediction (FIRM-AVP) approach achieves a higher accuracy than either the model with all features or current state-of-the-art single classifiers. Understanding the features that are associated with AVP activity is a core need to identify and design new AVPs in novel systems. The FIRM-AVP code and standalone software package are available at with an accompanying web application at