ABSTRACT
Motivation Antimicrobial peptides (AMPs) have the potential to tackle multidrug-resistant pathogens in both clinical and non-clinical contexts. The recent growth in the availability of genomes and metagenomes provides an opportunity for in silico prediction of novel AMPs. However, due to the small size of these peptides, standard gene prospection methods cannot be applied in this domain and alternative approaches are necessary. In particular, standard gene prediction methods have low precision for short peptides, and functional classification by homology has low recall.
Results Here, we present a novel set of 22 peptide features. These were used to build classifiers that, on standard benchmarks, perform similarly to the state of the art in predicting both the antimicrobial and the hemolytic activity of peptides, but with enhanced precision. We use these classifiers to build MACREL (Meta(genomic) AMPs Classification and REtrievaL), an end-to-end tool that combines assembly, ORF prediction, and AMP classification to extract AMPs directly from genomes or metagenomes. Using realistic simulations and real data, we demonstrate that MACREL recovers high-quality AMP candidates from genomes and metagenomes.
Availability MACREL is implemented in Python. It is available as open source at https://github.com/BigDataBiology/macrel and through bioconda. Classification of peptides or prediction of AMPs in contigs can also be performed on the webserver: http://big-data-biology.org/software/macrel.
Supplementary information Supplementary data are available online.
Footnotes
Following feedback on the pipeline's name, we updated the preprint to this version, in which the pipeline formerly known as FACS has been renamed MACREL, an acronym for Meta(genomic) AMPs Classification and REtrievaL. This version also includes other minor changes in the Methods and Results sections.