Similarity-based codes sequentially assigned to ebolavirus genomes are informative of species membership, associated outbreaks, and transmission chains.

Background. Developing a universal standardized microbial typing and nomenclature system that provides phylogenetic and epidemiological information in real time has never been as urgent in public health as it is today. We previously proposed to use genome similarity as the basis for immediate and precise typing and naming of individual organisms or viruses. In this study, we tested the validity of the proposed system and applied it to the epidemiology of infectious diseases using Ebola virus disease (EVD) outbreaks as the example.

Methods. One hundred twenty-eight publicly available ebolavirus genomes were compared with each other, and average nucleotide identity (ANI) was calculated. The ANI was then used to assign unique codes, hereafter referred to as Life Identification Numbers (LINs), to every viral isolate, whereby each LIN consisted of a series of positions reflecting increasing genome similarity. Congruence of LINs with phylogenetic and epidemiological relationships was then determined.

Results. Assigned LINs correlate with phylogeny at the species and infraspecies level and can even identify some individual transmission chains during the 2014-2015 EVD epidemic in West Africa.

Conclusions. Life Identification Numbers can provide a fast, automated, standardized, and scalable approach to precisely identify and name viral isolates upon genome sequence submission, facilitating unambiguous communication during disease epidemics among clinicians, epidemiologists, and governments.
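The sequential, similarity-based assignment described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the ANI thresholds, the number of positions, and the names `THRESHOLDS`, `shared_positions`, and `assign_lin` are all assumptions made for illustration. The sketch assumes that each LIN position corresponds to one ANI threshold, that a new genome inherits the leading positions of its most similar already-coded genome for every threshold its ANI meets, and that the next position receives the first unused number. To keep codes unique, the last position is always incremented, which is a simplification of this sketch.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical ANI thresholds (percent), one per LIN position, increasing
# from left to right. These values are illustrative, not from the study.
THRESHOLDS: Tuple[float, ...] = (70.0, 80.0, 90.0, 95.0, 99.0, 99.9)

LIN = Tuple[int, ...]


def shared_positions(ani: float, thresholds: Sequence[float] = THRESHOLDS) -> int:
    """Count the leading positions whose ANI threshold is met."""
    count = 0
    for threshold in thresholds:
        if ani < threshold:
            break
        count += 1
    return count


def assign_lin(
    ani_to_existing: Dict[str, float],
    lins: Dict[str, LIN],
    thresholds: Sequence[float] = THRESHOLDS,
) -> LIN:
    """Return a LIN for a new genome given its ANI to already-coded genomes."""
    n = len(thresholds)
    if not lins:
        # The first genome in the database receives all zeros.
        return (0,) * n

    # The most similar already-coded genome determines the shared prefix.
    best = max(ani_to_existing, key=ani_to_existing.get)

    # Assumption of this sketch: cap at n - 1 so the last position always
    # changes and every genome ends up with a unique code.
    k = min(shared_positions(ani_to_existing[best], thresholds), n - 1)
    prefix = lins[best][:k]

    # Next unused number at position k among genomes sharing the prefix.
    used = {lin[k] for lin in lins.values() if lin[:k] == prefix}
    return prefix + (max(used) + 1,) + (0,) * (n - k - 1)


# Usage with made-up ANI values for three hypothetical genomes.
lins: Dict[str, LIN] = {}
lins["A"] = assign_lin({}, lins)
lins["B"] = assign_lin({"A": 96.0}, lins)
lins["C"] = assign_lin({"A": 85.0, "B": 84.0}, lins)
```

Under these assumed thresholds, genomes that share a longer LIN prefix are more similar to each other, which is the property that lets a code convey species membership at the left-hand positions and closer relationships, such as outbreak membership, further to the right.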



average nucleotide identity, classification, ebolavirus, epidemiology, phylogeny