TY  - JOUR
T1  - Assessment of Antibody Library Diversity through Next Generation Sequencing and Technical Error Compensation
JF  - bioRxiv
DO  - 10.1101/085498
SP  - 085498
AU  - Marco Fantini
AU  - Luca Pandolfini
AU  - Simonetta Lisi
AU  - Michele Chirichella
AU  - Ivan Arisi
AU  - Marco Terrigno
AU  - Martina Goracci
AU  - Federico Cremisi
AU  - Antonino Cattaneo
Y1  - 2016/01/01
UR  - http://biorxiv.org/content/early/2016/11/03/085498.abstract
N2  - Antibody libraries are important resources for deriving antibodies for a wide range of applications, from structural and functional studies to intracellular protein interference studies to the development of new diagnostics and therapeutics. Whatever the goal, the key parameter for an antibody library is its diversity, i.e. the number of distinct elements in the collection, which directly reflects the probability of finding in the library an antibody of sufficiently high affinity against a given antigen. Quantitative evaluation of antibody library diversity and quality has long been inadequately addressed, owing to the high similarity and length of the library sequences. Diversity was usually inferred from the transformation efficiency and tested by fingerprinting and/or sequencing a few hundred random library elements. Inferring diversity from such a small sample is, however, rudimentary and gives limited information about the real complexity, because complexity does not scale linearly with sample size. Next-generation sequencing (NGS) has opened new ways to assess antibody library diversity and quality. However, much remains to be done to fully exploit the potential of NGS for the quantitative analysis of antibody repertoires and to overcome current limitations.
To obtain a more reliable estimate of antibody library complexity, we present a new, PCR-free NGS approach for sequencing antibody libraries on the Illumina platform, coupled with a new bioinformatic analysis and software tool (Diversity Estimator of Antibody Library, DEAL) that reliably estimates diversity while taking sequencing error into account.
ER  - 