TY - JOUR
T1 - Adapterama I: Universal Stubs and Primers for Thousands of Dual-Indexed Illumina Libraries (iTru & iNext)
JF - bioRxiv
DO - 10.1101/049114
SP - 049114
AU - Travis C. Glenn
AU - Roger A. Nilsen
AU - Troy J. Kieran
AU - John W. Finger, Jr.
AU - Todd W. Pierson
AU - Kerin E. Bentley
AU - Sandra L. Hoffberg
AU - Swarnali Louha
AU - Francisco J. García-De León
AU - Miguel Angel del Rio Portilla
AU - Kurt D. Reed
AU - Jennifer L. Anderson
AU - Jennifer K. Meece
AU - Samuel E. Aggery
AU - Romdhane Rekaya
AU - Magdy Alabady
AU - Myriam Bélanger
AU - Kevin Winker
AU - Brant C. Faircloth
Y1 - 2016/01/01
UR - http://biorxiv.org/content/early/2016/06/15/049114.abstract
N2 - Next-generation DNA sequencing (NGS) offers many benefits, but major factors limiting its use include the time and costs associated with: 1) start-up (i.e., doing NGS for the first time), 2) buy-in (i.e., getting any data from a run), and 3) sample preparation. Although many researchers have focused on reducing sample preparation costs, few have addressed the first two problems. Here, we present iTru and iNext, dual-indexing systems for Illumina libraries that help address all three of these issues. By breaking the library construction process into re-usable, combinatorial components, we achieve low start-up, buy-in, and per-sample costs, while simultaneously increasing the number of samples that can be combined within a single run. We accomplish this by extending the Illumina TruSeq dual-indexing approach from 20 (8+12) indexed adapters that produce 96 (8×12) unique combinations to 579 (192+387) indexed primers that produce 74,304 (192×387) unique combinations. We synthesized 208 of these indexed primers for validation, and 206 of them passed our validation criteria (99% success). We also used the indexed primers to create hundreds of libraries in a variety of scenarios.
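The combinatorial arithmetic quoted in the abstract can be checked with a minimal sketch. The function name `dual_index_counts` and the neutral labels "set A" and "set B" are illustrative assumptions; the abstract gives only the pool sizes, so no pool is assigned to a particular index position here.

```python
def dual_index_counts(n_set_a, n_set_b):
    """Return (oligos_needed, unique_pairs) for two pools of indexed oligos.

    Oligos needed grow additively; unique index pairs grow multiplicatively.
    """
    return n_set_a + n_set_b, n_set_a * n_set_b


# Pool sizes quoted in the abstract.
truseq_oligos, truseq_pairs = dual_index_counts(8, 12)
primer_oligos, primer_pairs = dual_index_counts(192, 387)

print(truseq_oligos, truseq_pairs)   # 20 96
print(primer_oligos, primer_pairs)   # 579 74304

# Validation rate for the synthesized subset (206 of 208 primers passed).
print(f"{206 / 208:.1%}")            # 99.0%
```

The sketch shows why the combinatorial design keeps start-up costs low: the number of oligonucleotides to purchase is a sum, while the number of distinguishable samples is a product.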
Our approach reduces start-up and per-sample costs by requiring only one universal adapter, which works with indexed PCR primers to uniquely identify samples. Our approach reduces buy-in costs because: 1) relatively few oligonucleotides are needed to produce a large number of indexed libraries; and 2) the large number of possible primers allows researchers to use unique primer sets for different projects, which facilitates pooling of samples during sequencing. Although the methods we present are highly customizable, resulting libraries can be used with the standard Illumina sequencing primers and demultiplexed with the standard Illumina software packages, thereby minimizing instrument and software customization headaches. In subsequent Adapterama papers, we use these same iTru primers with different adapter stubs to construct double- to quadruple-indexed amplicon libraries and double-digest restriction-site associated DNA (RAD) libraries. For additional details and updates, please see http://baddna.org.
adapters: oligonucleotides of known sequence that are ligated onto the ends of nucleic acids for the purpose of further manipulation or NGS library construction.
In this paper we will make use of double-stranded DNA adapter stubs (see below).
barcodes: see index or tag; this term is also used to mean a DNA sequence that can be used to identify the taxon from which a sample derives, thus we avoid using this ambiguous term.
cluster: a group of molecules on an Illumina flow cell that have been clonally amplified via bridge PCR or newer approaches (i.e., all molecules in a cluster are replicates of a single starting molecule from an Illumina library).
demultiplex: to separate pooled (multiplexed) sample information into their constituent parts (i.e., assign reads to specific samples).
identifying sequences: see index or tag.
index or tag: a short, unique sequence of DNA added to samples so they can be pooled and sequenced in parallel, with each resulting sequence containing information to identify the source sample. Some authors and companies refer to such sequences as barcodes or molecular identifiers (MIDs). We use “Illumina index” when referring to specific sequences designed by Illumina, “tag” when specifically referring to sequences from Faircloth and Glenn (2012), and “index” when generically referring to identifying sequences in adapters and primers compatible with Illumina instruments.
Index Read 1: the DNA sequence obtained from the 2nd Illumina sequencing reaction, yielding the i7 index sequence, which is placed into the header of Read 1 and Read 2 (if present).
Index Read 2: the DNA sequence obtained from the 3rd Illumina sequencing reaction, yielding the i5 index sequence, which is placed into the header of Read 1 and Read 2 (if present).
i5 index: the second indexing position introduced by Illumina, obtained by Index Read 2, which is the 3rd read of a cluster made by Illumina instruments.
i7 index: the original indexing position used in Illumina sequencing, obtained by Index Read 1, which is the 2nd read of a cluster made by Illumina instruments.
iNext: dual-index library preparation methods presented herein that are compatible with Illumina 
Nextera libraries.
iTru: dual-index library preparation methods presented herein that are compatible with Illumina TruSeq libraries.
library: a population of molecules with adapters on each end of each molecule to facilitate sequencing.
MID: molecular identifier, term commonly used with 454 sequencing; see index or tag.
multiplex: samples that are pooled together and processed or sequenced all at once.
P5: an engineered DNA sequence that is: 1) incorporated into adapters of Illumina libraries for bulk amplification of library molecules and 2) manufactured as oligonucleotides grafted onto the surface of Illumina flow cells and used for clonal amplification of library molecules, and priming the 3rd sequencing reaction on MiSeq and HiSeq ≤2500 instruments.
P7: an engineered DNA sequence that is: 1) incorporated into adapters of Illumina libraries for bulk amplification of library molecules and 2) manufactured as oligonucleotides grafted onto the surface of Illumina flow cells and used for clonal amplification of library molecules.
paired-end reads: DNA sequences obtained from sequencing each strand of DNA templates within clusters (see Fig. 4).
primers: single-stranded oligonucleotides used to initiate strand elongation for sequencing or amplification.
Read 1: the DNA sequence obtained from the 1st Illumina sequencing reaction, obtained as a fastq file with headers that contain data from indexing reads 1 and 2.
Read 2: the DNA sequence obtained from the 4th Illumina sequencing reaction, obtained as a fastq file with headers that contain data from indexing reads 1 and 2.
sequence diversity: the base composition of nucleotides across all clusters being sequenced at any given base position. Illumina sequencing requires sequence diversity for successful determination of a base call.
stubs: short universal adapters that are formed by annealing two oligonucleotides together (Illumina Read1 and Read2 sequences) and attaching that double-stranded product to template DNA via ligation.
In the iTru strategy, y-yoke adapter stubs are composed of oligonucleotides with the Read1 and Read2 sequences.
y-yoke: an adapter that is formed from two oligonucleotides that are complementary at only one end, forming a product that is double-stranded at one end but single-stranded at the other end.
ER -
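The glossary's definition of demultiplexing (assigning reads to samples from the i7 and i5 index sequences carried in the read headers) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration: the sample sheet, the index sequences, the read names, and the header layout, which is assumed to end in `:<i7>+<i5>`. It is not the standard Illumina software mentioned in the abstract, and it ignores index mismatches and instrument-specific i5 orientation.

```python
from collections import defaultdict

# Hypothetical sample sheet: (i7 index, i5 index) -> sample name.
SAMPLE_SHEET = {
    ("ATCACGTT", "CGATGTAA"): "sample_A",
    ("ATCACGTT", "TTAGGCAT"): "sample_B",
}

# Hypothetical FASTQ content: three 4-line records.
EXAMPLE_FASTQ = [
    "@M00001:1:FC01:1:1101:15000:1350 1:N:0:ATCACGTT+CGATGTAA",
    "ACGTACGT", "+", "IIIIIIII",
    "@M00001:1:FC01:1:1101:15001:1351 1:N:0:ATCACGTT+TTAGGCAT",
    "TTGCAAGC", "+", "IIIIIIII",
    "@M00001:1:FC01:1:1101:15002:1352 1:N:0:GGGGGGGG+CCCCCCCC",
    "GATTACAG", "+", "IIIIIIII",
]


def index_pair(header):
    """Extract (i7, i5) from a header assumed to end in ':<i7>+<i5>'."""
    index_field = header.strip().rsplit(":", 1)[-1]
    i7, _, i5 = index_field.partition("+")
    return i7, i5


def demultiplex(fastq_lines, sample_sheet):
    """Group 4-line FASTQ records by sample using exact index-pair matches.

    Records whose index pair is not in the sample sheet go to 'undetermined'.
    """
    bins = defaultdict(list)
    for start in range(0, len(fastq_lines), 4):
        record = fastq_lines[start:start + 4]
        sample = sample_sheet.get(index_pair(record[0]), "undetermined")
        bins[sample].append(record)
    return dict(bins)


bins = demultiplex(EXAMPLE_FASTQ, SAMPLE_SHEET)
for sample, records in sorted(bins.items()):
    print(sample, len(records))
```

Because both samples in the hypothetical sheet share one i7 index and differ only in the i5 index, the sketch also shows why the pair, rather than either index alone, identifies a sample in a dual-indexed run.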