Genomic and Chemical Diversity in Cannabis

Ryan C. Lynch; Daniela Vergara; Silas Tittes; Kristin White; C.J. Schwartz; Matthew J. Gibbs; Travis C. Ruthenburg; Kymron deCesare; Donald P. Land; Nolan C. Kane

doi:10.1101/034314

Abstract

Plants of the Cannabis genus are the only producers of phytocannabinoids, terpenoid compounds that strongly interact with evolutionarily ancient endocannabinoid receptors shared by most bilaterian taxa. For millennia, the plant has been cultivated for these compounds, but also for food, rope, paper, and clothing. Today, specialized varieties yielding high-quality textile fibers, nutritional seed oil or high cannabinoid content are cultivated across the globe. However, the genetic identities and histories of these diverse populations remain largely obscured. We analyzed the nuclear genomic diversity among 339 Cannabis varieties, and demonstrate the existence of at least three major groups of diversity. As well as being genetically distinct, each group produces unique cannabinoid and terpenoid content profiles. This combined analysis of population genomic and trait variation informs our understanding of the potential uses of different genetic variants for medicine and agriculture, providing valuable insights and tools for a rapidly emerging, valuable legal industry.

Plants of the genus Cannabis (Cannabaceae; hemp, drug-type) have been used for thousands of years for fiber, nutritional seed oil and medicinal or psychoactive effects. Archaeological evidence for hemp fiber textile production in China dates to at least 6,000 years ago¹, and possibly as early as 12,000 years ago², suggesting Cannabis was one of the first domesticated fiber plants. Archaeological evidence for medicinal or shamanistic use of Cannabis has been found at Indian, central Asian and Middle Eastern sites³, further illustrating the widespread extent of Cannabis utilization throughout human history. A central Asian site of domestication is often cited⁴, although genetic analyses suggest two independent domestication events may have occurred⁵.

Cannabis plants are usually annual wind-pollinated dioecious herbs, though individuals may live more than a year in subtropical climates⁶ and monoecious populations exist⁷. The taxonomic composition of the genus remains unresolved, with two species (C. indica and C. sativa) commonly cited⁸, although C. ruderalis is sometimes proposed as a third species that contains northern short-day or auto-flowering plants⁹. Monospecific treatment of the genus as Cannabis sativa L. is also common¹⁰ and various alternative nomenclature schemes (e.g. Cannabis sativa subsp. indica var. kafiristanica) are sometimes referenced (reviewed in⁴). Even though an extensive monograph on the genus has recently been published¹¹, limited genetic and experimental data leave the questions of taxonomy unresolved^12,13.

The geographical and ecological range of Cannabis is unusually broad, with cultivated populations growing outdoors on every continent except Antarctica in a wide range of environments from sub-arctic to temperate to tropical, and from sea level to over 3,000 meters elevation^14,15. Feral or wild populations are also found as far north as the edge of the Arctic Circle in Eurasia, but are most common in well drained soils of temperate continental ecosystems in Eurasia and North America, while tropical populations are absent or rare¹⁴. Perhaps unsurprisingly, given this diversity of habitats, the species contains extensive phytochemical diversity, particularly in cannabinoid and terpenoid profiles^5,16, and also shows extensive diversity of morphological and life-history characteristics, further fueling debate regarding the taxonomic status and origins of Cannabis domestication.

One distinctive feature of the Cannabis genus is the production of a tremendous diversity of compounds called cannabinoids, so named because they are not produced at high levels in any other plant species¹⁷. Cannabinoids are a group of at least 74 known C21 terpenophenolic compounds^18,19 responsible for many reported medicinal and psychoactive effects of Cannabis consumption²⁰. Some estimates for the total number of phytocannabinoids range to well over a hundred²¹, though this number includes breakdown products as well as compounds found at extremely low levels. The plants produce a non-psychoactive carboxylic acid form of these compounds, with heating required to convert cannabinoids into the psychoactive decarboxylated forms. Interestingly, these compounds have pronounced neurological effects on a wide range of vertebrate and invertebrate taxa, suggesting an ancient origin of the endocannabinoid receptors, perhaps as old as the last common ancestor of all extant bilaterians, over 500 MYA²². The plant compounds thus produced have the potential to affect a broad range of metazoans, though their ecological functions in nature are not well understood. Indeed, suggested roles for these compounds include many biotic and abiotic defenses, such as suppression of pathogens and herbivores, protection from UV radiation damage, and attraction of seed dispersers. These hypotheses about the selective benefits of cannabinoid production remain speculative, as none have been conclusively verified to date. We do know more, however, about the more recent evolution of the plants under human cultivation.

High delta-9-tetrahydrocannabinolic acid (THCA)²³ content has been selected for in many strains due to its potential to be converted to delta-9-tetrahydrocannabinol (THC), which has potent psychoactive²⁴, appetite-stimulating²⁵, analgesic²⁶ and antiemetic²⁷ effects. These effects are mediated through interactions with human endocannabinoid CB1 receptors found in the brain²⁸, and CB2 receptors, which are concentrated in peripheral tissues²⁹. Other THC receptor binding locations are hypothesized as well³⁰. After several decades of accelerated improvements in clandestine cultivation techniques and breeding, modern strains can now yield dried unpollinated pistillate inflorescence material that contains over 30% THCA by dry weight³¹. However, other cannabinoids may also be present in high concentrations. In particular, high cannabidiolic acid (CBDA) plants were historically used in some hashish preparations³² and are presently in high demand as an anti-seizure therapy³³. In contrast with THC, which acts as a partial agonist of the CB1 and CB2 receptors, CBD does not have as strong psychoactive properties, and instead has antagonist activity on agonists of the CB1 and CB2 receptors³⁴. Thus, the two most abundant cannabinoids produced in Cannabis have, to some degree, opposing neurological effects.

THCA and CBDA are alternative products of a shared precursor, CBGA³⁵. A single locus with co-dominant alleles was proposed to explain patterns of inheritance for THCA to CBDA ratios^7,36. However, more recent quantitative trait loci (QTL) mapping experiments³⁷, expression studies³⁸ and genomic analyses¹⁰ paint a more complex scenario with several linked paralogs responsible for the various THCA and CBDA phenotypes. Other cannabinoids such as cannabigerol (CBG)³⁹, cannabichromene (CBC)⁴⁰ and delta-9-tetrahydrocannabivarin (THCV)⁴¹ demonstrate pharmacological promise, and can also be produced at high levels by the plant^42–44. Additionally, Cannabis secondary metabolites such as terpenoids and flavonoids likely contribute to therapeutic or psychoactive effects²; for example, β-myrcene, humulene and linalool are proposed to produce the sedative effects associated with specific strains⁴⁵.

In this study, plants that produce low levels of total cannabinoids are referred to as hemp, while high cannabinoid producing varietals are described as drug-type strains. Legal definitions often use a maximum THCA threshold to delineate hemp from drug-types, thus some high CBDA producing strains are categorized as hemp. However, this definition ignores the broader traditional usage of hemp for fibers or seed oils and the historical presence of CBDA-producing alleles in some drug-type populations³². Additionally, hemp strains have a distinct set of growth characteristics, with fiber varieties reaching up to 6 meters in height during a growing season, exhibiting reduced flower set, increased internodal spacing and lower total cannabinoid concentration per unit mass compared to drug-type relatives. Despite the widespread prohibition of drug-type Cannabis cultivation from the 1930s to present, hemp cultivation and breeding continued in parts of Europe and China through this period, and experienced a brief comeback during World War II in the USA through the Hemp for Victory campaign. Studies to date have found hemp varieties are genetically distinct from drug-type strains¹⁰, though interestingly Hillig⁵ found broad leaflet southeastern Asian hemp landraces to be more closely related to Asian drug-type strains than to European hemp strains.

Cannabis has a diploid genome (2n = 20), and an XY/XX chromosomal sex-determining system. The genome size is estimated to be 818 Mb for female plants and 843 Mb for male plants⁴⁶. Currently a draft genome consisting of 60,029 scaffolds is available for the Purple Kush (PK) drug-type strain from the National Center for Biotechnology Information. Additional whole genome data is available from NCBI for the Finola and USO31 hemp strains. Various reduced representation genome, gene and RNA sequence data are also available from NCBI. Presently Cannabis is the only multi-billion dollar legal crop without a sequence-based genetic linkage or physical genome map. Indeed, the first genetic map for the species, using AFLP and microsatellite markers, was only recently published, providing for the first time, quantitative trait mapping of cannabinoid content and other traits³⁷.

Initial studies of Cannabis genetic diversity examined either many samples with few molecular markers⁵ or whole genome wide data for relatively few sample types¹⁰. Sawler et al.⁴⁷ recently published a survey of Cannabis genomic diversity, using a reduced genomic representation strategy to evaluate 81 marijuana (drug-type) and 43 hemp strains. The aim of the present study is to assess the genomic diversity and phylogenetic relationships among 339 total Cannabis plants that have distinct phenotypes, and that were described a priori by plant breeders as various landraces, indica, sativa, hemp and drug-types, as well as commercially available hemp and drug-types with unclear pedigrees. We have combined data from existing sources and generated new data to create the largest sample set of Cannabis genomic sequence data published to date. These data and analyses will continue to facilitate the development of modernized breeding and quality assurance tools, which are lacking in the nascent legal Cannabis industry.

Materials and Methods

Sample collection. DNA was obtained from numerous sources, including a variety of breeding and production facilities. The strain names, descriptions and putative origins used in this paper were recorded from the providers of the DNA and sequence data (Supplementary Table 1). For data not previously published, DNA extractions were performed using the Qiagen DNeasy Plant Mini Kit (Valencia, CA) according to the manufacturer’s protocol.

Whole genome shotgun (WGS) sequencing. 60 samples were sequenced using standard Illumina multiplexed library preparation protocols for two 2 × 125 HiSeq 2500 lanes and one 2 × 150 NextSeq 500 run. Sequencing efforts were targeted for approximately 4-6x coverage of the Cannabis genome per sample.
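As a rough illustration of the coverage target above, expected mean depth can be approximated as the total number of sequenced bases divided by the genome size. The sketch below assumes the 818 Mb female genome size estimate cited earlier; the function names and the example read counts are hypothetical and are not taken from the study.

```python
import math


def expected_coverage(read_pairs: int, read_length: int, genome_size_bp: int) -> float:
    """Approximate mean depth for paired-end data: sequenced bases / genome size."""
    return 2 * read_pairs * read_length / genome_size_bp


def read_pairs_needed(target_coverage: float, read_length: int, genome_size_bp: int) -> int:
    """Number of read pairs (rounded up) needed to reach a target mean depth."""
    bases_needed = target_coverage * genome_size_bp
    return math.ceil(bases_needed / (2 * read_length))


# Assumed genome size (female estimate, 818 Mb) and 2 x 125 bp reads
GENOME_SIZE_BP = 818_000_000
pairs = read_pairs_needed(5, 125, GENOME_SIZE_BP)
print(pairs, expected_coverage(pairs, 125, GENOME_SIZE_BP))
```

This ignores reads lost to trimming, unaligned reads and organellar DNA, so realized depth over the nuclear genome would be somewhat lower than the nominal value.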

Genotype-by-Sequencing (GBS). 182 samples were sequenced on two 1 × 100 HiSeq 2500 lanes, following a multiplexed library preparation protocol described previously⁴⁸.

Publicly available data. We obtained three WGS datasets available from NCBI¹⁰ and received seven additional WGS datasets from Medicinal Genomics Corporation (www.medicinalgenomics.com). GBS data for 143 samples from Sawler et al.⁴⁷ were also included in this study.

Sequence Processing, Alignment and SNP calling. Trimmomatic⁴⁹ was used to trim any remaining adaptor sequence from raw fastq reads and remove sequences with low quality regions or ambiguous base calls using the following settings:

ILLUMINACLIP:IlluminaAdapters:2:20:10 LEADING:20 TRAILING:20 SLIDINGWINDOW:5:15 MINLEN:100

Trimmed reads from the 67 total WGS samples were then aligned to the only publicly available draft genome of PK (JH226140-JH286168) using the Burrows-Wheeler Alignment tool (BWA mem)⁵⁰. Chloroplast and mitochondrial regions were excluded. We collated the individual alignments to produce a single variant call format table (.vcf) for all samples using samtools mpileup -uf | bcftools view -bvcg⁵¹. We filtered the vcf table to include only high quality informative SNP sites using vcftools⁵², bash and awk with the following vcf parameters: Q (> 200), GQ (> 10), AF1 (0.1 - 0.9), biallelic sites only and no ambiguous bases. Next, data filters were applied through plink⁵³ to require that individuals have a minimum of 50% informative sites and that each site has data for a minimum of 20% of samples. Finally, we used an estimate of expected coverage for the single copy portion of the genome based on the estimated genome size and number of reads aligned. This was adjusted empirically based on total coverage level (across all WGS samples) per SNP site (Supplementary Figure 1) and bounded by a 95% Poisson confidence interval (mean 362x coverage). Further removal of repetitive content was achieved by aligning the PK reference to itself with BLASTN and removing all sites that were within regions of ≥ 97% identity for alignments ≥ 500 bp. These processing, alignment and SNP calling procedures were then performed separately on the 182 GBS samples generated for this study and the 143 GBS samples previously published⁴⁷, resulting in three vcf tables and filtered SNP sets. GBS SNPs were additionally required to have a minimum of 5x coverage per sample. Due to limited overlap between the SNP sites produced by the two GBS libraries, most downstream analyses were performed separately for each GBS library along with its corresponding set of WGS SNPs. Code used for these analyses is available at https://github.com/KaneLab.
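The site-level filters described above can be sketched in a few lines. This is a minimal illustration and not the authors' pipeline, which used vcftools, bash and awk. It assumes a tab-delimited VCF record whose INFO column carries an AF1 field, treats the AF1 bounds as inclusive (an assumption), and omits the per-sample GQ and missing-data filters. The function name and example records are hypothetical.

```python
VALID_BASES = {"A", "C", "G", "T"}


def passes_site_filters(vcf_line: str, min_qual: float = 200.0,
                        af_min: float = 0.1, af_max: float = 0.9) -> bool:
    """Return True if a VCF record is a biallelic, unambiguous SNP passing Q and AF1 filters."""
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 8:
        return False
    ref, alt, qual, info = fields[3], fields[4], fields[5], fields[7]
    # Biallelic SNPs with no ambiguous bases: one REF base and exactly one ALT base
    if ref.upper() not in VALID_BASES or alt.upper() not in VALID_BASES:
        return False
    # Site quality must exceed the threshold (Q > 200 in the text)
    if qual == "." or float(qual) <= min_qual:
        return False
    # Parse key=value pairs from INFO and apply the allele frequency window
    info_map = dict(item.split("=", 1) for item in info.split(";") if "=" in item)
    if "AF1" not in info_map:
        return False
    return af_min <= float(info_map["AF1"]) <= af_max


# Hypothetical records for illustration only
kept = "JH226140\t100\t.\tA\tG\t250\t.\tDP=300;AF1=0.5"
dropped = "JH226140\t101\t.\tA\tG,T\t250\t.\tDP=300;AF1=0.5"
print(passes_site_filters(kept), passes_site_filters(dropped))
```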

SNP Analyses. To visualize genetic relationships, divergence, and ancestral hybridization among lineages, a phylogenetic neighbor network was inferred using simple p-distance calculations⁵⁴. Heterozygosity counts and Multidimensional Scaling (MDS) analyses were calculated with plink⁵³. Average within and between group genetic distances, and a 45 SNP alignment neighbor joining tree based on p-distances, were calculated with MEGA6⁵⁵. Population structure inferences were made through fastStructure⁵⁶ and FLOCK⁵⁷. Tests for reticulation within the trees and admixture between populations were performed in TreeMix⁵⁸. F_ST estimates were calculated with vcftools⁵².
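For reference, the p-distance between two aligned sequences is the proportion of compared sites at which they differ. The sketch below is a minimal illustration under simple assumptions: sequences are equal-length strings, and sites with a missing call in either sequence (coded here as N, - or ?) are skipped. How the cited software handles heterozygous or ambiguous calls is not reproduced here, and the function name is hypothetical.

```python
def p_distance(seq_a: str, seq_b: str, missing: str = "N-?") -> float:
    """Proportion of differing sites among sites called in both sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in missing or b in missing:
            continue  # skip sites with missing data in either sample
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no sites with data in both sequences")
    return differences / compared


# Two hypothetical SNP alignments differing at one of six sites
print(p_distance("ACGTAC", "ACGTTC"))
```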

Chemical Analyses of Genetic Groups. The cannabinoid and terpenoid information (chemotype) for a portion of the strains in the genome analysis was generated by Steep Hill Labs (http://steephill.com/). Only strains with matching data in the genomic analysis were analyzed, for a total of 112 individuals from 17 strains from the BLDT group, 278 individuals from 35 unique strains from the NLDT group, and 33 individuals from two strains of hemp, for a total of 423 individuals in this analysis (Supplementary Table 1). This chemotype analysis was performed using high performance liquid chromatography (HPLC) with Agilent (1260 Infinity, Santa Clara, CA) and Shimadzu (Prominence HPLC, Columbia, MD) equipment. Between 400 and 600 milligrams of each sample was extracted into methanol, diluted and analyzed by HPLC. A mobile phase consisting of 0.1% formic acid in water and 0.1% formic acid in methanol was used with a gradient starting at 72% methanol and ending at 99% methanol. Terpenoid standards were purchased from Sigma-Aldrich (St. Louis, MO). Cannabinoid standards were purchased from Cerilliant (Round Rock, TX), RESTEK (Bellefonte, PA) and Lipomed (Cambridge, MA). A C18 column from RESTEK (Raptor ARC-18, Bellefonte, PA) or Phenomenex (Kinetex C18, Torrance, CA) was used. Concentrations of cannabinoids without commercially available standards were estimated using published absorptivities⁵⁹. The chemotype data analyzed for this research include 13 cannabinoids and eight terpenoids. Each compound was quantified using a linear calibration curve. Analytes were measured as percent mass in sample and not corrected for moisture content.

We performed a one-way ANOVA for each cannabinoid and terpenoid separately, with the group (NLDT, BLDT, and hemp) as the predictor variable. We used Bonferroni corrections for multiple comparisons. We also implemented a Principal Component Analysis (PCA) with the prcomp function in base R, and the car package was used to visualize 95% confidence ellipses for each group (www.R-project.org). Individuals with missing data values for any cannabinoid or terpenoid were removed. After removing the individuals with missing values, we had a total of 351 individuals: 94 BLDT, 229 NLDT, and 28 hemp.
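The per-compound test described above can be illustrated with a minimal sketch. The analysis itself was run in R; the code below is an assumption-laden Python restatement. It computes the one-way ANOVA F statistic for a single compound across groups and a Bonferroni-adjusted significance threshold for 21 tests (13 cannabinoids plus eight terpenoids). The family-wise alpha of 0.05, the function names and the example values are assumptions, and the p-value (which requires the F distribution) is not computed here.

```python
from typing import Sequence


def one_way_anova_f(groups: Sequence[Sequence[float]]) -> float:
    """F statistic: between-group mean square divided by within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n_total <= k:
        raise ValueError("need at least two non-empty groups and more values than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Hypothetical percent-mass values for one compound in three groups
f_stat = one_way_anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
print(f_stat, bonferroni_alpha(0.05, 13 + 8))
```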

Results and Discussion

Sequencing and SNPs. Summary information and raw sequencing libraries are publicly available from the NCBI short read archive (accessions pending). Detailed information about all samples can be found in Supplementary Table 1 and examples of wide and narrow leaflet forms are shown in Figure 1. Of the 466,427,059 non-ambiguous base pairs in the PK reference, 77,810,563 bp were removed due to excess self-similarity (≥ 97% identity and ≥ 500 bp length, Supplementary Figure 1). After this filter, the total single copy portion of the PK reference within the combined coverage levels for all 67 WGS samples of 326x – 401x, a 95% Poisson confidence interval around a 362x mean, was 71,236,365 bp (Supplementary Figure 1). After quality (Q), genotype quality (GQ), allele frequency (AF), missing data, biallelic and ambiguous base filters, the following SNP counts remained: 491,341 WGS SNPs, 2,894 GBS SNPs (this study) and 4,105 GBS SNPs (Sawler et al.⁴⁷). Forty-five SNPs overlapped both GBS datasets and the WGS samples.
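The coverage bounds above follow from treating per-site depth as Poisson distributed around the mean. The sketch below shows one way such bounds could be computed, taking the central 95% of the distribution by cumulative probability. This definition of the interval is an assumption, so the result need not match the reported 326x – 401x exactly, and the function names are hypothetical.

```python
import math
from typing import Tuple


def poisson_central_interval(mean: float, level: float = 0.95) -> Tuple[int, int]:
    """Counts bounding the central `level` probability mass of a Poisson(mean)."""
    if mean <= 0 or not 0 < level < 1:
        raise ValueError("mean must be positive and level must lie in (0, 1)")
    tail = (1.0 - level) / 2.0
    cumulative = 0.0
    lower = None
    k = 0
    while True:
        # Poisson pmf evaluated in log space to avoid overflow at large counts
        cumulative += math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))
        if lower is None and cumulative >= tail:
            lower = k
        if cumulative >= 1.0 - tail:
            return lower, k
        k += 1


def site_within_bounds(depth: int, bounds: Tuple[int, int]) -> bool:
    """True if a site's combined depth falls inside the accepted coverage window."""
    return bounds[0] <= depth <= bounds[1]


bounds = poisson_central_interval(362.0)
print(bounds, site_within_bounds(362, bounds), site_within_bounds(800, bounds))
```

Sites with depth far above the window are likely to fall in collapsed repeats, which is why an upper bound is useful in addition to a minimum depth.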

Figure 1.

Example of broad leaflet type (a, R4) and narrow leaflet type (b, Super Lemon Haze) strains. Photograph credits: D. Vergara.

Phylogenetic Relationships. Bifurcating trees are commonly used to model mutation driven divergence and speciation events. Whole genome wide sequence datasets include information about recombination, hybridization, and gene loss or genesis events, some of which may be incongruent with one another⁵⁴. Phylogenetic networks can represent incompatible phylogenetic signals across large character matrices in a visually informative manner. Figure 2 contains 195 Cannabis samples including WGS and GBS data, and shows that all European hemp strains form a distinct clade, separated from drug-type strains by a consistent band of parallel branches. Broad leaflet drug-type strains clustered with purported Afghan Kush landrace samples (Supplementary Table 1 and Supplementary Figure 3), while narrow leaflet drug-type strains appear to contain several groups with only faint visible distinctions between them, perhaps influenced by the inclusion of hybrid strains in the analysis.

Figure 2.

Phylogenetic neighbor network of a 2,894 SNP alignment from the single-copy portion of the Cannabis genome. Clade names on the periphery were inferred via FLOCK (where K ≥ 3 was most likely). Colored branches indicate fastStructure population membership of ≥ 70% assignment (where K=2 was most likely). NLDT = Narrow Leaflet Drug-Type and BLDT = Broad Leaflet Drug-Type. The label "To SE Asian NLDT II" points to the Dr. Grinspoon and Somali Taxi Cab samples. The label "To Broad Leaflet Hemp" points to a Chinese hemp sample. A high-resolution version of this figure that includes each sample name is available from: http://figshare.com/articles/Cannabis_Tree/1585470/1

We found significantly more heterozygosity in drug-type strains than in hemp varieties (31% vs. 22%, p < 0.001, two-tailed Mann-Whitney U-test, Table 1). This likely reflects the widespread hybridization of strains in North America during the transition to indoor cultivation of drug-types starting in the 1970s⁶⁰, as well as the extensive reliance on clonal propagation for indoor commercial cultivation, which does not require trait-stable seed stock. Conversely, fiber and seed oil hemp are grown on multi-acre scales that have necessitated the stabilization of agronomically important traits in seed stocks, likely leading to reduced heterozygosity at some loci.

Table 1.

Summary of genetic distance, heterozygosity and F_ST information for major Cannabis groups. * = significantly different (p < 0.001, two-tailed Mann-Whitney U-test).

Population Structure. To determine the statistical likelihood of various population scenarios represented in our samples, we first applied FLOCK, an iterative reallocation clustering algorithm that does not require non-admixed individuals to make population assignments⁵⁷, to our data set of 195 GBS and WGS Cannabis samples. Using the K-partitioning method suggested by the authors⁵⁷, we determined that K ≥ 3, after testing K values of one to eight (Table 2 and peripheral population names in Figure 2). FLOCK was able to assign all samples to one of the three populations, although it does not calculate admixture proportions. Sample population assignments were largely consistent with the known history of these samples, and appear visually consistent with MDS analysis (Supplementary Figure 2). For example, all fiber and seed oil hemps were assigned to an exclusive population, with the exception of sample AC/DC, a high CBDA producing variety with likely hybrid hemp origins (Figure 2, Table 2).

Table 2.

Sample names and FLOCK assignment to three groups, represented with different cell colors. Green are BLDT, blue are NLDT and yellow are hemp.

Additionally, we applied the admixture-model-based Bayesian clustering method of fastStructure to the same 195 samples⁵⁶. The most likely population structure, K=2 (Figure 2, Supplementary Table 1), consistently separates BLDT strains from NLDT and hemp strains. Some hemp and NLDT strains were each assigned with near 100% population membership to the same population (Figure 2, light blue samples, Supplementary Table 1), despite the clear separation visualized in the tree and the statistically significant mean between-group genetic distance (Table 1). The separation of BLDT and NLDT strains into fastStructure populations was stable when hemp samples were excluded from the analysis (Supplementary Table 1). Sawler et al.⁴⁷ used fastStructure to delineate hemp from drug-types as the major division of Cannabis diversity, and found two drug-type sub-groups within their samples when hemp types were excluded from the analysis. Likewise, using a smaller dataset, Lynch⁶¹ found support for K=3, consisting of two separate drug-type populations and hemp types, using the original Structure implementation⁶² and the Evanno method to select the best value of K⁶³. However, we caution that despite many claims for the availability of “landrace genetics” (strains) from Cannabis producers, breeders and seed sellers, these may or may not represent non-admixed individuals⁶⁰, a situation that can be problematic for the Structure and fastStructure approaches⁶².

The GBS samples from Sawler et al.⁴⁷ appear to contain an additional divergent NLDT clade, with likely SE Asian origins (Supplementary Figures 3 and 4), that did not emerge from our main analyses. Due to very limited overlap between sequence fragments from the two GBS datasets, which results from using different restriction enzymes, we were required to re-analyze the Sawler data in combination with only our 67 WGS samples. A connection was made across the two GBS analyses to this SE Asian NLDT group through two WGS samples (Dr. Grinspoon and Somali Taxi Cab, Figure 2, Supplementary Figure 3) that were included in both sets of GBS analyses. Although only 45 SNPs overlapped between both types of GBS data and the WGS data, a phylogeny of this limited alignment also supports the existence of an additional distinct SE Asian NLDT clade (Supplementary Figure 4). Collectively, these analyses lend support to a lower bound of four Cannabis populations, although clearly more extensive sampling with consistent sequencing is required to fully assess standing biogeographic diversity.

Tests of Tree Models. To test hypotheses of tree-like evolution for the three genetic groups, we first applied the three-population test for admixture⁶⁴, and found no evidence for admixture in any of the pairwise comparisons (positive f statistic values). Next we constructed maximum likelihood trees based on the aggregate SNP frequencies for the three genetic groups and simulated a variety of ‘migration’ events (0-10), but no simulation produced non-zero migration graph edges (Figure 3). F_ST analysis shows little divergence among lineages for most loci, but a substantial number of highly divergent regions are unique to each clade (Figure 4). This reinforces the importance of using many high-quality, single-copy regions of the genome, rather than smaller numbers of loci that could lead to less resolution or even misleading results. Although lore⁶⁰, Figure 2 and Supplementary Figure 2 strongly suggest at least some individuals have hybrid origins, these tree models for the overall SNP frequencies of the population groups inferred by FLOCK (Table 2) imply each group contains strong genetic signals from ancestral biogeographic gene pools.
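The logic of the three-population test can be illustrated with a minimal sketch. For a target population C and two reference populations A and B, the statistic averages (c - a)(c - b) over SNPs, where a, b and c are allele frequencies; a clearly negative value indicates that C is admixed between A and B, while positive values are consistent with a tree. The code below uses this simple uncorrected form and omits the finite-sample bias correction and the standard errors that full implementations typically estimate. The function name and the frequencies are hypothetical.

```python
from typing import Sequence


def f3_statistic(target: Sequence[float], ref_a: Sequence[float],
                 ref_b: Sequence[float]) -> float:
    """Uncorrected f3(target; ref_a, ref_b): mean of (c - a) * (c - b) across SNPs."""
    if not (len(target) == len(ref_a) == len(ref_b)) or len(target) == 0:
        raise ValueError("allele frequency vectors must be non-empty and equal in length")
    products = [(c - a) * (c - b) for c, a, b in zip(target, ref_a, ref_b)]
    return sum(products) / len(products)


# Hypothetical frequencies: the target is intermediate between the two references
admixed = f3_statistic([0.5, 0.5, 0.5, 0.5], [0.0, 1.0, 0.0, 1.0], [1.0, 0.0, 1.0, 0.0])
# Hypothetical frequencies: the target has drifted away from two similar references
tree_like = f3_statistic([0.6, 0.4], [0.2, 0.8], [0.2, 0.8])
print(admixed, tree_like)
```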

Figure 3.

Maximum likelihood tree of three Cannabis populations. We found no evidence for extensive admixture or deviations from this tree model.

Figure 4.

Distribution of Weir-Cockerham F_ST estimates for each population comparison. Each population pair has some portion of segregated sites.

Additional Cannabis diversity likely remains to be sampled. Notably absent from all genome sequence datasets published to date are putative C. ruderalis⁶⁵ samples. These are short weedy plants, with free shattering inflorescences, found widely from northern Siberia, through central Asia and into Eastern Europe⁶⁶. Whether these populations represent ancestral, pre-domesticated wild Cannabis, more recent feral escapes or some combination of both remains unclear. Even though we were not able to sample putative C. ruderalis populations, Finola is an early maturing seed hemp strain from Finland with purported northern Russian landrace ancestry⁶⁰, and Low Ryder and Auto AK-47 are auto-flowering drug-type strains with possible C. ruderalis heritage included in our samples (Table 2). Our analyses found Finola fits within the hemp group while Low Ryder and Auto AK-47 are close relatives of each other within the NLDT group (Supplementary Figure 3). Further genomic analyses are required to determine the extent to which C. ruderalis populations are genetically distinct from hemp and drug-type groups, and whether they may in fact harbor an ancestral wild-type gene pool from which European hemp varieties were domesticated^5,16.

Broad leaflet Asian hemp is also underrepresented, although we included one putative Chinese hemp sample that occupies an area between the core hemp and BLDT populations (Table 2, Figure 2 and Supplementary Figure 2). Hillig’s⁵ analysis of alloenzymes concluded that Asian hemp strains were more similar to Asian drug-type strains than they were to narrow leaflet European hemp. Likewise, Gao et al.⁶⁷ found genetic dissimilarity between European hemp and Chinese hemp, using microsatellites, and showed at least several distinct groups of hemp occur across the vast geography of Asia. Overall, Asian and European hemp strains appear dissimilar genetically, possibly reflecting independent domestication events⁶⁶.

One major complication obscuring the understanding of Cannabis diversity and history is the lack of information about the native range or ranges of Cannabis. In addition to divergent breeding efforts and human-vectored transport of seeds, Cannabis tends to escape into feral populations wherever human cultivation occurs in temperate climates⁶⁸. This, coupled with wind pollination biology and no known reproductive barriers, makes the existence of pure wild native Cannabis populations unlikely. The weedy tendencies of Cannabis are exemplified by the mid-western USA populations of feral hemp that flourish despite the eradication efforts of the Drug Enforcement Administration, which have for decades totaled millions of plants removed per year. A comprehensive evaluation of Cannabis diversity, which includes feral and wild Eurasian populations, is required to ascertain if the levels of divergence and gene flow are consistent with one or more origins of domestication⁵. Even if these extant populations are highly admixed with modern varieties, their study promises to offer insight into Cannabis ecology and evolution, given how different the selective regime of the feral setting is compared to that of agricultural fields. Considering the similar debates regarding the timing and origins of Oryza domestication that remain as yet unresolved⁶⁹, Cannabis requires substantially more work to unravel its complicated relationship with humans.

‘Indica’ and ‘sativa’ are commonly used terms ascribed to plants that have certain characteristics, often related to leaflet morphology and the perceived effects of consuming the plant⁸. However, these names are rooted in taxonomic traditions dating to Linnaeus, who first classified the genus as monotypic (Cannabis sativa) based on hemp specimens from Virginia and Europe⁷⁰. Lamarck subsequently designated Cannabis indica to accommodate the shorter-stature, potent, narrow leaflet drug-type plants from the Indian subcontinent⁷¹. Although currently the term ‘indica’ is typically used to refer to BLDTs, this biotype from the Hindu-Kush mountains¹⁴ was not documented until a 1929 survey of Afghani agriculture by Vavilov⁷². This absence of historical documentation until the 20th century, a very narrow geographic range, and some evidence for a broader NLDT gene pool (Table 1, Supplementary Figures 3 and 4) suggest a separate and more recent origin of the BLDT clade. This origin could represent a domestication event of a wild or feral BLDT population, or perhaps hybridization events between NLDT and BLDT populations. Final resolution of Cannabis taxonomy will require complete assessment of standing global genetic diversity and experimental evaluation of reproductive compatibility across all major genetic groups⁷³, in conjunction with morphological circumscriptions. Given the current absence of evidence for reproductive barriers, and the overall limited genetic distances between hemp and drug-type strains analyzed in this study, we suggest continued monotypic treatment of plants in this genus as Cannabis sativa L. is warranted.

Cannabinoid and Terpenoid Diversity. THCA and CBDA are the most abundant cannabinoids produced by the majority of strains on the North American market today (Figure 5a), and both compounds show an impressive range of medicinal potential^33,74, although endocannabinoid-based therapy trials have a history of significant rates of study withdrawals and adverse effects⁷⁵. Historical breeding efforts have resulted in mostly high THCA plants that produce strong intoxicating effects when consumed, and that synthesize only very low levels of alternative cannabinoids (Figure 5b). High CBDA plants have become more available in North America only over the last several years, in response to demand. Interestingly, these high CBDA-producing plants form several clusters within both the NLDT and BLDT groups, as well as within the hemp group (Supplementary Table 1), but rarely reach quantities of total cannabinoid production equivalent to those found in high THCA plants (Figure 5a). The minor cannabinoids that are commonly assayed, CBGA, CBCA, THCVA and CBDVA, are also of interest, despite strains producing high levels of these compounds being largely unavailable for research currently⁷⁶. With at least 74 cannabinoids identified in Cannabis, modernized genetic and breeding techniques are required to diversify and optimize Cannabis varieties. Efforts should also be made to document and preserve feral, wild and heirloom populations that can serve as reservoirs of cultural and genetic diversity.

Aromatic terpenoids impart many of the characteristic fragrances to Cannabis, and possibly contribute to the effects of consumption². Terpenoids are synthesized in many plant species, and play a role in relieving various abiotic and biotic stresses through direct and indirect mechanisms². Our analysis of strains sharing common genetic groups shows that each group has a distinct terpenoid profile (Figure 5c and Supplementary Figure 5). We found NLDTs to contain significantly more β-myrcene and α-terpinolene than BLDTs, although interestingly the two hemp strains for which we analyzed chemical data had significantly more β-myrcene than either drug-type group (Figure 5c). Similarly, Hillig⁷⁷ found NLDTs to yield significantly more β-myrcene than Afghani BLDTs, yet European hemp and uncultivated accessions labeled as C. ruderalis contained the highest levels. Hillig also reported that Afghani BLDTs contained the highest levels of guaiol and eudesmol isomers, which we did not measure, although we found that BLDTs contained more linalool than NLDTs or hemp. Understanding the ecological functions and evolutionary origins of terpenoids and cannabinoids in Cannabis could improve therapeutic potential, and possibly reduce the need for pesticide application during cultivation.
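As an illustration of how a between-group difference in a single terpenoid might be assessed, the following is a minimal Python sketch of a two-sided permutation test on the difference in group means. This section does not state which statistical test produced the significance results above, so the permutation test is an assumption chosen for illustration, as are the function name `permutation_test` and its parameters. The input values are hypothetical and are not data from this study.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_perm=5000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns the observed absolute difference and an estimated p-value.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        # Randomly reassign group labels by shuffling the pooled values
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        # Small tolerance guards against floating-point ties
        if diff >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimated p-value above zero
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical percent-of-mass values, NOT measurements from the study
nldt_myrcene = [0.9, 1.1, 1.0, 1.2, 0.8]
bldt_myrcene = [0.2, 0.3, 0.1, 0.25, 0.15]
diff, p_value = permutation_test(nldt_myrcene, bldt_myrcene)
```

A permutation test makes no distributional assumption, which may suit small and unevenly sized groups such as the two hemp strains mentioned above, although with so few samples the smallest attainable p-value is limited by the number of distinct label assignments.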

Figure 5.

Average percentage of mass for dried and unpollinated female flowers of Cannabis genetic groups. (a) THCA and CBDA cannabinoids. (b) Minor cannabinoids. (c) Terpenoids. THCA = delta-9-tetrahydrocannabinolic acid. CBDA = cannabidiolic acid. CBD = cannabidiol. CBN = cannabinol. D9-THC = delta-9-tetrahydrocannabinol. D8-THC = delta-8-tetrahydrocannabinol. CBGA = cannabigerolic acid. CBG = cannabigerol. THCVA = tetrahydrocannabivarin carboxylic acid. THCV = tetrahydrocannabivarin. D4-THC = delta-4-tetrahydrocannabinol. CBC = cannabichromene. CBLA = cannabicyclolic acid.
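The per-group averages summarized in a figure of this kind can be computed along the following lines. This is a minimal Python sketch, not the authors' analysis code: the function name `group_mean_percent`, the tuple layout, and the sample values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def group_mean_percent(samples):
    """Average percent of dry mass per (genetic group, compound) pair.

    `samples` is an iterable of (group, compound, percent_mass) tuples.
    """
    buckets = defaultdict(list)
    for group, compound, percent_mass in samples:
        buckets[(group, compound)].append(percent_mass)
    return {key: mean(values) for key, values in buckets.items()}


# Hypothetical values for illustration only, NOT data from the study
samples = [
    ("NLDT", "THCA", 18.0),
    ("NLDT", "THCA", 22.0),
    ("BLDT", "THCA", 20.0),
    ("hemp", "CBDA", 3.0),
    ("hemp", "CBDA", 5.0),
]
averages = group_mean_percent(samples)
```

Keying the result on the (group, compound) pair keeps the three panels (major cannabinoids, minor cannabinoids, terpenoids) in one structure that can be filtered by compound when plotting.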

Conclusions. Cannabis genomics offers a window into the past, but also a road forward. Although historical and clandestine breeding efforts have clearly been successful in many regards²¹,³¹, Cannabis lags decades behind other major crop species in many other respects. Developing stable Cannabis lines capable of producing the full range of potentially therapeutic cannabinoids is important for the research and medical communities, which currently lack access to diverse high-quality material in the USA⁷⁸.

In this paper we extended the initial Cannabis genome study¹⁰ by re-mapping WGS and GBS sequence reads to the existing PK draft scaffolds, to understand diversity and evolutionary relationships among the major lineages. Although hybridization of cultivated varieties⁶⁰ and human transport of seeds across the globe were hypothesized to have obscured much of the ancestral genetic signal¹³, we found significant evidence for apparent ancestral signals in genomic data derived largely from modern cultivated varieties (Table 2, Figures 2 and 3). Reanalysis of previously published GBS data⁴⁷ provides additional limited evidence for a fourth group (Supplementary Figures 4 and 5). Interestingly, unique cannabinoid and terpenoid profiles were associated with three of the genetic groups, lending support to their validity, despite the limitations of our sampling scheme. Overall, we hope the publicly available data and analyses from this study will facilitate continued research on the history of this controversial plant and the development of the agricultural and therapeutic potential of Cannabis.

Acknowledgments:

We thank Ben Holmes of Centennial Seeds; Devin Liles, Carter Casad and Jan Cole of The Farm; Ashley Edwards of Ward, Colorado; Jake Salazar of MMJ America; Kevin McKernan of Medicinal Genomics; David Salama, Ashley and Matt Rheingold of Headquarters; Ezra Huscher; Nico Escondido and Bob Sievers for providing DNA samples. We thank Reggie Gaudino of Steep Hill for assistance with the chemical data. This project was supported by donations to the University of Colorado Foundation gift fund 13401977-Fin8 to NCK.

Author contributions: RCL, DV, KHW and NCK designed the project. CJS and MJG collected samples. KHW generated DNA sequencing libraries. RCL, NCK and SBT performed bioinformatics analyses. KdC, DPL and TCR generated chemical data. DV and SBT performed chemical data analyses. RCL, DV and NCK wrote the paper.

Competing financial interests: RCL is presently an employee of Medicinal Genomics Corporation. DV is the founder of the non-profit Agricultural Genomics Foundation. CJS and MJG are owners of Marigene Inc. TCR is an employee of SC Labs Inc. KdC and DPL are employees of Steep Hill Labs. NCK is a board member of the non-profit Agricultural Genomics Foundation.

References

1. Li, H. L. An archaeological and historical account of cannabis in China. Econ. Bot. 28, 437–448 (1973).
2. Russo, E. B. Taming THC: Potential cannabis synergy and phytocannabinoid-terpenoid entourage effects. Br. J. Pharmacol. 163, 1344–1364 (2011).
3. Russo, E. B. History of cannabis and its preparations in saga, science, and sobriquet. Chem. Biodivers. 4, 1614–1648 (2007).
4. Schultes, R. E., Klein, M. W., Plowman, T. & Lockwood, T. Cannabis: an example of taxonomic neglect. Harvard Univ. Bot. Museum Leafl. 23, 337–367 (1974).
5. Hillig, K. W. Genetic evidence for speciation in Cannabis (Cannabaceae). Genet. Resour. Crop Evol. 52, 161–180 (2005).
6. Cherniak, L. The Great Books of Cannabis vol. I, Book II. (1982).
7. de Meijer, E. P. M. et al. The Inheritance of Chemical Phenotype in Cannabis sativa L. Genetics 163, 335–346 (2003).
8. Habib, R., Finighan, R. & Davenport, S. Testing for Psychoactive Agents. (2013). at <http://liq.wa.gov/publications/Marijuana/BOTECreports/1c-Testing-for-Psychoactive-Agents-Final.pdf>
9. Small, E. & Cronquist, A. A practical and natural taxonomy for Cannabis. Taxon 25, 405–435 (1976).
10. van Bakel, H. et al. The draft genome and transcriptome of Cannabis sativa. Genome Biol. 12, R102 (2011).
11. Small, E. Evolution and Classification of Cannabis sativa (Marijuana, Hemp) in Relation to Human Utilization. Bot. Rev. 81, 189–294 (2015).
12. Clarke, R. C. & Merlin, M. D. Letter to the Editor: Small, Ernest. 2015. Evolution and Classification of Cannabis sativa (Marijuana, Hemp) in Relation to Human Utilization. Botanical Review 81(3): 189–294. Bot. Rev. 81, 295–305 (2015).
13. Small, E. Response to the Erroneous Critique of my Cannabis Monograph by R. C. Clarke and M.D. Merlin. Bot. Rev. 81, 306–316 (2015).
14. Clarke, R. C. & Merlin, M. D. in Cannabis Evolution and Ethnobotany 13–26 (2013).
15. Glanzman, A. Discover Himalaya’s Outlawed Marijuana Fields. Time (2015). at <http://time.com/3736616/discover-himalayas-illegal-marijuana-fields>
16. Hillig, K. W. & Mahlberg, P. G. A chemotaxonomic analysis of cannabinoid variation in Cannabis (Cannabaceae). Am. J. Bot. 91, 966–975 (2004).
17. Bauer, R., Salo-Ahen, K. & Bauer, O. CB Receptor Ligands from Plants. Curr. Top. Med. Chem. 8, 173–186 (2008).
18. Radwan, M. M. et al. Isolation and characterization of new cannabis constituents from a high potency variety. Planta Med. 74, 267–272 (2008).
19. ElSohly, M. A. & Slade, D. Chemical constituents of marijuana: The complex mixture of natural cannabinoids. Life Sci. 78, 539–548 (2005).
20. Poklis, J. L., Thompson, C. C., Long, K. A., Lichtman, A. H. & Poklis, A. Disposition of cannabichromene, cannabidiol, and Δ⁹-tetrahydrocannabinol and its metabolites in mouse brain following marijuana inhalation determined by high-performance liquid chromatography-tandem mass spectrometry. J. Anal. Toxicol. 34, 516–20 (2010).
21. Mehmedic, Z. et al. Potency trends of Δ9-THC and other cannabinoids in confiscated cannabis preparations from 1993 to 2008. J. Forensic Sci. 55, 1209–1217 (2010).
22. McPartland, J. M., Matias, I., Di Marzo, V. & Glass, M. Evolutionary origins of the endocannabinoid system. Gene 370, 64–74 (2006).
23. Mechoulam, R. & Gaoni, Y. Recent advances in the chemistry of hashish. Chemie Org. Naturstoffe 25, 175–213 (1967).
24. Volkow, N. D., Baler, R. D., Compton, W. M. & Weiss, S. R. B. Adverse Health Effects of Marijuana Use. N. Engl. J. Med. 370, 2219–2227 (2014).
25. Berry, E. M. & Mechoulam, R. Tetrahydrocannabinol and endocannabinoids in feeding and appetite. Pharmacol. Ther. 95, 185–190 (2002).
26. Zogopoulos, P., Vasileiou, I., Patsouris, E. & Theocharis, S. E. The role of endocannabinoids in pain modulation. Fundam. Clin. Pharmacol. 27, 64–80 (2013).
27. Tramèr, M. R. et al. Cannabinoids for control of chemotherapy induced nausea and vomiting: quantitative systematic review. BMJ 323, 16–21 (2001).
28. Di Marzo, V., Bifulco, M. & De Petrocellis, L. The endocannabinoid system and its therapeutic exploitation. Nat. Rev. Drug Discov. 3, 771–784 (2004).
29. Pacher, P. & Mechoulam, R. Is lipid signaling through cannabinoid 2 receptors part of a protective system? Prog. Lipid Res. 50, 193–211 (2011).
30. De Petrocellis, L. et al. Effects of cannabinoids and cannabinoid-enriched Cannabis extracts on TRP channels and endocannabinoid metabolic enzymes. Br. J. Pharmacol. 163, 1479–1494 (2011).
31. Swift, W., Wong, A., Li, K. M., Arnold, J. C. & McGregor, I. S. Analysis of Cannabis Seizures in NSW, Australia: Cannabis Potency and Cannabinoid Profile. PLoS One 8, 1–9 (2013).
32. Rustichelli, C., Ferioli, V., Vezzalini, F., Rossi, M. C. & Gamberini, G. Simultaneous separation and identification of hashish constituents by coupled liquid chromatography-mass spectrometry (HPLC-MS). Chromatographia 43, 129–134 (1996).
33. Devinsky, O. et al. Cannabidiol: Pharmacology and potential therapeutic role in epilepsy and other neuropsychiatric disorders. Epilepsia 55, 791–802 (2014).
34. Pertwee, R. G. The diverse CB1 and CB2 receptor pharmacology of three plant cannabinoids: delta9-tetrahydrocannabinol, cannabidiol and delta9-tetrahydrocannabivarin. Br. J. Pharmacol. 153, 199–215 (2008).
35. Fellermeier, M., Eisenreich, W., Bacher, A. & Zenk, M. H. Biosynthesis of cannabinoids. Eur. J. Biochem. 268, 1596–1604 (2001).
36. Staginnus, C., Zörntlein, S. & de Meijer, E. A PCR marker linked to a THCA synthase polymorphism is a reliable tool to discriminate potentially THC-rich plants of Cannabis sativa L. J. Forensic Sci. 59, 919–26 (2014).
37. Weiblen, G. D. et al. Gene duplication and divergence affecting drug content in Cannabis sativa. New Phytol. 208, 1241–1250 (2015).
38. Onofri, C., de Meijer, E. P. M. & Mandolino, G. Sequence heterogeneity of cannabidiolic- and tetrahydrocannabinolic acid-synthase in Cannabis sativa L. and its relationship with chemical phenotype. Phytochemistry 116, 57–68 (2015).
39. Borrelli, F. et al. Colon carcinogenesis is inhibited by the TRPM8 antagonist cannabigerol, a Cannabis-derived non-psychotropic cannabinoid. Carcinogenesis 35, 2787–2797 (2014).
40. Izzo, A. A. et al. Inhibitory effect of cannabichromene, a major non-psychotropic cannabinoid extracted from Cannabis sativa, on inflammation-induced hypermotility in mice. Br. J. Pharmacol. 166, 1444–1460 (2012).
41. McPartland, J. M., Duncan, M., Di Marzo, V. & Pertwee, R. G. Are cannabidiol and Δ9-tetrahydrocannabivarin negative modulators of the endocannabinoid system? A systematic review. 737–753 (2015). doi:10.1111/bph.12944
42. de Meijer, E. P. M., Hammond, K. M. & Sutton, A. The inheritance of chemical phenotype in Cannabis sativa L. (IV): cannabinoid-free plants. Euphytica 168, 95–112 (2009).
43. de Meijer, E. P. M., Hammond, K. M. & Micheler, M. The inheritance of chemical phenotype in Cannabis sativa L. (III): variation in cannabichromene proportion. Euphytica 165, 293–311 (2008).
44. de Meijer, E. P. M., Hammond, K. M. & Sutton, A. The inheritance of chemical phenotype in Cannabis sativa L. (IV): cannabinoid-free plants. Euphytica 168, 95–112 (2009).
45. Hazekamp, A. & Fischedick, J. T. Cannabis - from cultivar to chemovar. Drug Test. Anal. 660–667 (2012). doi:10.1002/dta.407
46. Sakamoto, K., Akiyama, Y., Fukui, K., Kamada, H. & Satoh, S. Characterization; Genome Sizes and Morphology of Sex Chromosomes in Hemp (Cannabis sativa L.). Cytologia (Tokyo). 63, 459–464 (1998).
47. Sawler, J. et al. The Genetic Structure of Marijuana and Hemp. PLoS One 1–9 (2015). doi:10.1371/journal.pone.0133292
48. Parchman, T. L., Gompert, Z., Mudge, J. & Schilkey, F. D. Genome-wide association genetics of an adaptive trait in lodgepole pine. Mol. Ecol. 21, 2991–3005 (2012).
49. Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
50. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
51. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
52. Danecek, P. et al. The variant call format and VCFtools. Bioinformatics 27, 2156–8 (2011).
53. Purcell, S. et al. PLINK: A Tool Set for Whole-Genome Association and Population-Based Linkage Analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
54. Huson, D. H. & Bryant, D. Application of phylogenetic networks in evolutionary studies. Mol. Biol. Evol. 23, 254–267 (2006).
55. Tamura, K., Stecher, G., Peterson, D., Filipski, A. & Kumar, S. MEGA6: Molecular Evolutionary Genetics Analysis Version 6.0. Mol. Biol. Evol. 30, 2725–2729 (2013).
56. Raj, A., Stephens, M. & Pritchard, J. K. fastSTRUCTURE: Variational Inference of Population Structure in Large SNP Data Sets. Genetics 197, 573–589 (2014).
57. Duchesne, P. & Turgeon, J. FLOCK Provides Reliable Solutions to the ‘Number of Populations’ Problem. J. Hered. 103, 734–743 (2012).
58. Pickrell, J. K. & Pritchard, J. K. Inference of Population Splits and Mixtures from Genome-Wide Allele Frequency Data. PLoS Genet. 8, e1002967 (2012).
59. Hazekamp, A., Peltenburg, A., Verpoorte, R. & Giroud, C. Chromatographic and Spectroscopic Data of Cannabinoids from Cannabis sativa L. J. Liq. Chromatogr. Relat. Technol. 28, 2361–2382 (2005).
60. Clarke, R. C. & Merlin, M. D. in Cannabis Evolution and Ethnobotany 295–309 (2013).
61. Lynch, R. C. Genomics of Adaptation and Diversification. (2015).
62. Pritchard, J. K., Stephens, M. & Donnelly, P. Inference of population structure using multilocus genotype data. Genetics 155, 945–959 (2000).
63. Evanno, G., Regnaut, S. & Goudet, J. Detecting the number of clusters of individuals using the software STRUCTURE: A simulation study. Mol. Ecol. 14, 2611–2620 (2005).
64. Reich, D., Thangaraj, K., Patterson, N., Price, A. L. & Singh, L. Reconstructing Indian population history. Nature 461, 489–494 (2009).
65. Janischevsky, D. E. Cannabis Ruderalis. Proc. Saratov 2, 14–15 (1924).
66. Clarke, R. C. & Merlin, M. D. in Cannabis Evolution and Ethnobotany 311–331 (2013).
67. Gao, C. et al. Diversity Analysis in Cannabis sativa Based on Large-Scale Development of Expressed Sequence Tag-Derived Simple Sequence Repeat Markers. 9, (2014).
68. Small, E., Pocock, T. & Cavers, P. The biology of Canadian weeds. 119. Cannabis sativa L. Can. J. Plant Sci. (2003). at <http://pubs.aic.ca/doi/abs/10.4141/P02–021>
69. Gross, B. L. & Zhao, Z. Archaeological and genetic insights into the origins of domesticated rice. Proc. Natl. Acad. Sci. U. S. A. 111, 6190–7 (2014).
70. Linnaeus, C. Species Plantarum. (1753).
71. Lamarck, J. B. Encyclopédie méthodique: botanique. (1783).
72. Vavilov, N. I. & Bukinich, D. D. Agricultural Afghanistan. Bull. Appl. Bot. Genet. Plant Breed. 33, (1929).
73. Rieseberg, L. H. & Willis, J. H. Plant speciation. Science 317, 910–914 (2007).
74. Di Marzo, V., Bifulco, M. & De Petrocellis, L. The endocannabinoid system and its therapeutic exploitation. Nat. Rev. Drug Discov. 3, 771–784 (2004).
75. Wade, D. T., Makela, P. M., House, H., Bateman, C. & Robson, P. Long-term use of a cannabis-based medicine in the treatment of spasticity and other symptoms in multiple sclerosis. Mult. Scler. 12, 639–645 (2006).
76. Abrams, D. I. et al. Vaporization as a smokeless cannabis delivery system: a pilot study. Clin. Pharmacol. Ther. 82, 572–578 (2007).
77. Hillig, K. W. A chemotaxonomic analysis of terpenoid variation in Cannabis. Biochem. Syst. Ecol. 32, 875–891 (2004).
78. Nutt, D. J., King, L. A. & Nichols, D. E. Treatment Innovation. 14, 577–585 (2013).

Posted December 13, 2015.

Subject Area

Plant Biology

