
How to read your StrainSEEK® Report

This article explains what your StrainSEEK® report contains and where to find each piece of information.


General Information –

Strain: Name of the strain provided by the customer

RSP ID: The number we assign to each strain as it is added to our repository

Grower: Name of the customer

Accession Date: Date the strain was submitted to Kannapedia

Gender: Male or Female

StrainSEEK® Version: We are constantly doing bioinformatics work to find new areas of interest in the genome. As more areas of the genome are correlated with beneficial phenotypes, Medicinal Genomics will continue to update the StrainSEEK® panels. We are currently on v2. For information on comparing v1 to v2 data, see this help page: https://help.medicinalgenomics.com/comparing-strainseekv2


Is my strain special?

Strain Rarity Graph (Violin Graph):

The strain rarity visualization shows how distant the strain is from the other cultivars in the Kannapedia database. The y-axis represents genetic distance, which increases as you go up. The width of the visualization at any position along the y-axis shows how many strains in the database sit at that genetic distance. A common strain will therefore have a bottom-heavy shape, like a pear, while uncommon and rare cultivars will have a visualization that is generally shifted towards the top.

What makes a strain rare is the frequency of the variants that we see in its genome. Right now, we look at around 600 different SNPs throughout the genome. How often the nucleotides at those positions differ from what we have seen before tells us how rare the strain is.
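As a rough illustration of the idea, the sketch below scores a strain against other strains by the fraction of SNP positions where the genotypes differ, which is one simple way to express genetic distance. The genotype values, the strain names, and the `snp_distance` helper are all hypothetical, and the actual distance metric used for the report is not described in this article.

```python
def snp_distance(a, b):
    """Fraction of compared SNP positions where two genotype lists differ.

    Positions where either genotype is missing (None) are skipped.
    This is an illustrative metric, not the one used by the report.
    """
    if len(a) != len(b):
        raise ValueError("genotype lists must cover the same SNP positions")
    compared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not compared:
        raise ValueError("no SNP positions with calls in both strains")
    return sum(x != y for x, y in compared) / len(compared)


# Hypothetical database: genotypes at the same four SNP positions.
database = {
    "Strain A": ["AA", "AG", "CC", "TT"],
    "Strain B": ["AA", "GG", "CT", "TT"],
    "Strain C": ["GG", "GG", "TT", "CC"],
}
query = ["AA", "AG", "CC", "TT"]

# Distance from the submitted strain to every strain in the database.
distances = {name: snp_distance(query, calls) for name, calls in database.items()}
for name, d in sorted(distances.items(), key=lambda item: item[1]):
    print(f"{name}: {d:.2f}")
```

A strain whose distances to the rest of the database are mostly large would show up as top-heavy in the violin graph, while a strain with many close matches would look bottom-heavy.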

Chemical Information-

We do not perform metabolite analysis in house, so we always ask customers to supply this information.


Genetic Information-

Percent Heterozygosity: The percentage of variants that differ between the parent and the daughter plant.

Download VCF file: VCF stands for Variant Call Format. A VCF file is a plain-text file that contains a header with all the metadata about the strain, followed by a list of the variant calls and their locations in the genome. VCF files can be opened in a text editor; TextEdit or Terminal on macOS work best. Along with the variant calls and their locations, the file also reports the sequencing depth of each variant. Depth indicates how many times that variant was seen. If only one read carries the change, it might not be real, but if 25 reads carry that SNP, we can be confident it is real.
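For readers who prefer a script to a text editor, here is a minimal sketch of reading variant calls and their depth from VCF text. It assumes the depth is stored as `DP` in the INFO column, which is common in VCF files but not confirmed for these reports. The example lines, the `read_variants` helper, and the depth cutoff of 10 are made up for illustration.

```python
def read_variants(lines):
    """Parse VCF data lines into dicts with position, alleles, and depth."""
    variants = []
    for line in lines:
        line = line.rstrip("\n")
        # Lines starting with '#' are header/metadata lines, not variant calls.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        chrom, pos, _id, ref, alt = fields[:5]
        # INFO is the 8th column: semicolon-separated KEY=VALUE pairs.
        info = dict(item.split("=", 1) for item in fields[7].split(";") if "=" in item)
        depth = int(info["DP"]) if "DP" in info else None  # assumption: DP holds depth
        variants.append(
            {"chrom": chrom, "pos": int(pos), "ref": ref, "alt": alt, "depth": depth}
        )
    return variants


# Hypothetical VCF content; a real file would be read with open(path).
example_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "contig1\t1042\t.\tA\tG\t50\tPASS\tDP=25",
    "contig1\t2210\t.\tC\tT\t12\tPASS\tDP=1",
]

variants = read_variants(example_vcf)

MIN_DEPTH = 10  # arbitrary illustrative cutoff, not a value from the report
well_supported = [v for v in variants if v["depth"] is not None and v["depth"] >= MIN_DEPTH]
for v in well_supported:
    print(v["chrom"], v["pos"], v["ref"], ">", v["alt"], "depth", v["depth"])
```

In this example the variant seen in 25 reads is kept, while the variant seen in a single read is filtered out as possibly not real.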

Download FastQ Files: FastQ files are the raw data reads from the sequencing run. These reads are aligned and compared to the reference genome.
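If you want to inspect the raw reads yourself, the sketch below parses FASTQ text under the assumption that each read occupies exactly four lines (an `@` header, the sequence, a `+` separator, and the quality string), which is the usual layout for Illumina output. The reads shown and the `parse_fastq` helper are hypothetical.

```python
def parse_fastq(lines):
    """Return (read_id, sequence, quality) tuples from four-line FASTQ records."""
    lines = [line.rstrip("\n") for line in lines]
    records = []
    for i in range(0, len(lines) - 3, 4):
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record starting at line {i + 1}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch at line {i + 1}")
        records.append((header[1:], seq, qual))
    return records


# Hypothetical reads; a real file would be read with open(path).
example_fastq = [
    "@read1",
    "ACGTACGTAC",
    "+",
    "IIIIIIIIII",
    "@read2",
    "GGCATT",
    "+",
    "IIIIII",
]

records = parse_fastq(example_fastq)
mean_length = sum(len(seq) for _, seq, _ in records) / len(records)
print(f"{len(records)} reads, mean length {mean_length:.1f}")
```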

Plant Type: Plant type classifies the cannabis plant depending on the type of cannabinoid and the concentration produced. It is tied into the Bt/Bd allele coverage. So far, we have identified four main types:

Type I: THCA-dominant plants, which produce mainly THCA.

Type II: THCA/CBDA hybrids, which can produce a 1:1 THCA:CBDA chemical profile.

Type III: CBDA-dominant plants, with very little THCA production.

Type IV: CBGA-dominant plants, which produce little to no THCA or CBDA.


Bt/Bd Allele Coverage: This chart represents the Illumina sequence coverage over the Bt/Bd allele. These are the three regions in the cannabis genome that impact THCA, CBDA, and CBGA production. Coverage over the active CBDAS gene is highly correlated with Type II and Type III plants, as described by Etienne de Meijer. Coverage over the THCA gene is highly correlated with Type I and Type II plants but is anti-correlated with Type III plants. Type I plants require coverage over the inactive CBDA loci and no coverage over the active CBDA gene. Plants lacking coverage over both the active CBDA and active THCA alleles are presumed to be Type IV plants (CBGA dominant).
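The coverage rules above can be summarized as a small decision table. The sketch below is a simplification: it treats coverage as a yes/no flag for the active THCA and active CBDA genes and ignores the inactive CBDA loci, whereas the real chart shows continuous coverage and the relationships are correlations rather than strict rules. The `classify_plant_type` function is hypothetical and is not the method used to produce the report.

```python
def classify_plant_type(active_thca_covered, active_cbda_covered):
    """Map coverage flags to a presumed plant type (illustrative only)."""
    if active_thca_covered and active_cbda_covered:
        return "Type II"   # both active genes covered: THCA/CBDA hybrid
    if active_thca_covered:
        return "Type I"    # active THCA only: THCA dominant
    if active_cbda_covered:
        return "Type III"  # active CBDA only: CBDA dominant
    return "Type IV"       # neither active gene covered: presumed CBGA dominant


# Print the full decision table.
for thca in (True, False):
    for cbda in (True, False):
        print(f"THCA covered={thca!s:5}  CBDA covered={cbda!s:5}  ->",
              classify_plant_type(thca, cbda))
```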


Heterozygosity: Heterozygosity is very important for breeders. It shows how different a plant is from its parents. Lower heterozygosity means more similarity between the parents and offspring, which is sometimes referred to as stability. Breeders will backcross lines with one another in order to decrease heterozygosity and increase the stability of that line.
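One common way to put a number on heterozygosity is to count how many genotype calls carry two different alleles. The sketch below does this for VCF-style genotype strings such as `0/1`. Whether the report uses exactly this calculation is an assumption, and the genotype list and the `percent_heterozygosity` helper are hypothetical.

```python
def percent_heterozygosity(genotypes):
    """Percentage of called diploid genotypes whose two alleles differ."""
    called = 0
    heterozygous = 0
    for gt in genotypes:
        # VCF genotypes separate alleles with '/' (unphased) or '|' (phased).
        alleles = gt.replace("|", "/").split("/")
        if len(alleles) != 2 or "." in alleles:
            continue  # skip missing or non-diploid calls
        called += 1
        if alleles[0] != alleles[1]:
            heterozygous += 1
    if called == 0:
        raise ValueError("no called genotypes")
    return 100.0 * heterozygous / called


# Hypothetical genotype calls: two heterozygous, three homozygous, one missing.
genotypes = ["0/1", "1/1", "0/0", "0/1", "./.", "1/1"]
pct = percent_heterozygosity(genotypes)
print(f"Percent heterozygosity: {pct:.1f}%")
```

Under this definition, a more stable line would show a lower percentage because more of its sites carry two identical alleles.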


Nearest Genetic Relatives Dot Plot: This dot plot shows the 20 cultivars in our database that are the closest genetic relatives to the submitted strain. This can be useful information for breeders who are interested in creating new and interesting cultivars. It can also help identify whether you have a clonal line of something else. Breeding with genetically distant cultivars will produce diverse phenotypes, which will likely have high heterozygosity or hybrid vigor. This report is prepared in a similar way to the rarity report.


Most Genetically Distant Strain Dot Plot: This dot plot shows the 10 cultivars in our database that are most distant from the submitted strain. This can be useful information for breeders who are interested in creating new and interesting cultivars. It can also help identify whether you have a clonal line of something else. Breeding with genetically distant cultivars will produce diverse phenotypes, which will likely have high heterozygosity or hybrid vigor. This report is prepared in a similar way to the rarity report.
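Both dot plots amount to ranking the database by genetic distance from the submitted strain and taking either end of the list. A minimal sketch, assuming a precomputed distance per cultivar is available: the cultivar names, the distance values, and the `rank_relatives` helper are hypothetical, and the example uses a list size of 2 instead of 20 and 10 to stay short.

```python
def rank_relatives(distances, n_nearest=20, n_distant=10):
    """Return (nearest, most_distant) lists of (cultivar, distance) pairs."""
    ranked = sorted(distances.items(), key=lambda item: item[1])
    nearest = ranked[:n_nearest]
    # Take the far end of the ranking and list the most distant cultivar first.
    most_distant = ranked[-n_distant:][::-1] if n_distant > 0 else []
    return nearest, most_distant


# Hypothetical genetic distances from the submitted strain.
distances_to_query = {
    "Cultivar A": 0.05,
    "Cultivar B": 0.40,
    "Cultivar C": 0.12,
    "Cultivar D": 0.33,
    "Cultivar E": 0.21,
}

nearest, most_distant = rank_relatives(distances_to_query, n_nearest=2, n_distant=2)
print("Nearest:", nearest)
print("Most distant:", most_distant)
```

A nearest relative with a distance close to zero is the kind of result that could point to a clonal line.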