This interview is with Kevin McKernan, who sequenced the cannabis genome.

Kevin McKernan, chief scientific officer of Medicinal Genomics, speaks on sequencing the cannabis genome and its implications for the cannabis industry and "personalized medicine".
Written transcript is below:
Project CBD: We’re speaking with Kevin McKernan, chief scientific officer of Medicinal Genomics, a Massachusetts-based company which a few years ago – I believe in 2011 – sequenced the cannabis genome, or the genome of a particular cannabis strain.
McKernan: Yes, that’s right.
Project CBD: This would seem to have some significant implications for the cannabis industry as a whole. We’d like to explore that with you. Maybe you could explain a little bit about what we mean by “sequencing the cannabis genome”?
McKernan: Sure. So, this was in 2011, and the tools we had to sequence back then were still evolving very, very quickly, but we were able to get a very rough draft version of this genome sequence. Now what this is, is reading every letter in the genome. And really, in any cannabis sample, as many know, there are two genomes: there’s the mother’s and the father’s genome. The plant is known to be diploid. So it’s got 20 chromosomes, with one copy of each from mother and father. There are chloroplast and mitochondrial genomes in there as well. But we want to read all of those letters so we can begin to build a map of all of the genetics that might predict cannabinoid expression, terpene expression, maybe even flavonoid expression. As it moves into hemp, maybe some of the genes that are governing seed size, oil, and fiber.
So to read the entire genome: it’s about 1 billion bases long, a billion letters of genetic code. In the process, with the technology we have, we probably only got about 400-500 megabases of really nicely aligned sequence. Now that seems like you’ve only gotten about half of the genome. That’s probably true. Plant genomes are very repetitive. They have copies of things that are identical scattered throughout them. And when you sequence them, it’s kind of like putting together a big jigsaw puzzle: those are like all the pieces that look the same, and sometimes you don’t know exactly where they go. And so they get – they’re in the sequence, but they get kind of left as ambiguous when you put all this data together. But what we do have is really nice scaffolds of some of the genes everyone is very attentive to. THC synthase is one that is of real interest, and we have really nice sequence coverage of that, of CBDA synthase, and of some of the other genes that are governing cannabinoids.
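The jigsaw-puzzle point can be illustrated with a toy example. The Python sketch below is not the pipeline Medicinal Genomics used: the sequences, the names `find_placements`, `TOY_GENOME`, and `REPEAT`, and the use of exact string matching are all assumptions made purely for illustration. It shows that a short read taken from inside a repeated stretch matches more than one location, while a read that overlaps unique sequence matches only one.

```python
def find_placements(reference: str, read: str) -> list[int]:
    """Return every start index where `read` matches `reference` exactly."""
    hits = []
    start = reference.find(read)
    while start != -1:
        hits.append(start)
        # Continue searching one base further so overlapping matches are found.
        start = reference.find(read, start + 1)
    return hits


# Hypothetical toy genome: the same 12-base repeat appears twice,
# separated and flanked by short unique stretches.
REPEAT = "GGGCCCAAATTT"
TOY_GENOME = "ACTGAC" + REPEAT + "TCAGTA" + REPEAT + "CGATCG"

unique_read = "ACTGACGG"  # overlaps the unique sequence left of the first copy
repeat_read = "GGCCCAAA"  # lies entirely inside the repeat

unique_hits = find_placements(TOY_GENOME, unique_read)
repeat_hits = find_placements(TOY_GENOME, repeat_read)

print("unique read placements:", unique_hits)
print("repeat read placements:", repeat_hits)
```

The read from inside the repeat fits equally well in two places, so an assembler working only from reads like it cannot tell which copy it came from. That is the sense in which such pieces are present in the data but left ambiguous.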
Project CBD: When you say “synthase,” you’re talking about kind of the precursor gene for what will become CBD, or the gene that encodes the enzyme that creates CBDA and THCA?
McKernan: Yeah, so the DNA codes for a protein, and that protein folds into a little enzyme that folds the precursor molecule into THC. And there’s another gene called CBDA synthase, and the “A” is for the acid form, because in the plant it makes THCA and CBDA first. The enzyme is taking a cannabigerol precursor, and if you look at cannabigerol, it’s like a ring structure with a long tail on it, and the tail kind of folds and wraps into two more cyclic groups to make THC, or one more cyclic group to make CBD. And there are two different genes responsible for folding that precursor two different ways, and they’re actually in competition for the precursor.
