Sep 29, 2017
Jane Ferguson: Hi, everyone. Welcome to Episode 08 of Getting Personal: -Omics of the Heart. I'm Jane Ferguson, and this podcast is brought to you by Circulation: Cardiovascular Genetics and the AHA Functional Genomics and Translational Biology Council. This is the September 2017 episode, and this month we delve into some of the newest research coming out in the October 2017 issue of CircGenetics. If you go on to the CircGenetics website at circgenetics.ahajournals.org, you can see the table of contents for the latest issue, and see sneak previews of upcoming papers that are published online in advance of the next issue. You can also find more in-depth materials for each paper, like editorials and other resources, so it's a really nice way to keep up with the newest cardiovascular genomics research.
One particularly interesting paper included in the October 2017 issue is entitled "Diminished PRRX1 Expression Is Associated With Increased Risk of Atrial Fibrillation and Shortening of the Cardiac Action Potential," from Elena Dolmatova, Nathan Tucker, Patrick Ellinor, and colleagues. This is a really nice paper which highlights some beautiful approaches used to go from a GWAS hit to functional understanding. This type of research is challenging but really crucial as we move on from the GWAS discovery era, and I recommend you go online to read the whole paper. I talked to the first authors, Elena and Nathan, to find out more about their work.
So I'm here with Doctor Nathan Tucker and Doctor Elena Dolmatova, who are the first authors on a recently published paper. So, welcome and thank you for joining us.
Nathan Tucker: Thank you.
Jane Ferguson: So, for the benefit of our listeners, could you tell us a little bit about yourselves?
Nathan Tucker: Sure, so my name is Nathan Tucker, PhD, researcher, instructor of medicine at Mass General Hospital and the Broad Institute in Boston.
Elena Dolmatova: And my name is Elena Dolmatova. As you could probably tell, I'm Russian by origin. Currently I'm an internal medicine resident at Rutgers University and I'm in the process of applying for a cardio research fellowship.
Jane Ferguson: And so the two of you co-led a really interesting publication that came out this month, so congratulations on that.
Nathan Tucker: Thank you.
Jane Ferguson: So, some of our listeners may not have had the time yet to read your paper, so I was hoping you could give us just a brief summary of what this publication was about.
Nathan Tucker: Sure, I'd be happy to start. So the focus of this paper, and a lot of the other work that goes on in our group, is the genetics of what's the most common cardiac arrhythmia, which is atrial fibrillation. So, really over the past decade or so, these large genome-wide association studies have been performed in order to identify regions that are associated with disease, and then we've followed up on that to try to determine some of the mechanism that underlies those loci. So this is an example of that type of study. So, I think for the vast majority of these regions, and this is not exclusive to our disease at all, the loci that are associated reside in what we used to refer to as "junk DNA" or intergenic DNA, that we now know is regulatory DNA. But the important point is, for the majority of these loci, we have no idea of the mechanism through which they confer risk. So the point of this study was to examine a single locus for atrial fibrillation, which we'll call AF for the rest of this, and try to determine the mechanisms through which it might confer that risk.
So, the study started back in an era where we were using, you know, genotyping chips and large cohorts of cases and controls to identify variation, then imputing variants to see what's associated. But we wanted to go into this study with a comprehensive understanding of what's at that locus. So to do that, we performed sequencing in a pretty modest cohort of 500 cases and 500 referents from the Framingham Heart Study. And although it didn't really change what we knew about the landscape of that region, we were able to go in with a confident understanding of what variants might be associated with disease risk at that locus.
So then Elena really spearheaded a lot of the work to identify which of those variants might be important at the locus, so I'll let her take over from here.
Elena Dolmatova: So, as Nathan mentioned earlier, many of those intergenic regions contain enhancers or regulatory elements, and a lot of data was coming up about the genetic loci in the genome. And we wanted to narrow down that region, down to some of the pieces that could be active, or could be functional. So we used the activity markers that [inaudible 00:05:20] modifications and DNase hypersensitivity, to identify those potentially active elements. And then we tested them on zebrafish [inaudible 00:05:31] to see if they're actually active in the heart. When we realized that they are active in the heart, we were able to then do a little bit more targeted [inaudible 00:05:42] after that, identify the ones that are actually differential between the risk and non-risk allele. So some of the SNPs can actually be changing the enhancer function. So this is how we identified the SNP that was actually functional.
Then next what we wanted to do is to link this enhancer to the gene. And initially we performed a Hi-C analysis, which is a chromatin conformation capture method, which actually captures the 3D structure of the DNA and shows what regions are interacting with what regions. And we were able to see that this SNP was within the same block as the PRRX1 promoter. To narrow down and to identify the interaction a little bit better, we performed 3C analysis. That allowed us actually to link the enhancer directly to the PRRX1 promoter. So, we have the SNP that would change the activity of the enhancer, we have the enhancer linked to the promoter, and we wanted to see if the change in the SNP would have any functional consequences on gene expression. And we performed an eQTL study.
So what it was, is we looked at the genotype of the SNP and related it to the expression of the genes within that region. And among all the genes that we actually tested, only PRRX1 expression was affected, with the risk allele conferring decreased expression of the gene. However, the consequences of decreased PRRX1 expression were yet to be revealed, and that was part of the critical experiment that Nate focused a lot of his efforts on.
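The idea of relating SNP genotype to gene expression can be illustrated with a minimal sketch: regress expression on the number of risk alleles each sample carries and look at the sign of the slope. The function name, the toy dosages, and the expression values below are all hypothetical and are not data or code from the study; a real expression QTL analysis would also adjust for covariates and test for significance.

```python
from statistics import mean


def eqtl_slope(dosages, expression):
    """Least-squares slope and intercept of expression regressed on allele dosage."""
    if len(dosages) != len(expression):
        raise ValueError("dosages and expression must have the same length")
    d_bar, e_bar = mean(dosages), mean(expression)
    # Sum of squared deviations of the genotype dosage
    sxx = sum((d - d_bar) ** 2 for d in dosages)
    if sxx == 0:
        raise ValueError("all samples share one genotype; slope is undefined")
    # Cross-product of genotype and expression deviations
    sxy = sum((d - d_bar) * (e - e_bar) for d, e in zip(dosages, expression))
    slope = sxy / sxx
    return slope, e_bar - slope * d_bar


# Hypothetical toy data: risk-allele count (0, 1, or 2) per sample and
# a normalized expression value for one gene in the same samples.
dosages = [0, 0, 1, 1, 2, 2]
expression = [10.0, 10.4, 9.1, 8.9, 8.0, 7.8]

slope, intercept = eqtl_slope(dosages, expression)
print(f"change in expression per risk allele: {slope:.2f}")
```

A negative slope in this toy setup corresponds to the pattern described above, where carrying the risk allele goes with lower expression of the gene.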
Nathan Tucker: So, we found the gene that was important, we knew the directionality, but a lot of times, with these types of functional genomics studies, which I hope we can elaborate on a little bit more later, the results you're given, like what gene you identify and the direction, aren't as clear as you would sometimes think for a given disease or trait. So, for example, a lot of the coding variation for AF is identified in ion channel genes. It's thought to be an electrophysiological disease. But here we identified a transcription factor, which is what we actually thought to be a developmental transcription factor. So, we kind of went in from a functional angle and said, "Alright, what are the consequences of this alteration?"
So we used two different models. The first was zebrafish, which I had a reasonably strong background in. And we knocked the gene down, examined the development of the heart, everything seemed reasonably normal, and then we actually examined the electrophysiology of that heart by optical mapping, and we looked at the action potential duration, which is basically the cellular phenotype that governs depolarization, repolarization and thus contraction of any given myocyte. And we found that the action potential duration in the zebrafish was shorter. We wanted to follow that up and confirm it in a different model, so we actually created a CRISPR/Cas9-mediated knockout of the gene in embryonic stem cells, differentiated those into cardiomyocytes, and then saw that similar decrease in action potential duration.
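As a rough illustration of what measuring action potential duration involves, the sketch below estimates duration from a sampled trace as the time from the upstroke to a chosen level of repolarization. This is a simplified approach under stated assumptions (upstroke taken at 50% of amplitude, 80% repolarization, evenly spaced samples), not the analysis used in the study, and the traces are made-up numbers.

```python
def action_potential_duration(trace, dt_ms, repolarization=0.8):
    """Time (ms) from the upstroke to the given fraction of repolarization.

    trace: evenly sampled signal values for a single action potential.
    dt_ms: spacing between samples in milliseconds.
    """
    baseline, peak = min(trace), max(trace)
    amplitude = peak - baseline
    if amplitude == 0:
        raise ValueError("flat trace: no action potential to measure")
    peak_idx = trace.index(peak)
    # Upstroke: first sample at or above 50% of the amplitude (an assumption;
    # other definitions, such as maximum upstroke velocity, are also used).
    half_level = baseline + 0.5 * amplitude
    start_idx = next(i for i in range(peak_idx + 1) if trace[i] >= half_level)
    # Repolarization: first sample after the peak at or below the cutoff level.
    cutoff = peak - repolarization * amplitude
    end_idx = next(
        (i for i in range(peak_idx, len(trace)) if trace[i] <= cutoff), None
    )
    if end_idx is None:
        raise ValueError("trace never reaches the requested repolarization")
    return (end_idx - start_idx) * dt_ms


# Hypothetical normalized traces sampled every 10 ms.
control = [0.0, 0.0, 1.0, 1.0, 0.9, 0.8, 0.6, 0.4, 0.25, 0.15, 0.0, 0.0]
knockdown = [0.0, 0.0, 1.0, 0.9, 0.7, 0.4, 0.15, 0.05, 0.0, 0.0, 0.0, 0.0]

apd_control = action_potential_duration(control, dt_ms=10.0)
apd_knockdown = action_potential_duration(knockdown, dt_ms=10.0)
print(apd_control, apd_knockdown)
```

In this toy comparison the second trace repolarizes sooner, which is the kind of shortening described above.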
So, kind of altogether, I mean, it's a paper that spans a lot of different techniques, but what we did was, we took an associated locus for a human disease, we found a variant at that locus that seems to drive differential expression of a nearby gene, and then modeled that gene effect in order to give a physiological phenotype that matches with the disease of interest.
Jane Ferguson: Something that struck me, I think you sort of touched on this a little bit earlier, is, you know, the SNP that you end up showing to be causal, rs577676, it's not necessarily the one that you would have picked sort of a priori, by going through the GWAS strength of association, and you know, I know we sort of all know that we shouldn't place too much weight on the specific P value of an association when we're doing GWAS, but I think a lot of the time, that sort of ends up being a screening mechanism, and people look at sort of the strongest SNP and think that's probably going to be the most biologically relevant. But do you think that we're sort of, you know, by relying on this relative strength of association in the GWAS to pick targets, we're really missing a lot of the potential biology that's underlying these diseases?
Nathan Tucker: The way you look at a normal GWAS locus is, we've always traditionally marked them with what we call a sentinel SNP, which is the SNP that's most associated, and then oftentimes act as though that one might be mediating the function. Whereas in reality, you'll see a block of roughly equivalently associated SNPs that lie within the same linkage disequilibrium block. And, at least for our cases, when we move forward we really wanted to treat all of those SNPs as equivalent. And in this one, the SNP that turned out to be functionally active was actually a little below what we would call that sentinel SNP.
So I think there are a couple of different explanations for that. One is, there could be more than one functional variant at a locus, and the LD structure kind of heightens that. The other could be that the sample you're using in order to identify the SNPs of interest or the SNPs that are functionally associated may be biasing you a little bit, particularly with a smaller cohort like this. But I will say, for our SNP, when you look at it in the larger GWAS studies, it's again roughly equivalently associated; it's what we'd call a top SNP. So, to answer your question briefly, we always look at all of them. We have to be inclusive when we're trying to find functional variants.
Jane Ferguson: Yeah, no, absolutely agree. And that's one thing I absolutely loved about your paper, was how you, you know, pulled together all these different data types and used as many different resources as you had access to to really tackle this question. So I wonder, out of all of the different things you did, what was the most challenging aspect of this study?
Elena Dolmatova: Well, that was something that nobody's really done before. There were few studies out by the time we started that would tie some of the GWAS hits with the mechanism of the disease development in [inaudible 00:12:16] in other conditions, but there was really no paved road to take to get an answer to our question.
Nathan Tucker: For me, personally, I mean, I really started this project, which, you know, took a considerable amount of time, as a cell biologist, modeling gene function in zebrafish, and by the end, we ended up using so many different techniques, and integrating so many new types of data into this study, that I don't even know what I would define myself as anymore. So I think it's challenging both to learn how to use all of these new data and to generate these new data. But that's kind of, I don't know, that's why we got into this business. That's why people want to do research. So it's challenging, but it's rewarding too.
Jane Ferguson: Absolutely. And so, to look at the converse aspect, then, was there anything that was easier than you expected? You know, did you have a eureka moment where you sort of said, "Yes, now everything is falling into place."?
Nathan Tucker: So, I think, yeah. I've been part of studies where I've really felt that that's happened. And given all of the kind of independent moving parts that were in this study, it's really hard to think of one thing that clicked. You know, every sub-component had its own individual moment where it may have clicked, but really, until all the pieces of data started to come together, you never really felt that eureka moment. And, you know, I think that's part of what science is in normal ... I mean, this paper was a lot of sweat. And not only mine and Elena's, you have all of our collaborators as well. But I will say, you know, at least using the genetics as a basis, and the GWAS data as a basis, we knew that something was there, going in. We knew that we weren't on some wild goose chase, but really were filling in a gap knowing that we had a strong basis to build on.
Jane Ferguson: Yeah. It's good to hear from you, sort of that, you know, you had to do all of those experiments, they were all necessary, because I think, a lot of the time, when people are trying to follow up GWAS findings, they're really, I don't know, they sort of have a preconceived idea maybe of what path they want to go down, and I think that's not the answer. I think we have a lot of GWAS hits now, and I think the sort of approach that you did to do all of these different experiments and to just do the hard work that's required to figure this out, I think is really necessary and very laudable.
Nathan Tucker: Thank you.
Jane Ferguson: So, was there anything that surprised you along the way?
Elena Dolmatova: Well, Nathan touched a little bit on that. It was nice to see the electrophysiological phenotype, that was quite amazing. And the fact that the directionality of the effect fit with what we expected it to be, with the risk allele, and how we were able to demonstrate it both in zebrafish and human cells, and they were, again, matching. Seeing how those results could tie to the genetic data and what we know about atrial fibrillation susceptibility was great and rewarding. I wouldn't call it surprising. More like rewarding. Honestly, we were concerned that we wouldn't be able to observe any physiological phenotype. Because, I mean, we didn't even have a good reason why PRRX1 would be involved in atrial fibrillation, that was a transcription factor, not an ion channel, like everybody thinks about, everything is an ion channel, by the way, not the same. So it was great that we were actually able to tie the transcription factor to the disease when we were not even quite sure that it would happen.
Jane Ferguson: Yeah. Yeah, and I suppose, you mention the ion channels, and of course there have been several other loci that have been identified for AF, and from your work, how important do you think PRRX1 is, compared to these other loci, and, you know, do you think that this sort of study has to be done for every single one of these loci to really understand what's causing the disease in different people?
Nathan Tucker: First of all, I think the answer to that question depends a little bit on what the person asking it would deem to be important. So, if we're looking at GWAS signals for effect size, generally the effects of each given locus are pretty modest, and the PRRX1 locus isn't even at the top for AF. So if you were looking for, like, clinical risk stratification, then it's not going to be the most important locus for AF. But I think what looking at these types of stories does is identify novel mechanisms for disease pathogenesis. I think they're often unexpected, it steps outside of the pathway analysis and candidate gene approaches that have been used in the past. And it's a really unbiased way to look at, you know, why the disease risk has changed. I think if our ultimate goal is to develop new therapeutics, you know, we don't know which one of these loci might give us that hook into developing the new therapeutic.
So the second part of the question, I guess you'd say, does it need to be done for each locus? So, yes, I guess, given what I just said. I think we've invested a lot of time and effort and resources into identifying all these loci, through really, really large discovery efforts, but if we want to really maximize what we've done there, with that discovery effort, then I think we owe it to ourselves as a field to identify mechanism, and see which of these is going to give us that hook to make that next big clinical therapeutic discovery.
But, that being said, you know, this study, as much as we love it, it was really laborious. And it was a lot of moving parts. And it was a lot of work from a lot of people for a lot of time. And if we're going to have to do this for every locus, not only for AF but for all of the other GWAS that have been performed ... it's just an unacceptably slow rate of discovery, so ... What we've been doing since this one has been completed is, you know, trying to find some higher throughput ways to screen through what might be functional variants, integrating or generating new transcriptional data sets so we can better predict what might be the gene at a given locus, and working on our models as well for when we want to look at physiology. So we hope that we can talk more about these briefs soon, a lot of them are in the works, so we'll update soon.
Jane Ferguson: Oh, that's exciting, yeah. And I think you've laid out a really nice blueprint for how you can do these kinds of experiments, and how to follow up a locus, and, you know, I'm sure you learned a ton along the way, and you both mentioned some of these, you probably can't talk about everything you're working on, but I suppose with the benefit of hindsight now, is there anything specific about sort of the study design or the methods along the way that you would change for future studies?
Elena Dolmatova: One of the things is that when we started, we had one toolkit, and when we finished, we had a completely different toolkit. And it's all because the science is developing every day, so every moment, something new comes up, it's ... In the beginning, there wasn't enough epigenetic data for us to guess about the enhancers. And it was coming in almost on a weekly basis, and we were trying, pretty successfully, to implement all the knowledge that was acquired and published, almost immediately. We also had to implement CRISPR/Cas to knock out PRRX1 in the embryonic stem cells and then differentiate cardiomyocytes from them. So all of that knowledge was not there when we started the study. So we actually implemented them almost immediately. But in hindsight, if we had all these tricks up our sleeves back then, of course it would have been much more efficient and we would have finished much faster.
Nathan Tucker: I'll follow up on that, too. One thing we learned, which Elena mentioned, with all of the epigenomic data sets that were updating, and all the techniques that were updating, I mean, I really think one thing that we learned was, our prediction is really only as good as the data that we put into it. And I think our plan to learn, particularly for all the other loci, is we really need to understand the epigenomic landscape in relevant to [inaudible 00:21:23] and cells, so, you know, moving towards that first, before screening on what variant or what transcript might be important, is a really important step for us, and one that we've used as we've moved forward.
Elena Dolmatova: Mm-hmm (affirmative).
Jane Ferguson: So, what do you think would be the ideal follow-on study to this paper?
Elena Dolmatova: Well, we know that diminished PRRX1 expression shortens the action potential, but we have little idea about how it is happening. Is it acting through changing the cardiomyocyte state? Is it altering maturation or development of cardiomyocytes? Is it governing ion channel expression? Or maybe changing something with intracellular calcium regulation? Transcription factors can have many targets, and we're not quite sure what the targets are in this particular case, so that would be a nice study, I think, to follow up on this one.
Jane Ferguson: So I suppose just to wrap this up, is there any message that you're hoping that readers will be able to take away from your paper?
Nathan Tucker: Sure, I think, if we're going to look from a disease standpoint, the finding regarding the relationship between the gene and atrial fibrillation is important, but I hope we've also illustrated somewhat through this study how complex the genetics of the disease are. I mean, it's ... so much of the focus in the past has been, really, on ion channel regulation, but there's so much more to this condition that is yet to be discovered. So I hope we shed a little bit of light on a path forward for how to uncover some of these other mechanisms over the next few years.
And then I think, hopefully the other thing, well, at least, that we hope gets relayed through this and other similar studies from other groups, is the importance of filling in this knowledge gap between the population genetics stories, the GWAS studies, and that basic biology. And I think there's a lot of potential for making important discoveries, for human health and clinical intervention, in that space. So hopefully, we and other groups can use some of the things that we did in this paper, and hopefully improve on them, to address this in other GWAS loci, to keep the field moving forward.
Jane Ferguson: Yeah, I couldn't agree more. I think that's a really important message, and I think you've done a fantastic job on sort of starting us down that path, to really translating these GWAS findings into more meaningful biology. So, Elena, Nathan, thank you so much for taking the time to talk to me.
Nathan Tucker: Thanks a lot for having us.
Elena Dolmatova: Thank you.
Jane Ferguson: And that's all for this month. Thank you for listening, and we look forward to getting up close and personal with -Omics of the Heart, and with you, next month.