CRISPR gRNA Target Design
Much of the excitement around CRISPR systems has centred on their potential for genome editing. However, the use of CRISPR as an immune system in eukaryotes is also being explored. A recent study by Seeger & Sohn (2014) targeted Hepatitis B virus (HBV) in a human cell line. They found that most gRNA targets were able to reduce the number of HBV-positive cells five-fold or more, but the mechanism of immunity is not entirely clear.
For most gRNA designs, the Cas9 breaks induced short insertions and deletions (indels), likely caused by error-prone Non-Homologous End Joining (NHEJ) of the cleaved ends, and the number of indels correlated with the reduction in virulence. One design, however, left 55% of cells wild-type while achieving similar reductions. The dynamics of repair and deletion after double-stranded cleavage by Cas9 are a major focus of the mathematical modelling this year.
However, we didn't have time to finish those models before ordering our gRNA constructs, so we decided to design targets for several possible situations.
We chose to target these genes because they have well-established functions related to the dynamics of CaMV inside the plant cell. P2 is not targeted because it is related to interactions with the aphid vector and P4 has a number of isomorphisms that make it somewhat impractical for targeting. You can see more about the functions of each of these genes on the Cauliflower Mosaic Virus wiki page.
P6 trans-activates the 35S mRNA and is responsible for many of the viral defences against the plant cell, making it the obvious choice if only a single gene is targeted. By placing a number of targets on P6, we hope to disrupt its activity more immediately and to induce large deletions when multiple cuts occur at once.
This design lets us examine whether large deletions can be created by flanking a region with Cas9 sites and whether double-stranded breaks alone are destructive. If CRISPR immunity is mainly conferred by frameshift mutations or small deletions caused by NHEJ, we wouldn't expect this design to offer much reduction in virulence. However, if CRISPR immunity is mainly conferred by degradation of cleaved DNA or large deletions between neighbouring cut sites, we would still expect a significant reduction in virulence. If we see a significant reduction in virulence with this design, we'll have to examine the DNA structure to see if there are large deletions.
Perhaps NHEJ is insufficiently error-prone to eliminate the viruses at the pace we need, or perhaps non-destructive mutations introduced by NHEJ will change the DNA sequence enough to prevent Cas9 from recognizing the site for a second attempt. In these cases, suppression of the 35S or 19S promoter using dCas9 might do more to prevent virulence than cleavage using Cas9.
To select the gRNA targets, we ran the CaMV genomic sequence through the Benchling CRISPR design tool, identifying 682 possible targets in the CaMV genome with NGG PAM sites. To narrow down the list of targets, we considered four factors: genome position, conservation, off-targeting and efficiency.
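To illustrate what enumerating NGG PAM targets involves, here is a minimal Python sketch. It is not Benchling's implementation; the function names and the tuple layout are our own assumptions for illustration. It scans both strands for a 20 bp protospacer immediately followed by an NGG PAM. Note that it treats the sequence as linear, so on a circular genome such as CaMV any site spanning the origin would be missed.

```python
import re


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def find_ngg_targets(seq, guide_len=20):
    """List (strand, start, protospacer, pam) for every NGG PAM site.

    `start` is the 0-based forward-strand index of the leftmost base
    covered by the protospacer. Hypothetical helper, for illustration only.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    targets = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            start = match.start()
            if strand == "-":
                # Convert the reverse-strand coordinate to the forward strand.
                start = len(seq) - (match.start() + guide_len)
            targets.append((strand, start, match.group(1), match.group(2)))
    return targets
```

Calling `find_ngg_targets` on the genome string returns one tuple per candidate site; protospacers containing masked bases (`N`) are skipped because the pattern only accepts A, C, G and T.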
The goal with many of the designs is to target a specific gene or genes and induce frameshift mutations. Previous research (Doench et al. 2014) suggests that any frameshift within the first 50% of a gene should be deleterious enough to prevent it from producing a functional protein. For Design I we used a 40% cutoff after looking at the functional regions of the CaMV transcripts. The targets for Designs II-IV were subset by location (correct protein for II and IV, non-coding for III) but didn't all have to be near the beginning of the gene.
We want to target areas of the genome that are functionally important, and we assumed this would correspond well to areas that are conserved (i.e. unchanging) between CaMV and closely-related viruses. Targeting conserved areas should also make our gRNA targets more robust, since those areas are unlikely to contain many mutations or isomorphisms among different strains of CaMV.
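The position cutoff can be sketched as a small filter. This is only an illustration under our own assumptions: the function names are hypothetical, coordinates are 0-based with an exclusive gene end, and the gene coordinates in the usage note are made up rather than taken from the CaMV annotation.

```python
def within_gene_fraction(cut_pos, gene_start, gene_end, strand="+"):
    """Fractional position of cut_pos within a gene (0 = first base).

    Coordinates are 0-based and gene_end is exclusive. Returns None when
    cut_pos lies outside the gene. Hypothetical helper for illustration.
    """
    if not (gene_start <= cut_pos < gene_end):
        return None
    length = gene_end - gene_start
    if strand == "+":
        return (cut_pos - gene_start) / length
    # On the reverse strand the gene begins at its highest coordinate.
    return (gene_end - 1 - cut_pos) / length


def passes_position_cutoff(cut_pos, gene_start, gene_end,
                           cutoff=0.4, strand="+"):
    """True if the cut falls within the first `cutoff` fraction of the gene."""
    fraction = within_gene_fraction(cut_pos, gene_start, gene_end, strand)
    return fraction is not None and fraction <= cutoff
```

For a made-up gene spanning positions 100 to 1100, a cut at position 450 sits 35% of the way in and passes the 40% cutoff used for Design I, while a cut at position 700 (60%) does not.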
Since we want to provide extra protection for Arabidopsis plants against invading viruses, a major design priority was to avoid cutting apart the Arabidopsis genome by accident. The scoring algorithm
Many regions of the Arabidopsis genome are non-coding, so we also ran a BLAST search on all the off-target matches identified by Benchling. For example, we considered an off-target match in a T-DNA insertion site or a putative protein to be less important than one related to a known mRNA or protein.
Someone needs to read the Doench paper though and write this section. Created 4 designs after learning about NHEJ:
Matters for Cas9 (mutations kill both) and dCas9 (overlaps/interference)
To include: links to data along with steps followed (reproducibility!)
- Downloaded all genome sequences in the Caulimovirus genus from NCBI (includes CaMV). These sequences are in the file caulimovirus_sequence.fasta.
- That FASTA was uploaded to Guidance v2 with default parameters to generate a multiple sequence alignment.
- Masked the multiple sequence alignment, keeping only positions with an identity score of at least 0.93 across all Caulimovirus sequences. Note that we also tested alignments across the entire Caulimoviridae family, but the sequence homology was so low that there were very few continuous conserved regions.
- Uploaded the masked sequences for CaMV to Benchling, which removed all gaps: read-only link.
- Ran the Benchling CRISPR design tool, using parameters:
  - entire masked sequence set from selection
  - A. thaliana genome
  - Wild-Type Cas9 NGG PAM
  - 20 bp guide length
  - Genome region None
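The masking step in the list above can be approximated in a few lines of Python. This is a rough sketch under stated assumptions, not the procedure we actually ran: it assumes the score for a column is the fraction of aligned sequences that share the reference base (gaps count as mismatches), and the function name and arguments are hypothetical. Columns below the threshold are replaced with `N`, and reference gaps are dropped, much as Benchling removed gaps on upload.

```python
def mask_by_conservation(reference, aligned, threshold=0.93):
    """Mask poorly conserved positions of `reference` with 'N'.

    `reference` is one row of the alignment (e.g. CaMV) and `aligned` is
    the list of all rows, each the same length. Assumption: a column's
    score is the fraction of rows matching the reference base there.
    """
    if any(len(row) != len(reference) for row in aligned):
        raise ValueError("all aligned sequences must have the same length")
    masked = []
    for i, ref_base in enumerate(reference.upper()):
        if ref_base == "-":
            continue  # gap in the reference: there is no base to keep
        column = [row[i].upper() for row in aligned]
        identity = column.count(ref_base) / len(column)
        masked.append(ref_base if identity >= threshold else "N")
    return "".join(masked)
```

With a toy alignment of three rows, `mask_by_conservation("ACG-T", ["ACG-T", "ACGAT", "ATG-T"])` masks the second position, where only two of three rows agree, and drops the gap column.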
At this point there are 275 possible targets. Looking at the function of the various genes in the CaMV genome, we decided that we were only interested in targets within gp2, gp4, gp6 or gp7. We are not interested in gp1 because no function has been identified for it, not interested in gp3 because it is related to interactions with aphid vectors, and not interested in gp5 because it has a number of isomorphisms that make it difficult to target reliably.
After removing PAMs that were partially or wholly masked in the conservation-masked sequence (i.e. that contained 'N' values) and keeping only PAMs within the genes of interest, we were left with 196 possible gRNA targets.
- Kept sequences with efficiency score > 0.6 (calculated by Benchling based on Doench et al.) and specificity score > 0.98 (calculated by Benchling based on Hsu et al.).
- Exported as primers and sanity-checked against positions in CaMV genome.
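The filtering steps described above (dropping masked PAMs, keeping the genes of interest, and applying the score thresholds) could be expressed roughly as follows. The `Candidate` record and its field names are hypothetical and do not reflect Benchling's export format; the thresholds are the ones quoted in the text.

```python
from dataclasses import dataclass

GENES_OF_INTEREST = {"gp2", "gp4", "gp6", "gp7"}


@dataclass
class Candidate:
    """One candidate gRNA target (hypothetical record layout)."""
    guide: str          # 20 bp protospacer
    pam: str            # 3 bp PAM, may contain 'N' if masked
    gene: str           # gene containing the target, "" if none
    efficiency: float   # on-target score, assumed to be on a 0-1 scale
    specificity: float  # off-target score, assumed to be on a 0-1 scale


def keep(candidate, min_efficiency=0.6, min_specificity=0.98):
    """Apply the PAM, gene and score filters to a single candidate."""
    return (
        "N" not in candidate.pam.upper()
        and candidate.gene in GENES_OF_INTEREST
        and candidate.efficiency > min_efficiency
        and candidate.specificity > min_specificity
    )


def filter_candidates(candidates, min_efficiency=0.6, min_specificity=0.98):
    """Return only the candidates that pass every filter."""
    return [c for c in candidates
            if keep(c, min_efficiency, min_specificity)]
```

Keeping the per-candidate test in its own function makes it easy to report why a given target was rejected, which helps when sanity-checking the final list against positions in the CaMV genome.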
- J.G. Doench, E. Hartenian, D.B. Graham, Z. Tothova, M. Hegde, I. Smith et al. (2014). Rational design of highly active sgRNAs for CRISPR-Cas9–mediated gene inactivation. Nature Biotechnology, 32, 1262–1267. doi:10.1038/nbt.3026
- C. Seeger and J.A. Sohn. (2014). Targeting Hepatitis B Virus With CRISPR/Cas9. Molecular Therapy - Nucleic Acids, 3, e216. doi:10.1038/mtna.2014.68
- P.D. Hsu, D.A. Scott, J.A. Weinstein, F.A. Ran, S. Konermann, V. Agarwala et al. (2013). DNA targeting specificity of RNA-guided Cas9 nucleases. Nature Biotechnology, 31, 827–832. doi:10.1038/nbt.2647