CRISPR Cas9 Targeting
to fix
- Fill in first section
- Where is your references section?
- Possibly combine proposed model overview with previous sections
- HepB Paper, Dong et al., Seeger & Sohn
- Paper demonstrating that messing w/ P6 can slow viral spread (Love et al.)
Quoting from Addgene's CRISPR: A Practical Guide:
The Cas9 nuclease has two functional domains: RuvC and HNH, each cutting a different DNA strand. When both of these domains are active, the Cas9 causes double strand breaks (DSBs) in the genomic DNA. In the absence of a suitable repair template, the DSB is repaired by the Non-Homologous End Joining (NHEJ) DNA repair pathway. During NHEJ repair, InDels (insertions/deletions) may occur as a small number of nucleotides are either inserted or deleted at random at the DSB site. InDels alter the Open Reading Frame (ORF) of the target gene, which may significantly change the amino acid sequence downstream of the DSB. Additionally, InDels could also introduce a premature stop codon either by creating one at the DSB or by shifting the reading frame to create one downstream of the DSB. Any of these outcomes of the NHEJ repair pathway can be leveraged by scientists to disrupt their target gene. It is important to note that the InDels induced by NHEJ will be random, so the type and extent of gene disruption will need to be determined experimentally.
In order to maximize the effect of gene disruption, target sequences should be chosen near the N-terminus of the coding region of the gene of interest (Learn more about Designing gRNAs). Typically, the target sequence is selected to introduce a DSB within the first or second exon of the gene. It is important not to design targets to introns (non-coding regions), as repair of the DSB in that region will not disrupt the target gene. The changes introduced by this use of the CRISPR system are permanent to the genomic DNA of the organism.
Non-homologous end joining (NHEJ) is the process of repairing double-strand breaks in DNA without a homologous template. This mechanism exists in both prokaryotes and eukaryotes; however, the proteins involved differ. In multicellular eukaryotes (not yeast), the proteins involved are Ku70/80, Pol μ/λ, Artemis:DNA-PKcs, PNK and XRCC4 (Lieber, 2010).
Li et al. propose a second-order ODE model. As such, its construction assumes large numbers of molecules and fast diffusion. Still, its rate parameters can be used to estimate the probability of repair given a cut.
- The model involves:
- The binding of Ku70/80 and Artemis to each of the broken ends.
- The formation of a synapse.
- Repair via binding of XRCC4 and DNA ligase.
- The authors used data from previous papers to determine the parameters of the system. The work they reference (Reynolds et al.) discusses the apparent half-life of repair, approximately 8 min, with ligation as the limiting reaction. Ku70/80 binding, however, is much faster, on the order of a minute.
- It is important to note that in these papers, all cuts are performed using radiation and therefore do not reflect the nature of a CRISPR/Cas9 cut.
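The half-lives quoted above can be turned into rate constants for use as per-event probabilities. The sketch below assumes simple first-order (exponential) kinetics, which is a simplification of the second-order model. The function names are illustrative, and the 1 min Ku70/80 half-life is an assumed reading of "on the order of a minute". Since the underlying data come from radiation-induced breaks, these numbers are placeholders rather than measured CRISPR/Cas9 values.

```python
import math


def rate_from_half_life(t_half_min):
    """First-order rate constant (per minute) for a half-life given in minutes."""
    return math.log(2) / t_half_min


def prob_repaired_by(t_min, t_half_min):
    """Probability that a single break is repaired within t_min minutes,
    assuming first-order kinetics with the given half-life."""
    return 1.0 - math.exp(-rate_from_half_life(t_half_min) * t_min)


# ~8 min apparent repair half-life, taken from the text above.
K_LIGATION = rate_from_half_life(8.0)
# ASSUMPTION: "on the order of a minute" is read here as a 1 min half-life.
K_KU_BINDING = rate_from_half_life(1.0)

if __name__ == "__main__":
    print("ligation rate (1/min):", K_LIGATION)
    print("P(repaired within 20 min):", prob_repaired_by(20.0, 8.0))
```

By construction, `prob_repaired_by(8.0, 8.0)` is one half, which is a quick sanity check on the conversion.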
In general, papers distinguish between two types of cuts: simple and complex. Simple cuts are repaired fairly quickly via NHEJ, whereas complex cuts are repaired at a much slower rate. It has been suggested that complex cuts involve overhangs; since CRISPR/Cas9 makes clean cuts, they are more likely to be repaired via the fast-acting mechanism.
In type II CRISPR systems, the type of system we are using in Arabidopsis, the endonuclease Cas9 mediates the cleavage of double-stranded DNA. Numerous studies have shown it to be a rapid and effective way of cutting DNA. However, if our model is to be useful, a more detailed understanding of the dynamics of Cas9 is required.
An important first point is that there is evidence Cas9 does not follow typical Michaelis-Menten kinetics. Rather, Sternberg et al. found that the amount of cleaved product plateaued in proportion to the molar ratio of Cas9-RNA to DNA, and further that Cas9, following a cut, remains bound to both strands of the cut DNA site. We must consider this in particular, as it means that in some sense the Cas9 which cuts DNA is “used up” and unavailable, at least for a time, to make further cuts in the nucleus. As a result, certain well-understood models cannot be applied here. Unfortunately, an extensive literature search did not turn up any viable models for this system, or any parameters for the typical timescales of cutting and binding events. This lack of quantitative characterization remains a significant open avenue of investigation.
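To illustrate what "used up" means quantitatively, here is a minimal sketch of single-turnover cleavage. It is not a model from the literature: it assumes each Cas9-RNA complex binds and cuts one target irreversibly with a hypothetical second-order rate constant `k_on`, and it integrates the resulting rate equation with a plain Euler step. The function name, units and parameter values are all illustrative.

```python
def single_turnover_cleavage(cas9_0, dna_0, k_on, t_end, dt=0.01):
    """Euler integration of Cas9 + DNA -> (Cas9 bound to cut DNA).

    ASSUMPTION: binding/cutting is irreversible and Cas9 is never released,
    so each Cas9 complex can cut at most one target.
    Returns a list of (time, amount_cut) pairs.
    """
    cas9, dna, cut = float(cas9_0), float(dna_0), 0.0
    trajectory = [(0.0, cut)]
    steps = int(round(t_end / dt))
    for i in range(1, steps + 1):
        # Amount converted during this step, clamped so that neither
        # free Cas9 nor uncut DNA can become negative.
        flux = min(k_on * cas9 * dna * dt, cas9, dna)
        cas9 -= flux
        dna -= flux
        cut += flux
        trajectory.append((i * dt, cut))
    return trajectory


if __name__ == "__main__":
    # Hypothetical values: half as much Cas9 as DNA (arbitrary units).
    t_final, cut_final = single_turnover_cleavage(0.5, 1.0, 5.0, 50.0)[-1]
    print("cut at t = %.1f: %.4f" % (t_final, cut_final))
```

Under these assumptions the cleaved amount levels off at the smaller of the two starting amounts, rather than continuing to rise as it would for an enzyme that turns over.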
Still, its activities are well understood. The Cas9 protein requires two guide RNA pieces, the CRISPR RNA (crRNA) and a trans-activating crRNA (tracrRNA). Once all the components are assembled, Cas9 diffuses in three dimensions until it finds a PAM site. In the case we are considering, these sites are simple three-base-pair motifs of the form NGG. If the complex binds to DNA without a PAM site, it rapidly dissociates and continues searching. Following recognition of a PAM site, the Cas9-RNA seems to rely on thermal energy and enzymatic activation by the PAM site to separate the DNA strands before moving along in search of a sequence matching the guide RNA, typically adjacent to the PAM site. Unfortunately, because many similar sequences are repeated within a genome, and because the guide RNA can bind to sequences that are not an exact match, we expect some off-target effects: Cas9 will likely cut at some unintended sites. While we chose sgRNA targets deliberately to minimize this risk, in particular by minimizing their similarity to the Arabidopsis genome, these effects will still be present and will reduce the efficiency of the Cas9 system in targeting the viral DNA.
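The PAM search described above can be sketched as a scan for NGG motifs on both strands. This is an illustration only: the 20-nucleotide protospacer length is an assumed default, the function names are hypothetical, and the scan ignores mismatches, so it does not assess off-target risk.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_protospacers(seq, guide_len=20):
    """List candidate Cas9 sites as (strand, start, protospacer, pam).

    A site is an NGG PAM with at least guide_len bases on its 5' side.
    For the "-" strand, start is an index into the reverse complement.
    """
    hits = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for i in range(guide_len, len(s) - 2):
            pam = s[i:i + 3]
            if pam[1:] == "GG":  # first PAM base is N, i.e. anything
                hits.append((strand, i - guide_len, s[i - guide_len:i], pam))
    return hits


if __name__ == "__main__":
    # Toy sequence, not taken from the CaMV genome.
    toy = "ACGT" * 5 + "AGG" + "TTTTT"
    for hit in find_protospacers(toy):
        print(hit)
```

Candidate protospacers found this way would still need to be compared against the host genome before being used as sgRNA targets.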
After a cut has occurred, the plant will try to repair the viral DNA through non-homologous end joining. This process can result in a number of different outcomes. Identifying these is a crucial first step in forming the model since they define the level of damage inflicted on the virus.
- The first outcome is an "indel" resulting in a frameshift. CRISPR cuts have been shown to produce small insertions or deletions during the repair process (Seeger and Sohn), and the resulting frameshifts have been observed to damage CaMV (Love et al.). These indels vary in size from 1 to over 30 nucleotides and are likely due to errors in the NHEJ process.
- The second outcome is no frameshift. This happens either when the cut is perfectly repaired or when there is an indel of a multiple of three nucleotides. This will not alter the reading frame and, unless there is a sufficiently large deletion, will presumably not harm the virus.
- A third outcome is a large deletion. Since we are targeting multiple sites on the viral genome, it is possible that two sufficiently close sites will be cut simultaneously and their ends joined together. This would remove the entire region between the cuts which should prevent it from producing working proteins.
- Finally, after a cut has been repaired, if the resulting sequence is sufficiently similar to the original sequence (small or no indel), the Cas9 may be able to target it again. This may either induce a frameshift where there wasn't one or potentially repair a frameshift by inserting or deleting a complementary number of nucleotides.
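The outcomes listed above can be told apart from the net change in length at a site. The sketch below is one possible bookkeeping scheme, not a validated classifier: indels are signed (negative for deletions), the 100 bp cutoff for a "large deletion" is a hypothetical threshold, and whether an in-frame change is actually harmless has to be decided elsewhere.

```python
def classify_net_indel(net_indel, large_deletion_bp=100):
    """Classify the net length change at a site after repair.

    net_indel: inserted minus deleted nucleotides (negative = deletion).
    large_deletion_bp: ASSUMED cutoff above which a deletion is treated
    as removing a whole region rather than shifting the frame.
    """
    if net_indel <= -large_deletion_bp:
        return "large_deletion"
    if net_indel % 3 == 0:
        # Includes perfect repair (0) and indels that are multiples of 3.
        return "in_frame"
    return "frameshift"


def classify_after_cuts(indels, large_deletion_bp=100):
    """Classify a site that has been cut and repaired several times."""
    return classify_net_indel(sum(indels), large_deletion_bp)


if __name__ == "__main__":
    print(classify_net_indel(-1))       # a single-base deletion
    print(classify_after_cuts([1, 2]))  # a second cut restoring the frame
```

The second call shows the retargeting case: a +1 insertion followed by a +2 insertion adds up to three nucleotides, so the reading frame is restored.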
It should be noted that the Cas9 protein may not unbind quickly after making a cut. This could slow or alter the NHEJ process or perhaps prevent other Cas9s from binding to the target. This unbinding process will be an important area of research for our model.
Given the above description of the models involved, it is clear that a fairly complicated simulator needs to be constructed in order to gather results. The simulation is multi-stage.
- First, CRISPR/Cas9 initial conditions are set in the various compartments of the plant cell (nucleus, cytoplasm) and are assumed constant. These can be determined using the CRISPR/Cas9 model suggested. Once they are determined, the initial viral genome (default) enters the nucleus and a stochastic model solver is used to simulate various scenarios of interaction between the viral genome and CRISPR/Cas9.
- The probability of CRISPR/Cas9 performing a cut is dependent on the number of target sites on the genome as well as the steady-state concentration. As soon as a cut is performed, the stochastic solver can predict (simulate) when Ku70/80 (part of the NHEJ process) will bind and begin the repair process.
Note that before this repair process occurs, a CRISPR/Cas9 complex can still come in and make another cut, which may in turn cause a large deletion! The simulator must account for this, together with NHEJ repairing the deletion by simply joining the two larger fragments.
- At the timestep that a repair is predicted to occur, the DNA is repaired with some probabilistically determined indel, as described by the papers above. Once this indel occurs, a snapshot of the system can be taken to describe the state of the viral genome.
Note that this new genome will have slightly different binding probabilities for Cas9. The simulation should continue for as long as possible; the snapshots can be used to determine the possible genomes after targeting the virus over and over again, and can be passed on to the Protein group in order to assess viability. Once viability is determined, the expected length of time before Cas9 kills a virus can be evaluated.
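The cut-and-repair loop described above can be sketched as a Gillespie-style simulation of a single target site. This is a deliberately reduced illustration under stated assumptions: only one site is tracked (so large deletions between two sites are not modelled), all rates and the indel distribution are hypothetical inputs, and a site is assumed to stop being targeted once it carries any nonzero indel.

```python
import random


def weighted_choice(values, weights, rng):
    """Pick one of values with probability proportional to weights."""
    r = rng.random() * sum(weights)
    acc = 0.0
    for value, weight in zip(values, weights):
        acc += weight
        if r < acc:
            return value
    return values[-1]


def simulate_site(k_cut, k_repair, indel_sizes, indel_weights, t_end, rng):
    """Simulate one target site; return snapshots (time, state, net_indel).

    States: "intact" (targetable), "cut" (awaiting NHEJ), "mutated".
    ASSUMPTION: waiting times are exponential with rates k_cut / k_repair,
    and a site with a nonzero indel is no longer recognised by Cas9.
    """
    t, state, net_indel = 0.0, "intact", 0
    snapshots = [(t, state, net_indel)]
    while state != "mutated":
        rate = k_cut if state == "intact" else k_repair
        t += rng.expovariate(rate)  # time until the next event
        if t > t_end:
            break
        if state == "intact":
            state = "cut"
        else:
            indel = weighted_choice(indel_sizes, indel_weights, rng)
            net_indel += indel
            state = "intact" if indel == 0 else "mutated"
        snapshots.append((t, state, net_indel))
    return snapshots


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    # Hypothetical rates (per minute) and indel distribution.
    for snap in simulate_site(0.05, 0.087, [0, -1, 1, -3], [5, 2, 1, 1],
                              600.0, rng):
        print(snap)
```

Each snapshot corresponds to one state change, so the returned list is the kind of record that could be handed on for viability assessment. Extending the sketch to several sites would be needed to capture simultaneous cuts.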
The team has decided upon the use of the following tools to perform the necessary simulation:
- Python 2.7.x
- NumPy - for Python 2.7.x
- MATLAB - for ODE modelling, version shouldn't matter
- matplotlib - for rendering via Python
- OUTPUT = valid genomic transcripts created over time (include valid in terms of which genes)
- Start with a viral genome with all proteins functional. By putting the viral through
- Hand off abundance of functional mRNA transcripts. Could either hand off average effects of CRISPR (time it takes for genome to lose functionality) or can actually simulate both at the same time with fun stochasticity
Some assumptions/notes:
- Plant cell in steady-state w.r.t. Cas9 and sgRNA and Ku80 pathway members, so assume all relevant concentrations (except of genomic transcripts) are fixed
- Viral genomes are not in steady-state, may need to look at them discretely
- Given steady state, there is some probability that Cas9-sgRNA complex will cut the target. Need to consider whether changes to the target (i.e. natural mutations over time, mutations caused by previous cuts/joins) affect this probability
- Should check off-targeting on the viral genome. Off-targeting on plant genome is insignificant enough that we don't need to consider losing Cas9 to binding off-target DNA in the plant. However, may need to consider off-targeting in less deadly areas of viral genome.
- Following a cut: how long to repair? Does deletion occur? Repair = indel histogram, on a particular timescale
- Continuous process -> Poisson distribution
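As a small worked example of the last note: if cuts occur at a constant rate, the number of cuts in a fixed time window follows a Poisson distribution. The sketch below assumes such a constant rate (consistent with the steady-state assumption above); the rate and window used in the example are hypothetical.

```python
import math


def poisson_pmf(k, rate, duration):
    """P(exactly k events in a window), for events at a constant rate.

    rate and duration must use matching time units (e.g. 1/min and min).
    """
    mean = rate * duration
    return math.exp(-mean) * mean ** k / math.factorial(k)


if __name__ == "__main__":
    # Hypothetical: 0.05 cuts per minute, observed for 60 minutes.
    for k in range(5):
        print(k, poisson_pmf(k, 0.05, 60.0))
```

The `k = 0` term is the probability that the target survives the whole window uncut, which is the quantity most relevant to viral escape.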