Base mismatch tolerance
Although Cas9 specificity is thought to be governed by the 20-nt guide sequence of the sgRNA and the PAM, off-target mutations remain prevalent and can occur with as many as 3-5 mismatched bases (out of 20) between the sgRNA and the target DNA sequence. sgRNA secondary structure can also affect cleavage at on-target and off-target sites. As mentioned above, the sgRNA contains a sequence (~20 nt) complementary to the target; in the target DNA this sequence is followed by a PAM, which activates the endonuclease activity. While the 10-12 nt adjacent to the PAM (called the “seed sequence”) were shown to be enough for Cas9 specificity, Wu et al. showed that for a catalytically dead Cas9 only 1-5 bp of seed sequence are required for specificity. Later studies supported this finding. Cas9 binding is further affected by a number of mechanisms:
- The seed sequence determines how often a seed-plus-PAM match occurs in the genome and thereby controls the effective concentration of the Cas9-sgRNA complex.
- Uracil-rich seeds are likely to yield low sgRNA levels and increased specificity, since multiple uracils in the sequence can terminate sgRNA transcription.
- Mismatches at the 5’ end of the crRNA are better tolerated, since the critical region lies adjacent to the PAM. Single and double mismatches are also tolerated, depending on where they fall.
- In a recent study, Ren et al. observed a link between mutagenesis efficiency and the GC content of the sgRNA. At least 4-6 bp adjacent to the PAM are required for efficient editing.
- When picking a gRNA, guanine is preferred over cytosine as the first base of the seed adjacent to the PAM, cytosine is preferred as the first base at the 5’ end, and adenine is preferred in the middle of the sequence. This design rule is based on stability linked to the formation of G-quadruplexes.
- Kim et al. performed a ChIP showing that delivering purified Cas9 together with the sgRNA produced low off-target effects, which suggests that additional factors contribute to these effects.
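The positional rules in the list above can be made concrete with a small sketch. It is only an illustration under stated assumptions, not a validated off-target scorer: the function names, the 12-nt seed length (the upper end of the 10-12 nt range cited above), the 6-nt PAM-proximal window, and the run length of four uracils are all hypothetical choices made here. Guide and target are assumed to be equal-length strings written 5’ to 3’, with the PAM lying immediately after the last base.

```python
SEED_LENGTH = 12  # assumption: PAM-proximal bases treated as the seed


def count_mismatches(guide, target, seed_length=SEED_LENGTH):
    """Count guide/target mismatches, split into PAM-distal and seed regions."""
    if len(guide) != len(target):
        raise ValueError("guide and target must have the same length")
    if not 0 <= seed_length <= len(guide):
        raise ValueError("seed_length must fit inside the guide")
    # Compare on a DNA alphabet so an RNA guide (U) matches a DNA target (T).
    guide = guide.upper().replace("U", "T")
    target = target.upper().replace("U", "T")
    seed_start = len(guide) - seed_length  # seed is the 3' (PAM-proximal) end
    distal = sum(guide[i] != target[i] for i in range(seed_start))
    seed = sum(guide[i] != target[i] for i in range(seed_start, len(guide)))
    return {"distal": distal, "seed": seed, "total": distal + seed}


def pam_proximal_gc(guide, window=6):
    """Number of G or C bases in the last `window` bases next to the PAM."""
    tail = guide.upper()[-window:]
    return sum(base in "GC" for base in tail)


def has_u_run(guide, run_length=4):
    """True if the guide contains a run of U (or T) of at least `run_length`.

    The threshold of 4 is an assumption, used here as a proxy for the
    transcription termination risk described for uracil-rich guides.
    """
    return "T" * run_length in guide.upper().replace("U", "T")


if __name__ == "__main__":
    guide = "GACGTTACCGGATCAGCTGG"   # made-up 20-nt guide
    target = "TAGGTTACCGGATCAGCTAG"  # made-up site with three mismatches
    print(count_mismatches(guide, target))  # {'distal': 2, 'seed': 1, 'total': 3}
    print(pam_proximal_gc(guide))           # 4
    print(has_u_run(guide))                 # False
```

Separating seed from PAM-distal mismatches mirrors the observation that 5’ mismatches are better tolerated than those near the PAM. A real design tool would weight positions individually rather than use a single cutoff.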
It is important to note that DNA methylation at CpG sites reduces the binding efficiency of Cas9 and of other factors in cells. There is therefore an epigenetic link, which will be explored further for the future of epigenome editing.
Drug Designing: Open Access