CRISPR-Cas is a tool that allows scientists to make targeted changes to an organism’s DNA. The tool consists of two parts. The first is a Cas nuclease (e.g., Cas9), a protein that can cleave DNA. The second is an RNA molecule, called a guide RNA or gRNA, that determines where the edits are made. By studying the biology and chemistry of how CRISPR-Cas functions, scientists can predict and design where DNA modifications will occur. However, these predictions often fail because genome structure and composition vary widely among organisms (for example, between humans and bacteria). This limits how scientists can use the CRISPR-Cas tool. To address this problem, researchers used artificial intelligence to better predict the tool’s behavior. Their approach used a novel set of quantum chemical properties, which apply the rules of quantum mechanics to molecules to describe how those molecules interact. Including these properties improved the accuracy of predicting where CRISPR-Cas genome engineering might occur.
This study used an approach called explainable artificial intelligence (XAI) to identify new biological features. It aimed to understand guide RNA design and how guide RNAs are associated with CRISPR-based genome edits. This could improve scientists’ ability to efficiently predict where edits will occur in a genome.
Scientists rely on models to predict where CRISPR-Cas tools act on an organism’s genome. The performance of these models is critically important because these modifications are irreversible. This study by scientists at Oak Ridge National Laboratory and the University of Tennessee, Knoxville aimed to improve the reliability of these tools by using explainable artificial intelligence to uncover new relationships between the guide RNA, an organism’s DNA, and the activity of CRISPR-based tools. The researchers used publicly accessible datasets to train an explainable artificial intelligence model called iterative Random Forest to predict how efficiently CRISPR-Cas9 can edit a specific DNA sequence with a specific guide RNA. Using this approach, the researchers discovered that quantum chemical features had the most significant effect on predicting guide RNA efficiency in both H. sapiens and E. coli. Moreover, the researchers found that the importance of different quantum chemical properties and of particular positions in the sequence varied between the two species. These findings underscore the need for further work in this field to improve the safety and reliability of CRISPR-Cas tools in non-model organisms.
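As a rough illustration of the general idea, and not the study's actual pipeline, the sketch below turns a guide RNA sequence into numeric features and then asks how much a model's error grows when one feature is shuffled (permutation importance). Everything in it is an assumption made for illustration: the sequences are made up, the features are simple one-hot positions plus GC fraction rather than the quantum chemical properties used in the study, and `toy_model` is a hypothetical stand-in for a trained iterative Random Forest.

```python
import random

BASES = "ACGT"


def encode_guide(seq):
    """Return (feature_names, feature_values) for one guide sequence.

    Features: a one-hot flag per position and base, plus overall GC fraction.
    These are illustrative features only, not the ones used in the study.
    """
    seq = seq.upper()
    names, values = [], []
    for position, base in enumerate(seq, start=1):
        for candidate in BASES:
            names.append(f"pos{position}_{candidate}")
            values.append(1.0 if base == candidate else 0.0)
    names.append("gc_fraction")
    values.append(sum(base in "GC" for base in seq) / len(seq))
    return names, values


def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, column, rng):
    """Increase in error after shuffling one feature column across rows."""
    baseline = mean_squared_error(y, [predict(row) for row in X])
    shuffled = [row[column] for row in X]
    rng.shuffle(shuffled)
    X_perm = [row[:column] + [v] + row[column + 1:] for row, v in zip(X, shuffled)]
    permuted = mean_squared_error(y, [predict(row) for row in X_perm])
    return permuted - baseline


def toy_model(row):
    """Hypothetical stand-in for a trained model: uses GC fraction only."""
    return row[-1]


# Made-up 20-nucleotide guide sequences, for illustration only.
guides = [
    "GACGTTAGCCATGCAATCGG",
    "ATATTAGCATTAACGTATAT",
    "GGCCGGCCGCGCGGCCGCGC",
    "TTGACCATGAATTCGGATCA",
]
feature_names, _ = encode_guide(guides[0])
X = [encode_guide(g)[1] for g in guides]
y = [toy_model(row) for row in X]  # synthetic "efficiency" labels

rng = random.Random(0)
gc_column = feature_names.index("gc_fraction")
gc_importance = permutation_importance(toy_model, X, y, gc_column, rng)
pos1_importance = permutation_importance(toy_model, X, y, 0, rng)
print("gc_fraction importance:", gc_importance)
print("pos1_A importance:", pos1_importance)
```

Because the toy model reads only the GC fraction, shuffling any positional feature leaves its error unchanged, while shuffling GC fraction can only increase it. With a real trained model and real efficiency measurements, the same comparison across features, and across species, is what lets researchers say which properties matter most.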
Boris Wawrik, Program Manager
Department of Energy Office of Science, Biological and Environmental Research
Dan Jacobson, Principal Investigator
Biosciences Division, Oak Ridge National Laboratory
The research was supported by the Secure Ecosystem Engineering and Design project funded by the Genomic Science Program of the Department of Energy Office of Science, Office of Biological and Environmental Research program as part of the Secure Biosystems Design Science Focus Area. One of the researchers was supported by the Center for Bioenergy Innovation, a Department of Energy Research Center.
Noshay, J.M., et al., Quantum biological insights into CRISPR-Cas9 sgRNA efficiency from explainable-AI driven feature engineering. Nucleic Acids Research 51, 19 (2023). [DOI: 10.1093/nar/gkad736]