Lorrie Boucher, Ashton Breitkreutz, Bobby-Joe Breitkreutz, Christie Chang, Andrew Chatr-Aryamontri, Kara Dolinski, Sven Heinicke, Nadine Kolas, Lara O'Donnell, Sara Oster, Rose Oughtred, Jennifer Rust, Adnane Sellam, Chris Stark, Jean Tang, Chandra Theesfeld, Mike Tyers.
The BioGRID was originally published and released as simply the General Repository for Interaction Datasets[2] but was later renamed the BioGRID[1] to describe the project more concisely and to help distinguish it from several GRID Computing projects with a similar name. Originally separated into organism-specific databases, the newest version provides a unified front end that allows searches across several organisms simultaneously. The BioGRID was developed initially as a project at the Lunenfeld-Tanenbaum Research Institute at Mount Sinai Hospital but has since expanded to include teams at the Institut de Recherche en Immunologie et en Cancérologie at the Université de Montréal and the Lewis-Sigler Institute for Integrative Genomics at Princeton University. The BioGRID's original focus was the curation of binary protein-protein and genetic interactions, but its scope has expanded over several updates[1][3][4][5][6][7][8] to incorporate curated post-translational modification data,[9][10] chemical interaction data, and complex multi-gene/protein interactions. Moreover, on a monthly basis, the BioGRID continues to expand its curated data, to develop and release new tools[9][10][11][12] and data from comprehensive targeted curation projects,[13] and to perform targeted scientific analysis.[14]
Curation of Genetic, Protein, and Chemical Interactions
The Biological General Repository for Interaction Datasets (BioGRID) is an open access database that houses genetic and protein interactions curated from the primary biomedical literature for all major model organism species and humans. As of 18 October 2020,[15] the BioGRID contains 1.928 million interactions drawn from 63,083 publications that represent 71 model organisms. At the start of 2021 it already contained more than 2.0 million biological interactions, 29,023 chemical-protein interactions, and 506,485 post-translational modifications collectively curated from 75,988 publications for more than 80 species.[16] BioGRID data are freely distributed through partner model organism databases and meta-databases and are directly downloadable in a variety of formats. BioGRID curation is coordinated through an Interaction Management System (IMS) that facilitates the compilation of interaction records through structured evidence codes, phenotype ontologies, and gene annotation. The BioGRID architecture has been improved to support a broader range of interaction and post-translational modification types, to allow the representation of more complex multi-gene/protein interactions, to account for cellular phenotypes through structured ontologies, to expedite curation through semi-automated text mining approaches, and to enhance curation quality control. Through comprehensive curation efforts, BioGRID now includes a virtually complete set of interactions reported to date in the primary literature for budding yeast (Saccharomyces cerevisiae), thale cress (Arabidopsis thaliana), and fission yeast (Schizosaccharomyces pombe).
Themed Curation Projects
Because of the overwhelming volume of published scientific literature containing human (Homo sapiens) gene, protein, and chemical interactions, BioGRID has taken a targeted, project-based approach to the curation of human interaction data, organizing it into manageable collections of high-impact data. These themed curation projects represent central biological processes with disease relevance, such as chromatin modification, autophagy, and the ubiquitin-proteasome system, or diseases of interest, including glioblastoma, Fanconi anemia, and COVID-19. As of 18 October 2020,[15] BioGRID themed curation project efforts have resulted in the extraction of 424,631 interactions involving 2,361 proteins from more than 37,000 scientific articles.
Curation of Genome-Wide CRISPR Screens
CRISPR-based genetic screens have now been reported in numerous publications that link gene function to cell viability, chemical and stress resistance, and other phenotypes. To increase the accessibility of CRISPR screen data and facilitate assignment of protein function, BioGRID has developed an embedded resource called the Open Repository of CRISPR Screens (ORCS)[7][15] to house and distribute manually curated, comprehensive collections of CRISPR screen datasets using Cas9 and other CRISPR nucleases. As of 18 October 2020,[15] BioGRID-ORCS contains more than 1,042 CRISPR screens curated from 114 publications, representing more than 60,000 unique genes across three species, human (Homo sapiens), fruit fly (Drosophila melanogaster), and house mouse (Mus musculus), in over 670 cell lines and 17 phenotypes.
Supported Organisms
The following organisms are currently supported within the BioGRID, and each has curated interaction data available according to the latest statistics.