Abstract

The Biological General Repository for Interaction Datasets (BioGRID: http://thebiogrid.org) is an open access archive of genetic and protein interactions that are curated from the primary biomedical literature for all major model organism species. As of September 2012, BioGRID houses more than 500 000 manually annotated interactions from more than 30 model organisms. BioGRID maintains complete curation coverage of the literature for the budding yeast Saccharomyces cerevisiae, the fission yeast Schizosaccharomyces pombe and the model plant Arabidopsis thaliana. A number of themed curation projects in areas of biomedical importance are also supported. BioGRID has established collaborations and/or shares data records for the annotation of interactions and phenotypes with most major model organism databases, including Saccharomyces Genome Database, PomBase, WormBase, FlyBase and The Arabidopsis Information Resource. BioGRID also actively engages with the text-mining community to benchmark and deploy automated tools to expedite curation workflows. BioGRID data are freely accessible through both a user-defined interactive interface and in batch downloads in a wide variety of formats, including PSI-MI2.5 and tab-delimited files. BioGRID records can also be interrogated and analyzed with a series of new bioinformatics tools, which include a post-translational modification viewer, a graphical viewer, a REST service and a Cytoscape plugin.

INTRODUCTION

The architecture and function of cellular interaction networks underpin the complex behavior of living systems. How these networks respond to both internal cues and exogenous stimuli, and how environmental and/or genetic perturbations affect these responses, is critical for understanding the molecular basis of human disease (1–3). Significant efforts have been made to chart the interaction networks of model organisms (4–7), based on advances in experimental techniques that allow the systematic exploration of biological interactions, both in vivo and in vitro (8,9). The integration of these various experimental datasets has begun to enable computational models of cellular interaction networks and the prediction of individual gene function in the regulation of cellular physiology.

The systematic curation of biological data, including protein and genetic interactions, is essential for computational biology and for the interpretation of genetic variation and disease associations revealed by genome-sequencing efforts (10,11). Biological interaction databases allow curated experimental datasets that would otherwise be dispersed in the biomedical literature to be accessed and exploited. These databases thus act as central repositories that provide a wealth of interaction data in a unified and common format, and thereby facilitate the exploration, visualization and integrative analysis of biological interaction networks. The Biological General Repository for Interaction Datasets (BioGRID: http://thebiogrid.org) is an open access database committed to the annotation of genetic and physical interactions between genes or gene products across all major model organism species. BioGRID is now a widely used resource that provides interaction datasets directly to the biological and computational communities, as well as to several model organism database (MOD) partners. BioGRID data records can be used by the biomedical research community to generate and explore specific hypotheses about gene and network function, and as a benchmark for newly generated experimental high-throughput datasets.

DATA CONTENT AND ACCESS

Since our 2011 NAR Database report (12), the number of interactions curated and amassed in BioGRID has increased by >50%, from 347 966 to 527 569 (Table 1). As of September 2012 (version 3.1.92), BioGRID contains 527 569 protein and genetic interactions, of which 360 375 are non-redundant interactions. These interactions correspond to 309 819 (209 354 non-redundant) protein interactions and 217 750 (157 849 non-redundant) genetic interactions (Table 1). The data were directly extracted from 33 858 manually annotated peer-reviewed publications, which were identified from the corpus of biomedical literature by keyword searches, text-mining approaches and manual inspection of candidate abstracts. All BioGRID interaction records are directly mapped to experimental evidence in the supporting publication, as classified by a structured set of evidence codes (12).

Table 1.

Increase in BioGRID data content since 2011 NAR Database Update

Organism                   Type  August 2010 (3.0.67)             September 2012 (3.1.92)
                                 Nodes    Edges     Publications  Nodes    Edges     Publications
Arabidopsis thaliana       PI    1735     4719      747           5915     16 476    1118
                           GI    88       174       55            107      188       62
Caenorhabditis elegans     PI    2813     4663      12            2927     5010      93
                           GI    1030     2112      5             1109     2326      22
Drosophila melanogaster    PI    7396     24 480    167           7998     35 843    314
                           GI    982      9994      1466          1023     9934      1468
Homo sapiens               PI    9467     48 368    10 203        14 896   123 436   17 134
                           GI    479      463       178           1291     1609      237
Saccharomyces cerevisiae   PI    5783     90 769    5444          6003     114 506   6601
                           GI    5357     146 081   5606          5561     189 692   6686
Schizosaccharomyces pombe  PI    1441     4019      769           1773     6019      968
                           GI    1340     11 527    953           1907     14 015    1158
Other organisms            ALL   2288     2985      830           8435     15 978    2724
Total                      ALL   30 665   347 966   23 451        44 515   527 569   33 858

Data drawn from monthly releases 3.0.67 and 3.1.92 of BioGRID. Nodes refer to genes or proteins, edges refer to interactions. PI, protein interaction; GI, genetic interaction.

BioGRID curation is focused on the parallel approaches of model organism-oriented curation and themed curation in human biology and disease. In addition to housing curated interaction data for more than 30 organisms, BioGRID has achieved exhaustive annotation of the literature for the budding yeast Saccharomyces cerevisiae (304 198 interactions), the fission yeast Schizosaccharomyces pombe (20 034 interactions) and the model plant Arabidopsis thaliana (16 664 interactions) (Table 1). These datasets are updated monthly and are directly linked from the respective MODs, Saccharomyces Genome Database (SGD) (13), PomBase (14) and The Arabidopsis Information Resource (TAIR) (15).
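As a quick consistency check, the organism totals quoted above can be recomputed from the edge counts in Table 1. The sketch below is illustrative only: the numbers are transcribed from Table 1, while the dictionary layout and function names are hypothetical and not part of any BioGRID tool.

```python
# Edge (interaction) counts for release 3.1.92, transcribed from Table 1.
EDGES_2012 = {
    "Saccharomyces cerevisiae": {"PI": 114_506, "GI": 189_692},
    "Schizosaccharomyces pombe": {"PI": 6_019, "GI": 14_015},
    "Arabidopsis thaliana": {"PI": 16_476, "GI": 188},
}

# Total edges in each release, from the final row of Table 1.
TOTAL_EDGES = {"3.0.67": 347_966, "3.1.92": 527_569}


def organism_total(name):
    """Sum protein (PI) and genetic (GI) interactions for one organism."""
    return sum(EDGES_2012[name].values())


def percent_increase(old, new):
    """Relative growth from old to new, in percent."""
    return 100.0 * (new - old) / old


growth = percent_increase(TOTAL_EDGES["3.0.67"], TOTAL_EDGES["3.1.92"])

if __name__ == "__main__":
    for name in EDGES_2012:
        print(f"{name}: {organism_total(name)} interactions")
    print(f"Increase in total edges: {growth:.1f}%")
```

The per-organism sums reproduce the figures given in the text. The total row of Table 1 is not a plain sum of the organism rows, so only the per-organism totals and the overall growth are checked here.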

The complete manual annotation of all human interaction data documented in the biomedical literature remains a daunting task due to the sheer number of potentially relevant publications, now well in excess of 12 million papers in PubMed. To enable meaningful insights into human interaction networks, we have undertaken comprehensive curation of interactions in particular areas of biomedical interest. Current focused projects include central signaling conduits implicated in development and disease, such as the target of rapamycin (TOR), Wnt and TGF-β networks, disease-centric networks in breast cancer and HIV, and vital global processes such as the chromatin modification (CM) (16) and ubiquitin–proteasome systems (UPS). For example, the complex network of chromatin modifications that controls gene expression is dictated by at least 470 human genes annotated by the Gene Ontology (GO) process term ‘chromatin modification’ (16). Based on searches and text mining with this gene set, we recently curated more than 15 000 prioritized publications to yield 57 141 protein interactions from 7561 papers. In another example of a global cellular function, conjugation of the small conserved protein ubiquitin to myriad substrates controls the stability, activity and localization of most of the proteome (17). We manually annotated a set of 1140 genes that mediate the core functions of the UPS, including E1, E2 and E3 enzymes, deubiquitinating enzymes, ubiquitin-binding domain proteins, and proteasome core and auxiliary subunits. To date, we have curated more than 5800 publications that bear evidence for 48 679 interactions (24 400 non-redundant interactions) in the UPS. These and other anticipated themed datasets will facilitate the prediction of individual gene function and network behavior within the major cellular regulatory systems.

DATA CURATION

Curation for BioGRID is performed by a dedicated team of PhD-level curators. A web-based interaction management system (IMS) is used to build prioritized publication queues for different projects and facilitate the curation process through structured pull-down menus. The history of all curated data is tracked to each individual curator. Curators also help guide direct deposition by authors, which is particularly useful for pre-publication annotation of large-scale datasets and allows immediate public release of the data upon publication.

Within the past 2 years, BioGRID curators have begun to use text-mining tools to prioritize the relevant literature for each curation project (18). In turn, BioGRID supports the text-mining community by providing a gold-standard collection of manually curated interactions for the BioCreative challenge (19–22), a community-wide effort for evaluating text mining and information extraction systems applied to the biological domain. We have also established collaborations with WormBase (23) and the development team for the Textpresso text-mining tool (24). For example, the curation queue for the Wnt-signaling network is prioritized based on text-mining results by Textpresso support vector machine (SVM) analyses, and ‘Textpresso for Wnt’ has also been set up as a text-mining interface to facilitate our curation. The overall curation pipeline of BioGRID is illustrated in Figure 1.

Figure 1.

BioGRID curation pipeline. The curation workflow consists of three major steps: (i) triage of the literature of interest by text-mining tools and/or interaction-directed PubMed queries; (ii) curation, annotation and tracking of interaction data through the web-based IMS and (iii) monthly public release of interaction data records.
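The triage step of this workflow can be pictured with a deliberately simple sketch. It is not the Textpresso SVM or any BioGRID code: a keyword score stands in for a trained classifier, and the keyword weights, class name, function names and example abstracts are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical keyword weights, chosen for illustration only.
KEYWORD_WEIGHTS = {
    "two-hybrid": 3.0,
    "co-immunoprecipitation": 3.0,
    "synthetic lethal": 3.0,
    "binds": 2.0,
    "interacts": 2.0,
}


@dataclass
class Abstract:
    pub_id: str
    text: str


def relevance(abstract):
    """Score an abstract by summing the weights of the keywords it contains."""
    text = abstract.text.lower()
    return sum(w for kw, w in KEYWORD_WEIGHTS.items() if kw in text)


def build_queue(abstracts, threshold=1.0):
    """Return publication ids at or above the threshold, highest score first."""
    scored = [(relevance(a), a.pub_id) for a in abstracts]
    kept = [pair for pair in scored if pair[0] >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [pub_id for _, pub_id in kept]


EXAMPLE = [
    Abstract("paper-1", "Gene Z interacts with gene W."),
    Abstract("paper-2", "A review of yeast ecology."),
    Abstract("paper-3", "Protein X binds protein Y in a two-hybrid assay."),
]

if __name__ == "__main__":
    print(build_queue(EXAMPLE))
```

In this toy queue, the abstract that mentions both binding and a two-hybrid assay is ranked first and the abstract with no interaction keywords is dropped. A production triage step would replace the keyword score with a classifier trained on curated examples.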

BioGRID actively collaborates with the extensive MOD community on different aspects of curation. For example, in collaboration with SGD, BioGRID curators have used the Yeast Phenotype Ontology (YPO) developed at SGD to assign structured phenotypes to over 200 000 budding and fission yeast genetic interactions. Collaborations are also underway with WormBase (23), ZFIN (25), FlyBase (26), MGI (27) and CGD (28) to coordinate interaction curation, and thereby leverage expertise and in-house MOD data that are relevant to biological interactions. For example, GO evidence codes generated by the MODs are often derived from publications that are likely to contain interaction data. Collaborations with the MODs have also led to an improved curation approach for higher organisms by implementation of species-specific phenotype ontologies and the broadening of interaction terms to capture more complex genetic interaction data. The different biology of various organisms used in biomedical research presents a formidable challenge in the annotation and interpretation of genetic interactions, and in the reconciliation of structured phenotypes across all species. In order to meet this challenge, in conjunction with WormBase (23), and supported by other MODs such as SGD (13), CGD (28), PomBase (14), FlyBase (26), TAIR (15) and ZFIN (25), we have developed a universal genetic interaction (GI) ontology that enables the annotation of more complex phenotypic outcomes associated with genetic interactions from higher organisms. The genetic interaction ontology has been submitted to the PSI-MI editorial committee (29) and will be made publicly available with the next official PSI-MI ontology release.

DATABASE ARCHITECTURE AND USER INTERFACE

In order to ensure consistent reliability and accessibility of the BioGRID web interface, we have migrated the BioGRID to a cloud-based server system with a third party provider that provides up-to-date hardware, facile operating system upgrades and improved fault tolerance. BioGRID 3.2 supports ∼28 million systematic names, aliases, official symbols and external identifiers from Ensembl (30), UniProtKB (31), NCBI Entrez-Gene (32), GenBank (33), SGD (13), WormBase (23), FlyBase (26), HGNC (34), MGD (27), TAIR (15), VectorBase (35), BeeBase (36), ZFIN (25) and HPRD (37), among other sources. BioGRID currently also supports annotation for more than 85 organisms and contains interaction data for more than 30 different species. The BioGRID web service (webservice.thebiogrid.org) has been completely redesigned to run on the new decentralized database architecture for better access and maintenance by developers. The new web service will facilitate the incorporation of BioGRID data in other databases and applications. Additional new documentation in the BioGRID wiki provides comprehensive instructions for this resource.

The BioGRID 3.2 web interface has been upgraded to include an integrated post-translational modification (PTM) viewer. This viewer highlights PTM sites on protein sequences and incorporates much of the functionality available in PhosphoGRID (Figure 2). PTM sites are colored within protein sequences according to the modification type, with clickable functions that display details such as publications, evidence codes and enzymatic relationships. The BioGRID currently supports both phosphorylation and ubiquitination sites and will expand to cover other major PTMs across all supported species.

Figure 2.

PTM display features. (A) Button to reveal PTM sites. (B) Statistics for different types of PTMs. (C) Pop-up with links to publications that document PTM evidence and relationships. (D) Tabular view of PTM site locations and links to publications. (E) Tabular view of PTM relationships and links to publications. (F) Custom gene tags.
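The idea of highlighting PTM sites within a protein sequence, together with per-type statistics as in panel B, can be sketched in a few lines. This is a text-only illustration under stated assumptions: the sequence, site positions, tag format and function names are invented for the example and do not reflect how the BioGRID viewer is implemented.

```python
from collections import Counter


def mark_ptm_sites(sequence, sites):
    """Wrap each modified residue in a tag, using 1-based positions."""
    marked = []
    for position, residue in enumerate(sequence, start=1):
        ptm = sites.get(position)
        marked.append(f"[{residue}:{ptm}]" if ptm else residue)
    return "".join(marked)


def ptm_statistics(sites):
    """Count the annotated sites for each modification type."""
    return Counter(sites.values())


# Invented example: a six-residue sequence with three annotated sites.
SEQUENCE = "MSTKAS"
SITES = {2: "phospho", 4: "ubiquitin", 6: "phospho"}

if __name__ == "__main__":
    print(mark_ptm_sites(SEQUENCE, SITES))
    print(dict(ptm_statistics(SITES)))
```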

To facilitate exploration of the biological datasets in BioGRID, we have developed a new gene tag feature for specific annotation, including membership in network-specific cohorts, gene functions or detailed attributes such as PTM site information. These gene tags can be used to build customized datasets for downloads and to define criteria for building and maintaining project-specific datasets, for example those defined by themed curation projects. These datasets may be maintained in concert with monthly BioGRID updates and are subject to strict version controls that allow reference to specific builds for data analysis. Project-specific datasets, such as the CM and UPS datasets, will be accessible through custom gateways within the BioGRID that encompass genes, interactions, publications and biological context for the project.

Graphical network representation provides an intuitive summary of an interaction query dataset and, when appropriately configured with a dynamic interface, can be used to inspect and further query a network of interest. However, a drawback of current network visualization software is that the graphical output becomes cluttered as network complexity increases. To address this issue, we have developed a new dynamic BioGRID interaction viewer that is based on a simple visual layout and has user-friendly filters (Figure 3). In the BioGRID viewer, all interaction nodes are distributed in a circular layout with the query gene in the center. Hovering over an interactor of interest displays a tooltip with gene information, including species type, gene acronym, gene identifier and number of interactions in BioGRID. To facilitate retrieval of data types of interest, the viewer provides the user with a check-box filter to reduce the complexity of the graph. The user can thus choose to view only those interactions supported by particular types of experimental evidence, low- versus high-throughput data or species-specific data. The network can also be extended to include all known interactions between interactors for the query. Results of the filtered and/or extended query are downloadable in tab2 format with a single click. The BioGRID viewer is based on an open source widget, downloadable from GitHub (https://github.com), and is embeddable in any web page. Network images in the viewer are rendered from interactions retrieved from the BioGRID REST service. The viewer is implemented using the D3.js (http://d3js.org/) library and requires a browser that supports JavaScript and SVG, such as current versions of Chrome, Firefox and Safari.

Figure 3.

The BioGRID viewer. (A) In the example shown, the viewer returns all the interactions of the query gene/protein and the interactions between the hit genes/proteins. Properties of individual interactions are revealed by hovering over the interaction node of interest. (B) Network view can be simplified by several available filters. In this example, all genetic interactions have been filtered out. See text for further details.
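The filter and extension behavior described for the viewer can be expressed as simple operations on a list of interaction records. The sketch below is a Python illustration rather than the viewer's own JavaScript; the record fields, function names and gene labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    a: str
    b: str
    kind: str        # "physical" or "genetic"
    throughput: str  # "low" or "high"


def interactions_of(query, records):
    """All records in which the query gene takes part."""
    return [r for r in records if query in (r.a, r.b)]


def apply_filters(records, kinds=None, throughputs=None):
    """Keep records matching the ticked check-boxes; None means no restriction."""
    return [
        r for r in records
        if (kinds is None or r.kind in kinds)
        and (throughputs is None or r.throughput in throughputs)
    ]


def extend(query, shown, records):
    """Add known interactions between the interactors of the query."""
    partners = {r.b if r.a == query else r.a for r in shown}
    partners.discard(query)  # ignore self-interactions of the query
    extra = [
        r for r in records
        if r.a in partners and r.b in partners and r not in shown
    ]
    return shown + extra


RECORDS = [
    Interaction("Q", "A", "physical", "low"),
    Interaction("Q", "B", "genetic", "high"),
    Interaction("A", "B", "physical", "high"),
    Interaction("B", "C", "physical", "low"),
]

if __name__ == "__main__":
    shown = interactions_of("Q", RECORDS)
    print(apply_filters(shown, kinds={"physical"}))
    print(extend("Q", shown, RECORDS))
```

With these records, filtering out genetic interactions leaves only the Q to A edge, and extending the network adds the A to B edge but not B to C, because C is not an interactor of the query.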

DATA ACCESS AND DISTRIBUTION

BioGRID datasets are updated and archived every month and can be freely accessed through widely used community resources over the internet and a number of dedicated bioinformatic tools. Records are now available interactively through the BioGRID web search page for download in a variety of XML (PSI-MI 1.0, PSI-MI 2.5) and tabular (tab, tab2 and mitab) formats and are also available through NCBI Entrez-Gene (32), DroID (38) and GermOnline (39), through several major MODs such as FlyBase (26), TAIR (15), SGD (13) and PomBase (14), and through meta-databases such as STRING (40), iRefIndex (41) and Pathway Commons (42). BioGRID datasets can also be directly interrogated through network visualization and analysis suites, including the original Osprey viewer (43), Cytoscape (44) and GeneMANIA (45). Notably, BioGRID data have recently been dynamically integrated into the ProHits LIMS system (46) in order to allow real-time comparison of experimental mass spectrometry data to published data housed in BioGRID.
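Tab-delimited downloads lend themselves to simple scripted analysis. The following sketch reads a small tab-delimited sample and collapses it to non-redundant interactions. The column names are placeholders rather than the actual tab2 header, and treating an unordered gene pair plus interaction type as one non-redundant interaction is one plausible definition adopted for illustration, not necessarily the definition behind the counts reported here.

```python
import csv
import io

# Placeholder column names and rows; not the real tab2 header or real data.
SAMPLE = (
    "symbol_a\tsymbol_b\tsystem_type\tpublication\n"
    "GENE1\tGENE2\tphysical\tpaper-1\n"
    "GENE2\tGENE1\tphysical\tpaper-2\n"
    "GENE1\tGENE2\tgenetic\tpaper-3\n"
    "GENE3\tGENE1\tphysical\tpaper-1\n"
)


def read_rows(handle):
    """Parse tab-delimited text with a header line into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


def non_redundant(rows):
    """Collapse rows to unique (gene, gene, type) keys, ignoring A/B order."""
    unique = set()
    for row in rows:
        pair = tuple(sorted((row["symbol_a"], row["symbol_b"])))
        unique.add(pair + (row["system_type"],))
    return unique


rows = read_rows(io.StringIO(SAMPLE))
unique = non_redundant(rows)

if __name__ == "__main__":
    print(f"{len(rows)} raw records, {len(unique)} non-redundant interactions")
```

Here the two physical records for GENE1 and GENE2, reported by different publications and in opposite orientation, count once, which is why four raw records yield three non-redundant interactions.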

In 2012, Google Analytics reported that the BioGRID received on average 69 237 page views and 10 110 unique visitors per month, versus 64 298 page views and 9928 unique visitors per month in 2011. BioGRID data files were downloaded on average 6900 times per month, compared with 6400 downloads per month in 2011. These statistics do not include the widespread dissemination by the various partner websites listed above that host BioGRID interaction data. The BioGRID user base is located primarily in the USA (37%), followed by UK (8%), Germany (7%), Canada (6%), Japan (6%), China (5%), France (4%), India (4%), Spain (2%) and all other countries (21%).

In order to facilitate the access and interoperability of BioGRID data with multiple platforms, we recently developed a BioGRID representational state transfer (REST) service and a BioGRID plug-in for the widely used Cytoscape visualization system (47). The BioGRID REST service grants full URL-based access to the BioGRID data and also provides the user with specific parameters to filter the data by various attributes. For example, the REST service drives a related tool called BioGRID Webgraph that generates network representations from user-provided gene lists. The dedicated Cytoscape plug-in acts as a web service client that provides facile import and filtering of the full BioGRID dataset for visualization and analysis in Cytoscape (44).
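A URL-based query of this kind might be assembled as in the sketch below. Only the host name comes from the text; the `/interactions/` path and the parameter names (`geneList`, `searchNames`, `format`, `accessKey`, `taxId`) are assumptions that should be checked against the REST documentation in the BioGRID wiki, and the gene symbols and key are placeholders. The code builds the URL but does not contact the service.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Host as given in the text; the path is an assumption.
BASE_URL = "http://webservice.thebiogrid.org/interactions/"


def build_query_url(genes, access_key, tax_id=None, fmt="tab2"):
    """Assemble a query URL; all parameter names are assumed, not confirmed."""
    params = {
        "geneList": "|".join(genes),  # assumed pipe-separated gene list
        "searchNames": "true",
        "format": fmt,
        "accessKey": access_key,
    }
    if tax_id is not None:
        params["taxId"] = str(tax_id)
    return BASE_URL + "?" + urlencode(params)


def query_parameters(url):
    """Decode the query string of a URL into a flat dict."""
    return {k: v[0] for k, v in parse_qs(urlsplit(url).query).items()}


# 559292 is the NCBI taxonomy identifier for S. cerevisiae S288C.
url = build_query_url(["GENE1", "GENE2"], access_key="YOUR_KEY", tax_id=559292)

if __name__ == "__main__":
    print(url)
```

Keeping URL construction separate from the network call makes the filter parameters easy to inspect and test before any request is sent.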

FUTURE DEVELOPMENTS

The BioGRID will continue to provide the biomedical and biological research communities with up-to-date, high-quality and extensively annotated protein and genetic interaction data, along with the requisite software tools to search, visualize and analyze interaction datasets. BioGRID will also continue to participate in the IMEx consortium of interaction databases (48). In addition to ongoing curation of interactions for the major model organism species, we will expand species coverage in order to facilitate interolog analyses, in particular to enable comparison of interaction networks across model organism species and humans. We have recently initiated the systematic annotation of protein and genetic interactions for Candida albicans, which is an important emerging model system and a prevalent human pathogen. We have also initiated the annotation of the human HIV-1 interactome in the context of the Linking Animal Models to Human Disease Initiative (see http://www.lamhdi.org). These and other nascent projects will be facilitated by the development of more efficient text-mining tools through collaborations with Textpresso and others. This cross-species and themed approach to curation will enable new insights into human biology and disease by integration of interaction data from multiple model organism systems.

FUNDING

The National Institutes of Health National Center for Research Resources [R01RR024031 and R24RR032659 to M.T. and K.D.]; the Biotechnology and Biological Sciences Research Council [BB/F010486/1 to M.T.]; the Canadian Institutes of Health Research [FRN 82940 to M.T.]; the European Commission FP7 Program [2007-223411 to M.T.] and a Genome Québec International Recruitment Award and a Canada Research Chair in Systems and Synthetic Biology [to M.T.]. Funding for open access charge: The National Institutes of Health.

Conflict of interest statement. None declared.

ACKNOWLEDGEMENTS

The authors thank Mike Cherry, Paul Sternberg, Bill Gelbart, David Botstein, Henning Hermjakob, Shoshana Wodak, Anne-Claude Gingras, Gary Bader, Chris Sander, Val Wood, Gavin Sherlock, Ivan Sadowski, Lincoln Stein, Judy Blake, Monty Westerfield, Maryann Martone, Mark Ellisman and Olga Troyanskaya for many helpful discussions. Particular thanks are due to Chris Grove and other colleagues at WormBase for ongoing collaborative development of the genetic interaction ontology.

REFERENCES

1. Rozenblatt-Rosen O, Deo RC, Padi M, Adelmant G, Calderwood MA, Rolland T, Grace M, Dricot A, Askenazi M, Tavares M, et al. Interpreting cancer genomes using systematic host network perturbations by tumour virus proteins. Nature 2012, vol. 487 (pg. 491-495)
2. Ryan O, Shapiro RS, Kurat CF, Mayhew D, Baryshnikova A, Chin B, Lin ZY, Cox MJ, Vizeacoumar F, Cheung D, et al. Global gene deletion analysis exploring yeast filamentous growth. Science 2012, vol. 337 (pg. 1353-1356)
3. Jager S, Cimermancic P, Gulbahce N, Johnson JR, McGovern KE, Clarke SC, Shales M, Mercenne G, Pache L, Li K, et al. Global landscape of HIV-human protein complexes. Nature 2012, vol. 481 (pg. 365-370)
4. Breitkreutz A, Choi H, Sharom JR, Boucher L, Neduva V, Larsen B, Lin ZY, Breitkreutz BJ, Stark C, Liu G, et al. A global protein kinase and phosphatase interaction network in yeast. Science 2010, vol. 328 (pg. 1043-1046)
5. Babu M, Vlasblom J, Pu S, Guo X, Graham C, Bean BD, Burston HE, Vizeacoumar FJ, Snider J, Phanse S, et al. Interaction landscape of membrane-protein complexes in Saccharomyces cerevisiae. Nature 2012, vol. 489 (pg. 585-589)
6. Havugimana PC, Hart GT, Nepusz T, Yang H, Turinsky AL, Li Z, Wang PI, Boutz DR, Fong V, Phanse S, et al. A census of human soluble protein complexes. Cell 2012, vol. 150 (pg. 1068-1081)
7. LaCount DJ, Vignali M, Chettier R, Phansalkar A, Bell R, Hesselberth JR, Schoenfeld LW, Ota I, Sahasrabudhe S, Kurschner C, et al. A protein interaction network of the malaria parasite Plasmodium falciparum. Nature 2005, vol. 438 (pg. 103-107)
8. Bensimon A, Heck AJ, Aebersold R. Mass spectrometry-based proteomics and network biology. Annu. Rev. Biochem. 2012, vol. 81 (pg. 379-405)
9. Yu H, Tardivo L, Tam S, Weiner E, Gebreab F, Fan C, Svrzikapa N, Hirozane-Kishikawa T, Rietman E, Yang X, et al. Next-generation sequencing to generate interactome datasets. Nat. Methods 2011, vol. 8 (pg. 478-480)
10. Howe D, Costanzo M, Fey P, Gojobori T, Hannick L, Hide W, Hill DP, Kania R, Schaeffer M, St Pierre S, et al. Big data: the future of biocuration. Nature 2008, vol. 455 (pg. 47-50)
11. Ideker T, Dutkowski J, Hood L. Boosting signal-to-noise in complex biology: prior knowledge is power. Cell 2011, vol. 144 (pg. 860-863)
12. Stark C, Breitkreutz BJ, Chatr-aryamontri A, Boucher L, Oughtred R, Livstone MS, Nixon J, Van Auken K, Wang X, Shi X, et al. The BioGRID interaction database: 2011 update. Nucleic Acids Res. 2011, vol. 39 (pg. D698-D704)
13. Cherry JM, Hong EL, Amundsen C, Balakrishnan R, Binkley G, Chan ET, Christie KR, Costanzo MC, Dwight SS, Engel SR, et al. Saccharomyces Genome Database: the genomics resource of budding yeast. Nucleic Acids Res. 2012, vol. 40 (pg. D700-D705)
14. Wood V, Harris MA, McDowall MD, Rutherford K, Vaughan BW, Staines DM, Aslett M, Lock A, Bahler J, Kersey PJ, et al. PomBase: a comprehensive online resource for fission yeast. Nucleic Acids Res. 2012, vol. 40 (pg. D695-D699)
15. Lamesch P, Berardini TZ, Li D, Swarbreck D, Wilks C, Sasidharan R, Muller R, Dreher K, Alexander DL, Garcia-Hernandez M, et al. The Arabidopsis Information Resource (TAIR): improved gene annotation and new tools. Nucleic Acids Res. 2012, vol. 40 (pg. D1202-D1210)
16. Turinsky AL, Turner B, Borja RC, Gleeson JA, Heath M, Pu S, Switzer T, Dong D, Gong Y, On T, et al. DAnCER: disease-annotated chromatin epigenetics resource. Nucleic Acids Res. 2011, vol. 39 (pg. D889-D894)
17. Varshavsky A. The ubiquitin system, an immense realm. Annu. Rev. Biochem. 2012, vol. 81 (pg. 167-176)
18. Hirschman L, Burns GA, Krallinger M, Arighi C, Cohen KB, Valencia A, Wu CH, Chatr-Aryamontri A, Dowell KG, Huala E, et al. Text mining for the biocuration workflow. Database 2012, vol. 2012, April 18 (doi:10.1093/database/bas020; epub ahead of print)
19. Krallinger M, Leitner F, Vazquez M, Salgado D, Marcelle C, Tyers M, Valencia A, Chatr-aryamontri A. How to link ontologies and protein-protein interactions to literature: text-mining approaches and the BioCreative experience. Database 2012, vol. 2012, March 21 (doi:10.1093/database/bas017; epub ahead of print)
20. Arighi CN, Roberts PM, Agarwal S, Bhattacharya S, Cesareni G, Chatr-Aryamontri A, Clematide S, Gaudet P, Giglio MG, Harrow I, et al. BioCreative III interactive task: an overview. BMC Bioinformatics 2011, vol. 12 Suppl. 8 (pg. S4)
21. Krallinger M, Vazquez M, Leitner F, Salgado D, Chatr-Aryamontri A, Winter A, Perfetto L, Briganti L, Licata L, Iannuccelli M, et al. The protein-protein interaction tasks of BioCreative III: classification/ranking of articles and linking bio-ontology concepts to full text. BMC Bioinformatics 2011, vol. 12 Suppl. 8 (pg. S3)
22. Chatr-Aryamontri A, Winter A, Perfetto L, Briganti L, Licata L, Iannuccelli M, Castagnoli L, Cesareni G, Tyers M. Benchmarking of the 2010 BioCreative Challenge III text-mining competition by the BioGRID and MINT interaction databases. BMC Bioinformatics 2011, vol. 12 Suppl. 8 (pg. S8)
23. Yook K, Harris TW, Bieri T, Cabunoc A, Chan J, Chen WJ, Davis P, de la Cruz N, Duong A, Fang R, et al. WormBase 2012: more genomes, more data, new website. Nucleic Acids Res. 2012, vol. 40 (pg. D735-D741)
24. Muller HM, Kenny EE, Sternberg PW. Textpresso: an ontology-based information retrieval and extraction system for biological literature. PLoS Biol. 2004, vol. 2 (pg. e309)
25. Bradford Y, Conlin T, Dunn N, Fashena D, Frazer K, Howe DG, Knight J, Mani P, Martin R, Moxon SA, et al. ZFIN: enhancements and updates to the Zebrafish Model Organism Database. Nucleic Acids Res. 2011, vol. 39 (pg. D822-D829)
26. McQuilton P, St Pierre SE, Thurmond J. FlyBase 101–the basics of navigating FlyBase. Nucleic Acids Res. 2012, vol. 40 (pg. D706-D714)
27. Eppig JT, Blake JA, Bult CJ, Kadin JA, Richardson JE. The Mouse Genome Database (MGD): comprehensive resource for genetics and genomics of the laboratory mouse. Nucleic Acids Res. 2012, vol. 40 (pg. D881-D886)
28. Inglis DO, Arnaud MB, Binkley J, Shah P, Skrzypek MS, Wymore F, Binkley G, Miyasato SR, Simison M, Sherlock G. The Candida genome database incorporates multiple Candida species: multispecies search and analysis tools with curated gene and protein information for Candida albicans and Candida glabrata. Nucleic Acids Res. 2012, vol. 40 (pg. D667-D674)
29. Kerrien S, Orchard S, Montecchi-Palazzi L, Aranda B, Quinn AF, Vinod N, Bader GD, Xenarios I, Wojcik J, Sherman D, et al. Broadening the horizon–level 2.5 of the HUPO-PSI format for molecular interactions. BMC Biol. 2007, vol. 5 (pg. 44)
30. Flicek P, Amode MR, Barrell D, Beal K, Brent S, Carvalho-Silva D, Clapham P, Coates G, Fairley S, Fitzgerald S, et al. Ensembl 2012. Nucleic Acids Res. 2012, vol. 40 (pg. D84-D90)
31. UniProt Consortium. Reorganizing the protein space at the Universal Protein Resource (UniProt). Nucleic Acids Res. 2012, vol. 40 (pg. D71-D75)
32. Sayers EW, Barrett T, Benson DA, Bolton E, Bryant SH, Canese K, Chetvernin V, Church DM, Dicuccio M, Federhen S, et al. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2012, vol. 40 (pg. D13-D25)
33. Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW. GenBank. Nucleic Acids Res. 2011, vol. 39 (pg. D32-D37)
34. Seal RL, Gordon SM, Lush MJ, Wright MW, Bruford EA. genenames.org: the HGNC resources in 2011. Nucleic Acids Res. 2011, vol. 39 (pg. D514-D519)
35. Lawson D, Arensburger P, Atkinson P, Besansky NJ, Bruggner RV, Butler R, Campbell KS, Christophides GK, Christley S, Dialynas E, et al. VectorBase: a data resource for invertebrate vector genomics. Nucleic Acids Res. 2009, vol. 37 (pg. D583-D587)
36. Munoz-Torres MC, Reese JT, Childers CP, Bennett AK, Sundaram JP, Childs KL, Anzola JM, Milshina N, Elsik CG. Hymenoptera Genome Database: integrated community resources for insect species of the order Hymenoptera. Nucleic Acids Res. 2011, vol. 39 (pg. D658-D662)
37. Keshava Prasad TS, Goel R, Kandasamy K, Keerthikumar S, Kumar S, Mathivanan S, Telikicherla D, Raju R, Shafreen B, Venugopal A, et al. Human Protein Reference Database—2009 update. Nucleic Acids Res. 2009, vol. 37 (pg. D767-D772)
38. Murali T, Pacifico S, Yu J, Guest S, Roberts GG 3rd, Finley RL Jr. DroID 2011: a comprehensive, integrated resource for protein, transcription factor, RNA and gene interactions for Drosophila. Nucleic Acids Res. 2011, vol. 39 (pg. D736-D743)
39. Lardenois A, Gattiker A, Collin O, Chalmel F, Primig M. GermOnline 4.0 is a genomics gateway for germline development, meiosis and the mitotic cell cycle. Database 2010, vol. 2010, Dec 10 (doi:10.1093/database/baq030; epub ahead of print)
40. Szklarczyk D, Franceschini A, Kuhn M, Simonovic M, Roth A, Minguez P, Doerks T, Stark M, Muller J, Bork P, et al. The STRING database in 2011: functional interaction networks of proteins, globally integrated and scored. Nucleic Acids Res. 2011, vol. 39 (pg. D561-D568)
41. Razick S, Magklaras G, Donaldson IM. iRefIndex: a consolidated protein interaction database with provenance. BMC Bioinformatics 2008, vol. 9 (pg. 405)
42. Cerami EG, Gross BE, Demir E, Rodchenkov I, Babur O, Anwar N, Schultz N, Bader GD, Sander C. Pathway Commons, a web resource for biological pathway data. Nucleic Acids Res. 2011, vol. 39 (pg. D685-D690)
43. Breitkreutz BJ, Stark C, Tyers M. Osprey: a network visualization system. Genome Biol. 2003, vol. 4 (pg. R22)
44. Smoot ME, Ono K, Ruscheinski J, Wang PL, Ideker T. Cytoscape 2.8: new features for data integration and network visualization. Bioinformatics 2011, vol. 27 (pg. 431-432)
45. Warde-Farley D, Donaldson SL, Comes O, Zuberi K, Badrawi R, Chao P, Franz M, Grouios C, Kazi F, Lopes CT, et al. The GeneMANIA prediction server: biological network integration for gene prioritization and predicting gene function. Nucleic Acids Res. 2010, vol. 38 (pg. W214-W220)
46. Liu G, Zhang J, Larsen B, Stark C, Breitkreutz A, Lin ZY, Breitkreutz BJ, Ding Y, Colwill K, Pasculescu A, et al. ProHits: integrated software for mass spectrometry-based interaction proteomics. Nat. Biotechnol. 2010, vol. 28 (pg. 1015-1017)
47. Winter AG, Wildenhain J, Tyers M. BioGRID REST Service, BiogridPlugin2 and BioGRID WebGraph: new tools for access to interaction data at BioGRID. Bioinformatics 2011, vol. 27 (pg. 1043-1044)
48. Orchard S, Kerrien S, Abbani S, Aranda B, Bhate J, Bidwell S, Bridge A, Briganti L, Brinkman FSL, Cesareni G, et al. Protein interaction data curation: the International Molecular Exchange (IMEx) consortium. Nat. Methods 2012, vol. 9 (pg. 345-350)

Author notes

The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial reuse, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com.
