NSF Org: | DBI Div Of Biological Infrastructure |
Recipient: |
|
Initial Amendment Date: | March 23, 2015 |
Latest Amendment Date: | September 21, 2017 |
Award Number: | 1458400 |
Award Instrument: | Continuing Grant |
Program Manager: | Peter McCartney, DBI Div Of Biological Infrastructure, BIO Direct For Biological Sciences |
Start Date: | April 1, 2015 |
End Date: | March 31, 2020 (Estimated) |
Total Intended Award Amount: | $1,420,341.00 |
Total Awarded Amount to Date: | $1,420,341.00 |
Funds Obligated to Date: | FY 2016 = $473,305.00; FY 2017 = $482,900.00 |
History of Investigator: |
|
Recipient Sponsored Research Office: | 220 ARCH ST OFC LEVEL2, BALTIMORE, MD, US 21201-1531, (410)706-3559 |
Sponsor Congressional District: |
|
Primary Place of Performance: | 801 W. Baltimore Street, Baltimore, MD, US 21201-1109 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | ADVANCES IN BIO INFORMATICS |
Primary Program Source: | 01001617DB NSF RESEARCH & RELATED ACTIVIT; 01001718DB NSF RESEARCH & RELATED ACTIVIT |
Program Reference Code(s): | |
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.074 |
ABSTRACT
Researchers generate biological data using many diverse methods that range from laboratory experiments to computer-based analyses. These data serve as the evidence that researchers use to make inferences and draw scientific conclusions. The process of biocuration seeks to capture these conclusions and the evidence that led to them in a standardized way so that the information is readily accessible to the entire scientific community. The most efficient way to accomplish this is to use an ontology to describe the evidence types. An ontology is a controlled vocabulary of terms where each term is carefully defined and linked to other terms by precise relationships. The Evidence Ontology (EO) is a community standard for describing types of research evidence used to support scientific conclusions in biological research. The EO is used by some of the world's most prominent protein databases and genomic resources to capture evidence information. The goal of this project is to improve EO and promote its use by a larger community of researchers. The EO will be promoted through outreach, training, and education efforts, including workshops and internships. Broader impacts will include outreach efforts to Baltimore City Public Schools students focusing on teaching the importance of structuring information in a controlled way. Summer interns will engage in EO development and bioinformatics activities. A vast number of scientists researching a wide range of biological topics will benefit from the continued development and expanded use of the EO.
The ability to describe both evidence and assertion method (i.e., whether a human or a machine makes a statement) in a consistent and computable fashion is essential for multiple reasons. First, capture of methodology is central to the scientific method and can impact evaluation of results. Second, associating structured evidence with stored data allows for selective data queries and retrieval from even the largest databases. Third, structured evidence systems make automated quality control possible, which is essential for large-scale data management. Nearly 30 biological resources, including protein databases, model organism databases, phenotype resources, and gene expression databases, currently use the Evidence Ontology (EO) to capture evidence information, support structured data queries, group related data, or establish quality control mechanisms. EO will be developed further to address structural issues, clarify the main axis, add logical constraints, and map EO to related resources. New evidence types will be continually added to the ontology based on the needs of the research community. A web resource will be created that includes improved visualization tools for evidence and data associated with EO terms, complete user documentation, and downloadable content. The EO project will also develop quality assessment methodologies to enable researchers to better evaluate evidence. Outreach, training, and education will be conducted to grow the EO user base and educate researchers and students about the value and means of capturing evidence. EO developers will present at scientific conferences, publish papers, host interns, and conduct workshops and science outreach activities. By improving EO and increasing user awareness, the project will help researchers make the most of evidence and associated data. For more information, please visit: http://evidenceontology.org.
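As a rough illustration of the selective-query and quality-control points made in this abstract, the following Python sketch stores an evidence type and an assertion method alongside each statement. The record layout, the evidence labels, the entity names, and the function names are all assumptions invented for this example; they are not taken from EO or from any database that uses it.

```python
from dataclasses import dataclass

# Hypothetical controlled values, for illustration only. A real resource
# would use identifiers drawn from the ontology itself.
KNOWN_EVIDENCE = {"experimental", "computational"}
KNOWN_METHODS = {"manual", "automatic"}


@dataclass(frozen=True)
class Annotation:
    entity: str     # e.g. a protein identifier (made up here)
    statement: str  # the conclusion being asserted
    evidence: str   # type of evidence supporting the statement
    method: str     # whether a human or a machine made the assertion


def select(annotations, evidence=None, method=None):
    """Return annotations matching the requested evidence type and/or method."""
    return [
        a for a in annotations
        if (evidence is None or a.evidence == evidence)
        and (method is None or a.method == method)
    ]


def quality_issues(annotations):
    """Return annotations whose evidence or method is not a recognized value."""
    return [
        a for a in annotations
        if a.evidence not in KNOWN_EVIDENCE or a.method not in KNOWN_METHODS
    ]


records = [
    Annotation("protA", "binds DNA", "experimental", "manual"),
    Annotation("protB", "binds DNA", "computational", "automatic"),
    Annotation("protC", "kinase activity", "", "manual"),  # evidence missing
]

manual_experimental = select(records, evidence="experimental", method="manual")
flagged = quality_issues(records)
```

Because the evidence and method fields are drawn from a fixed vocabulary, both the filter and the check reduce to simple set and equality tests, which is what makes them practical to run over very large collections.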
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
The Evidence and Conclusion Ontology (ECO) is a controlled vocabulary of terms that describe types of evidence. It is an important resource used in the capture of information about biological entities, a process called biocuration. ECO terms are used in biocuration to capture the evidence that supports biological statements. Such statements link together different types of information; for example, a molecular function can be linked to a protein, indicating that the protein carries out that function. When making such statements, it is very important to capture what type of evidence supports the statement. Evidence capture supports the ability to track provenance, carry out effective data mining, and engage in quality control of biocuration processes.
ECO is a community standard for evidence capture. It is in use by more than 40 user groups involved in all aspects of biological research. Most of these user groups are themselves resources that provide bioinformatics information to many (often hundreds or thousands of) individual users. Some of the most prominent and widely used bioinformatics resources use ECO in their annotation processes, including UniProt (a database containing sequences of, and information about, many millions of proteins), the Gene Ontology (providing vocabularies of terms to describe aspects of gene products such as function and cellular location), and multiple model organism databases (which collect and disseminate huge volumes of information associated with specific species).
Under this funding award, significant development of ECO and its associated resources has taken place. At the start of this funding, ECO contained 581 terms; it now contains more than 1800. Generation of new terms in ECO is driven primarily by the needs of our user community. Users can request new terms through our GitHub issue tracker (https://github.com/evidenceontology). (GitHub is an online resource where software and other informatics resources are made available to users.) Terms are often developed for specific areas of biology in collaboration with other ontologies or user groups. In addition to term development, we have also produced other resources. The ECO-CollecTF corpus is a collection of sentences from published biological literature that have been linked to ECO terms. This corpus was produced in collaboration with CollecTF (http://collectf.umbc.edu/browse/home/) and consists of 2565 annotations of ECO terms to sentences, or pairs of sentences, from 84 publications. Additional ECO resources are accessible through the project website (http://evidenceontology.org/), which includes documentation on the project and its history, a browser to explore the ontology, links to the ECO GitHub repository, directions on how to request new ECO terms, a place to download ECO in OBO or OWL formats, an Annotation Resources section with a link to the ECO-CollecTF corpus, and links to publications. ECO continuously collaborates with other ontologies including, but not limited to, the Gene Ontology, the Ontology of Biomedical Investigations, the Disease Ontology, and the Ontology of Microbial Phenotypes. ECO is freely available for download from GitHub (https://github.com/evidenceontology) or the ECO website (http://evidenceontology.org/) and is released into the public domain under a CC0 1.0 Universal license.
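For readers who download ECO in OBO format, the sketch below shows one way the term stanzas of such a file could be read with plain Python. It is a minimal illustration, not an ECO-provided tool: the function names are hypothetical, only the `id`, `name`, and `is_a` tags are handled, and the sample text is hand-written for this example and should be checked against an actual release before being relied on.

```python
def parse_obo_terms(text):
    """Parse [Term] stanzas from OBO-formatted text into a list of dicts."""
    terms = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza begins; only [Term] stanzas are collected.
            current = {"is_a": []} if line == "[Term]" else None
            if current is not None:
                terms.append(current)
        elif current is not None and ": " in line:
            tag, value = line.split(": ", 1)
            # Drop the trailing "! comment" that may follow a value.
            value = value.split(" ! ", 1)[0].strip()
            if tag == "is_a":
                current["is_a"].append(value)
            elif tag in ("id", "name"):
                current[tag] = value
    return terms


def children_of(terms, parent_id):
    """Return the names of terms that list parent_id as a direct is_a parent."""
    return [t["name"] for t in terms if parent_id in t["is_a"]]


# Hand-written sample in OBO syntax (an assumption for illustration).
SAMPLE = """\
format-version: 1.2

[Term]
id: ECO:0000000
name: evidence

[Term]
id: ECO:0000006
name: experimental evidence
is_a: ECO:0000000 ! evidence
"""

terms = parse_obo_terms(SAMPLE)
direct_children = children_of(terms, "ECO:0000000")
```

A dedicated ontology library would be a better choice for real work, since it would also handle obsolete terms, synonyms, and relationships other than `is_a`.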
Last Modified: 08/27/2020
Modified by: Michelle G Giglio