InterPro in 2019: improving coverage, classification and access to protein sequence annotations

Nucleic Acids Res. 2019 Jan 8;47(D1):D351-D360. doi: 10.1093/nar/gky1100.

Abstract

The InterPro database (http://www.ebi.ac.uk/interpro/) classifies protein sequences into families and predicts the presence of functionally important domains and sites. Here, we report recent developments with InterPro (version 70.0) and its associated software, including an 18% growth in the size of the database in terms of new InterPro entries, updates to content, the inclusion of an additional entry type, refined modelling of discontinuous domains, and the development of a new programmatic interface and website. These developments extend and enrich the information provided by InterPro, and provide greater flexibility in terms of data access. We also show that InterPro's sequence coverage has kept pace with the growth of UniProtKB, and discuss how our evaluation of residue coverage may help guide future curation activities.

Publication types

  • Research Support, N.I.H., Intramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Databases, Genetic
  • Databases, Protein*
  • Gene Ontology
  • Humans
  • Internet
  • Molecular Sequence Annotation*
  • Multigene Family
  • Protein Domains / genetics
  • Sequence Homology, Amino Acid
  • Software
  • User-Computer Interface