Elsevier

Journal of Web Semantics

Volume 17, December 2012, Pages 33-43
Ontology paper
FaBiO and CiTO: Ontologies for describing bibliographic resources and citations

https://doi.org/10.1016/j.websem.2012.08.001

Abstract

Semantic publishing is the use of Web and Semantic Web technologies to enhance the meaning of a published journal article, to facilitate its automated discovery, to enable its linking to semantically related articles, to provide access to data within the article in actionable form, and to facilitate integration of data between articles. Recently, semantic publishing has opened the possibility of a major step forward in the digital publishing world. For this to succeed, new semantic models and visualization tools are required to fully meet the specific needs of authors and publishers. In this article, we introduce the principles and architectures of two new ontologies central to the task of semantic publishing: FaBiO, the FRBR-aligned Bibliographic Ontology, an ontology for recording and publishing bibliographic records of scholarly endeavours on the Semantic Web, and CiTO, the Citation Typing Ontology, an ontology for the characterization of bibliographic citations both factually and rhetorically. We present those two models step by step, in order to emphasise their features and to stress their advantages relative to other pre-existing information models. Finally, we review the uptake of FaBiO and CiTO within the academic and publishing communities.

Introduction

Scholarly authoring and publishing are in the throes of a revolution, as the full potential of on-line publishing is explored. Yet, to date, publishers have not adopted Web standards for their work, but rather employ a variety of proprietary XML-based informational models and document type definitions (DTDs). While such independence was reasonable in the pre-web world of paper publishing, it now appears anachronistic, since publications and their metadata from different sources are incompatible, requiring hand-crafted mappings to convert from one to another. For a large community such as publishers, this lack of standard definitions that could be adopted and reused across the entire industry represents losses in terms of money, time and effort.

In contrast, modern web information management techniques employ standards such as RDF [1] and OWL 2 [2] to encode information in ways that permit computers to query metadata and integrate web-based information from multiple resources in an automated manner. Since the processes of scholarly communication are central to the practice of science, it is essential that publishers now adopt such standards to permit inference over the entire corpus of scholarly communication represented in journals, books and conference proceedings. This requires the availability of appropriate ontologies that are specially tailored to the requirements of authors, publishers and their readers. The purpose of this paper is to present two such ontologies that form key components of the semantic publishing revolution.
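The kind of automated integration and querying described above can be illustrated with a minimal sketch. It uses plain Python tuples rather than a real RDF library, and every `example.org` URI, title and author name is a hypothetical placeholder; only the Dublin Core terms namespace and the three property names are taken from an existing vocabulary.

```python
from typing import List, Optional, Set, Tuple

# A triple is (subject, predicate, object), mirroring the RDF data model.
Triple = Tuple[str, str, str]

DCTERMS = "http://purl.org/dc/terms/"

# Hypothetical metadata released independently by two publishers.
publisher_a: Set[Triple] = {
    ("http://example.org/article/1", DCTERMS + "title", "An example article"),
    ("http://example.org/article/1", DCTERMS + "creator", "A. Author"),
}
publisher_b: Set[Triple] = {
    ("http://example.org/article/2", DCTERMS + "title", "Another example article"),
    ("http://example.org/article/2", DCTERMS + "references",
     "http://example.org/article/1"),
}


def match(graph: Set[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Because both sources share one data model and one vocabulary,
# integration is a set union and needs no hand-crafted mapping.
merged = publisher_a | publisher_b

titles = match(merged, p=DCTERMS + "title")
references = match(merged, p=DCTERMS + "references")
```

The point of the sketch is the absence of a conversion step: once both publishers use shared identifiers for properties, a single query runs over the combined metadata.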

Semantic publishing is the use of web and semantic web technologies to enhance a published document such as a journal article so as to enrich its meaning, to facilitate its automatic discovery, to enable its linking to semantically related articles, to provide access to data within the article in actionable form, and to allow integration of data between papers [3], [4]. Semantic publishing and scholarly citation using web standards are presently two of the most interesting topics within the scientific publishing domain. Research areas in this domain include the development of:

  • semantic models (vocabularies, ontologies) that meet the requirements of scholarly authoring and publishing;

  • visualization and documentation tools that permit such ontologies to be easily understood;

  • annotation tools that allow these models to be used for enhancing documents with relevant semantic assertions;

  • new algorithms to take advantage of these semantic annotations when searching over large sets of on-line documents.

In this article, we address the first point by describing the principles and architecture of two ontologies central to the task of semantic publishing: FaBiO, the FRBR-aligned Bibliographic Ontology, an ontology for recording and publishing bibliographic records of scholarly endeavours on the Semantic Web, and CiTO, the Citation Typing Ontology, an ontology for the characterization of bibliographic citations, both factually and rhetorically. These ontologies are members of SPAR, the Semantic Publishing and Referencing Ontologies, a suite of orthogonal and complementary OWL 2 DL ontology modules that together permit the creation of comprehensive machine-readable RDF metadata for all aspects of semantic publishing and referencing.
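As a rough illustration of how the two ontologies combine, the following sketch serialises a handful of statements as N-Triples using plain Python string formatting. The two paper URIs are hypothetical placeholders, and the namespace URIs and term names (`fabio:JournalArticle`, `cito:cites`, `cito:extends`) are assumptions that should be checked against the published versions of the ontologies before use.

```python
# Namespace URIs assumed for FaBiO and CiTO; verify against the
# published ontologies before relying on them.
FABIO = "http://purl.org/spar/fabio/"
CITO = "http://purl.org/spar/cito/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical identifiers for a citing paper and a cited paper.
citing = "http://example.org/paper/citing"
cited = "http://example.org/paper/cited"


def nt(subject: str, predicate: str, obj: str) -> str:
    """Format one statement whose three terms are all URIs as an N-Triples line."""
    return f"<{subject}> <{predicate}> <{obj}> ."


lines = [
    # FaBiO says what kind of bibliographic resource each entity is.
    nt(citing, RDF_TYPE, FABIO + "JournalArticle"),
    nt(cited, RDF_TYPE, FABIO + "JournalArticle"),
    # CiTO records the factual existence of the citation ...
    nt(citing, CITO + "cites", cited),
    # ... and, separately, its rhetorical character.
    nt(citing, CITO + "extends", cited),
]

document = "\n".join(lines)
print(document)
```

Keeping the bare citation and its rhetorical typing as separate statements reflects the factual and rhetorical split described above: a consumer interested only in the citation network can ignore the typed statement.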

The rest of this article is organised as follows: in Section 2 we introduce the principles that have guided our work, and in Section 3 we briefly describe the existing ontologies and vocabularies we took into account when developing FaBiO and CiTO. These new ontologies are then presented in Sections 4 and 5, respectively. In Section 6 we describe the contexts in which FaBiO and CiTO are now being employed, and finally in Section 7 we sketch out some conclusions and future directions of our work.

Characteristics, starting point and principles

The main characteristic of this work, which marks it out as distinct from previous contributions, is the creation of two new semantic publishing and referencing ontologies of sufficient expressivity to meet the requirements of end users such as academic authors and publishers.

We have also developed two new presentation technologies, the Live OWL Documentation Environment (LODE) and the Graphical Framework For OWL

Related works

While wanting to re-use existing information models and vocabularies of relevance to scholarly publishing as far as possible, we had to come to terms with their limitations. In this section, we briefly introduce and comment upon the well-known vocabularies that we have considered and, in some cases, fully or partially employed within our own work.

Dublin Core. Born as a consequence of a conference held in Dublin, Ohio, USA in 1995 that involved both technicians (librarians, publishers, archivists) and

Representing bibliographic information using FaBiO

The need for ontologies that are sufficiently expressive for describing documents has been presented in Section 1.

The vocabularies described above are either poor in concepts or are ‘flat’, preventing their use for accurately describing the complexity of publishing reality. We will illustrate this by considering the representation of a typical bibliographic reference using first BIBO and then FRBR. We will then show how this information can be more accurately described using FaBiO. Consider the

Characterising citations with CiTO

Bibliographic citation, i.e., the act of referring from a citing entity to the cited one, is one of the most important activities of an author in the production of any bibliographic work, since the acknowledgement of sources that this activity represents stands at the very core of the scholarly enterprise. The network of citations created by combining citation information from many academic articles and books is a source of rich information for scholars, and can be used by a publisher to create

Community uptake of FaBiO and CiTO

FaBiO and CiTO are now being used or are being considered for adoption in a variety of academic and publishing environments, as described below. The adoption of these models by different communities can be ascribed, at least in part, to the minimization of the constraints applied to the ontological entities, so that the ontologies can be applied in a wide variety of situations.

Linked Open Vocabularies Dataset. The Linked Open Vocabularies Dataset (LOV),

Conclusions

In this article we have presented FaBiO, the FRBR-aligned Bibliographic Ontology (current version 1.6), and CiTO, the Citation Typing Ontology (current version 2.3). FaBiO and CiTO are two new OWL 2 DL ontologies for describing bibliographic resources and bibliographic citations on the Semantic Web. We have introduced those two models step by step, in order to emphasise their features for describing bibliographic objects and to stress their advantages relative to other pre-existing models.

Acknowledgments

We thank Paolo Ciccarese and Tim Clark of Harvard University for their support and valuable conceptual and technical suggestions given while we were developing FaBiO and CiTO from CiTO v1.6 and were collaborating with them to harmonise these with SWAN. We are also grateful to our local colleagues, particularly Jun Zhao, Graham Klyne and Fabio Vitali, for their warm support, constructive criticisms and help given throughout these developments. Aspects of this work have been supported by the JISC

References (29)

  • P. Ciccarese et al., The SWAN biomedical discourse ontology, Journal of Biomedical Informatics (2008)
  • J. Carroll, G. Klyne, Resource description framework (RDF): concepts and abstract syntax, W3C Recommendation, 10...
  • W3C OWL Working Group, OWL 2 web ontology language document overview, W3C Recommendation, 27 October 2009, World Wide...
  • D. Shotton, Semantic publishing: the coming revolution in scientific journal publishing, Learned Publishing (2009)
  • D. Shotton et al., Adventures in semantic publishing: exemplar semantic enhancements of a research article, PLoS Computational Biology (2009)
  • S. Peroni, D. Shotton, F. Vitali, The Live OWL Documentation Environment: a tool for the automatic generation of...
  • D. Shotton, CiTO, the citation typing ontology, Journal of Biomedical Semantics (2010)
  • A. Rector, Modularisation of domain ontologies implemented in description logics and related formalisms including OWL,...
  • D. Shotton, C. Caton, G. Klyne, Ontologies for sharing, ontologies for use, In The Ontogenesis Knowledge Blog, 2010....
  • Dublin Core Metadata Initiative, Dublin core metadata element set, Version 1.1, DCMI Recommendation, 2010....
  • Dublin Core Metadata Initiative, DCMI metadata terms, DCMI Recommendation, 2010....
  • International Digital Enterprise Alliance, Publishing requirements for industry standard metadata specification version...
  • T. Hammond, RDF site summary 1.0 modules: PRISM, 2008. http://nurture.nature.com/rss/modules/mod_prism.html (last...
  • A. Miles, S. Bechhofer, SKOS simple knowledge organization system reference, W3C Recommendation, 18 August 2009, World...