Knowledge acquisition, consistency checking and concurrency control for Gene Ontology (GO)
Bioinformatics Vol. 19 no. 2 2003
Pages 241-248
2003 Oxford University Press
ABSTRACT
Motivation: A critical element of the computational infrastructure required for
functional genomics is a shared language for communicating biological data and
knowledge. The Gene Ontology (GO; http://www.geneontology.org) provides a
taxonomy of concepts and their attributes for annotating gene products. As GO
increases in size, its ongoing construction and maintenance become more
challenging. In this paper, we assess the applicability of a Knowledge Base
Management System (KBMS), Protege-2000, to the maintenance and development of GO.
Results: We transferred GO to Protege-2000 to evaluate its suitability for
maintaining and developing GO. The
graphical user interface supported browsing and editing of GO. Tools for
consistency checking identified minor inconsistencies in GO and opportunities
to reduce redundancy in its representation. The Protege Axiom Language proved
useful for checking ontological consistency. The PROMPT tool allowed us to
track changes to GO. Using Protege-2000, we tested our ability to make changes
and extensions to GO to refine the semantics of attributes and classify more
concepts.
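
As a rough illustration of the kind of consistency and redundancy checks described above, the following Python sketch works on a toy is-a graph. It is not the Protege Axiom Language and not the authors' implementation; the term names, the `IS_A` dictionary, and the helper functions are hypothetical and chosen only for illustration. It flags two conditions: a term that is its own ancestor (a cycle), and a direct is-a link that is already implied by another direct parent.

```python
# Hypothetical toy is-a graph: each term maps to its set of direct parents.
# Term names are illustrative only and are not taken from a GO release.
IS_A = {
    "root": set(),
    "binding": {"root"},
    "nucleic acid binding": {"binding"},
    # The direct link to "binding" is already implied via "nucleic acid binding".
    "DNA binding": {"nucleic acid binding", "binding"},
}


def ancestors(term, parents):
    """Return every term reachable from `term` by following is-a links."""
    seen = set()
    stack = list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def find_cycles(parents):
    """Return the sorted terms that are their own ancestor (inconsistent)."""
    return sorted(t for t in parents if t in ancestors(t, parents))


def find_redundant_links(parents):
    """Return sorted (child, parent) pairs implied by another direct parent."""
    redundant = []
    for child, direct in parents.items():
        for parent in direct:
            others = direct - {parent}
            if any(parent in ancestors(other, parents) for other in others):
                redundant.append((child, parent))
    return sorted(redundant)


if __name__ == "__main__":
    print("cycles:", find_cycles(IS_A))
    print("redundant links:", find_redundant_links(IS_A))
```

On this toy graph the sketch reports no cycles and one redundant link, from "DNA binding" directly to "binding". A real check would load the terms from a GO release and would likely need to treat other relationship types separately.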
Availability: Gene Ontology in Protege-2000 and the associated code are located
at http://smi.stanford.edu/projects/helix/gokbms/. Protege-2000 is available
from http://protege.stanford.edu.