GeneRIF is a more comprehensive, current and computationally tractable source of gene-disease relationships than OMIM
Authors John D. Osborne*1, Simon Lin1, Warren A. Kibbe1, Lihua (Julie) Zhu1, Maria I. Danila, Rex L. Chisholm1
Robert H. Lurie Cancer Center, Northwestern University, Chicago, IL 60611
Christ Advocate Medical Center, 4440 95th Street, Oak Lawn, IL, 60453
Abstract

Motivation: The human genome has been extensively annotated with Gene Ontology terms for biological function, but it has received minimal computational annotation for disease.
Methods: We used the Unified Medical Language System (UMLS) MetaMap Transfer tool (MMTx) to mine gene-disease relationships from both the GeneRIF and OMIM databases. We used a comprehensive subset of UMLS structured as a directed acyclic graph (the Disease Ontology) to filter and interpret the results from MMTx. The data mining methodology was validated against the Homayouni gene collection using recall and precision measurements.
Results: The validation data set suggests 91% recall and 97% precision for disease annotation using GeneRIF, compared with 22% recall and 98% precision using OMIM. Our thesaurus-based approach allows comparisons to be made between disease-containing databases and increases accuracy in disease identification through synonym matching.
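As an illustration only, the two steps named in the abstract (filtering candidate terms against a disease hierarchy structured as a directed acyclic graph, then scoring the surviving gene-disease pairs by precision and recall) can be sketched as follows. This is a minimal sketch, not the authors' implementation: the toy ontology, the function names `is_disease_term` and `precision_recall`, and all example gene-term pairs are hypothetical, and no MMTx or Disease Ontology API is used.

```python
from collections import deque

# Hypothetical toy ontology (child term -> parent terms), standing in for
# a disease hierarchy structured as a directed acyclic graph.
PARENTS = {
    "disease": [],
    "cancer": ["disease"],
    "breast cancer": ["cancer"],
    "neurodegenerative disease": ["disease"],
    "Alzheimer's disease": ["neurodegenerative disease"],
}


def is_disease_term(term, parents=PARENTS, root="disease"):
    """Return True if `term` is the root or has the root as an ancestor."""
    if term not in parents:
        return False
    queue = deque([term])
    seen = set()
    while queue:
        current = queue.popleft()
        if current == root:
            return True
        if current in seen:
            continue
        seen.add(current)
        queue.extend(parents.get(current, []))
    return False


def precision_recall(predicted, gold):
    """Compute (precision, recall) over two collections of (gene, disease) pairs."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Made-up candidate annotations and reference set, not taken from the report.
candidate_hits = [
    ("APP", "Alzheimer's disease"),
    ("BRCA1", "breast cancer"),
    ("APP", "membrane"),  # not in the ontology, so it is filtered out
]
gold = [
    ("APP", "Alzheimer's disease"),
    ("BRCA1", "breast cancer"),
    ("PSEN1", "Alzheimer's disease"),
]

# Keep only pairs whose term falls under the ontology root, then score them.
filtered = [(gene, term) for gene, term in candidate_hits if is_disease_term(term)]
precision, recall = precision_recall(filtered, gold)
```

In this toy data the filter discards the non-disease term, so both remaining pairs are correct, while one reference pair is never found; precision and recall therefore differ, which is the same kind of trade-off the Results figures report for GeneRIF and OMIM.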

geneDO
Publication URL Link to the journal's website. TBA
PubMed URL TBA
Publication Citation John D. Osborne, Simon Lin, Warren A. Kibbe, Lihua (Julie) Zhu, Maria I. Danila, Rex L. Chisholm, GeneRIF is a more comprehensive, current and computationally tractable source of gene-disease relationships than OMIM, Technical Report, Bioinformatics Core, Northwestern University, 2007
Keywords

GeneRIF, OMIM, Disease Ontology, text data mining

 
Archive of the Technical Report
Files
Description: PDF of the technical report
File Name: geneRIFDO16.pdf

About this webpage
Created 3-10-2007. Last updated 3-10-2007.
http://basic.northwestern.edu/publications/generifdo