Automated Extraction of Information on Protein-Protein Interactions From the Biological Literature
Bioinformatics 17 No.2 2001
勤沛儒
Nov. 2, 2001
Summary

In the post-genomic era, biological data are accumulating at an extraordinary rate, and developing efficient ways to extract meaningful information from this "data sea" has become an important problem. This paper demonstrates the extraction of protein-protein interactions from the scientific literature. Getting a computer to understand what a sentence means through natural language processing (NLP) is difficult because the processing involved is complex. The authors' method circumvents full NLP by building a dictionary of protein names for pattern matching, focusing on a particular area of interest within a sentence, and applying only simple rules for information extraction. With this approach, the precision of extracting protein-protein interactions from the biological literature is about 93.5% for E. coli and 94.3% for yeast. This suggests that the method could be applied to any species, provided a suitable protein name dictionary has been constructed.

References

1. Brill, E. (1994) Some advances in transformation-based part of speech tagging. In Proceedings of the Twelfth National Conference on Artificial Intelligence. AAAI Press.
2. Porter, M.F. (1980) An algorithm for suffix stripping. Program, 14, 127-130.
3. Proux, D. et al. (1998) Detecting gene symbols and names in biological texts: a first step toward pertinent information extraction. Genome Informatics, 72-80.
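The dictionary-matching idea described in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the protein names, the interaction keywords, and the function names `find_proteins` and `extract_interactions` are all hypothetical, chosen only to show the three steps the summary mentions (dictionary lookup, restricting attention to one region of the sentence, and a simple rule).

```python
import re
from itertools import combinations

# Hypothetical toy dictionary and keyword list, for illustration only.
PROTEIN_NAMES = {"RecA", "LexA", "Cdc28", "Cln2"}
INTERACTION_WORDS = {"interacts", "binds", "associates", "complexes"}


def find_proteins(sentence, names):
    """Return sorted (start, end, name) for each dictionary name found."""
    hits = []
    for name in names:
        # Match the name only when it is not embedded in a longer token.
        pattern = r"(?<![A-Za-z0-9])" + re.escape(name) + r"(?![A-Za-z0-9])"
        for m in re.finditer(pattern, sentence):
            hits.append((m.start(), m.end(), name))
    return sorted(hits)


def extract_interactions(sentence, names=PROTEIN_NAMES, words=INTERACTION_WORDS):
    """Keep protein pairs that have an interaction keyword between them."""
    results = []
    hits = find_proteins(sentence, names)
    for (s1, e1, p1), (s2, e2, p2) in combinations(hits, 2):
        if p1 == p2:
            continue
        # Area of interest: only the text between the two protein mentions.
        between = sentence[e1:s2].lower()
        tokens = re.findall(r"[a-z]+", between)
        verb = next((t for t in tokens if t in words), None)
        if verb:
            results.append((p1, verb, p2))
    return results
```

For example, `extract_interactions("RecA binds LexA in vitro.")` yields one triple naming both proteins and the keyword between them, while a sentence that merely lists two proteins without an interaction keyword yields nothing. A real system would need a far larger dictionary and more careful rules than this sketch assumes.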