PipeOnline 2.0: automated EST processing and functional data sorting

Abstract:

 

Expressed sequence tag (EST) projects generate large files of redundant, un-annotated, single-pass sequence reads from cDNA libraries, with virtually no biological information attached. DNA sequence data are usually processed and annotated with commercially available software that reads a single input file and produces a single output file. Such software often cannot handle large-scale sequence data without manual intervention by the user.

PipeOnline (POL) is a series of script-linked programs that processes multiple raw DNA sequence files and produces a database of records, with the aim of determining metabolic and biological function from large-scale DNA sequence data. The system also annotates the input DNA sequences with estimated functions, using the Metabolic Pathways Database as a functional dictionary, to complete the POL records. These POL records can be retrieved through a series of specific queries or through a comprehensive gene-function browser at http://stress-genomics.org.
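The general idea of turning many input files into one set of annotated records can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not PipeOnline's actual code: the `Record` class, the `build_records` function, the in-memory "files", and the precomputed `hits` table (standing in for a similarity search) are all hypothetical names and data invented for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator, List, Tuple


@dataclass
class Record:
    """One hypothetical database record for a single input sequence."""
    seq_id: str
    sequence: str
    functions: List[str] = field(default_factory=list)


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (identifier, sequence) pairs from FASTA-formatted lines."""
    seq_id = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous sequence, if any.
            if seq_id is not None:
                yield seq_id, "".join(chunks)
            parts = line[1:].split()
            seq_id = parts[0] if parts else "unnamed"
            chunks = []
        elif seq_id is not None:
            chunks.append(line.upper())
    if seq_id is not None:
        yield seq_id, "".join(chunks)


def build_records(
    files: Dict[str, str],
    hits: Dict[str, str],
    dictionary: Dict[str, List[str]],
) -> List[Record]:
    """Process every input file in one pass and annotate each sequence.

    files:      file name -> FASTA text (stands in for raw sequence files)
    hits:       sequence id -> best-match key (assumed to come from a
                similarity search that is not implemented here)
    dictionary: match key -> functional terms (stands in for a
                functional dictionary)
    """
    records: List[Record] = []
    for name in sorted(files):
        for seq_id, sequence in parse_fasta(files[name].splitlines()):
            key = hits.get(seq_id)
            functions = list(dictionary.get(key, [])) if key else []
            records.append(Record(seq_id, sequence, functions))
    return records


# Made-up example data; sequence-to-enzyme assignments are illustrative only.
EXAMPLE_FILES = {
    "plate1.fasta": ">est001 clone A\nATGGCGT\nTTAG\n>est002\nGGCCAATT\n",
    "plate2.fasta": ">est003\nccggtta\n",
}
EXAMPLE_HITS = {"est001": "EC 1.1.1.1", "est003": "EC 2.7.1.1"}
EXAMPLE_DICTIONARY = {
    "EC 1.1.1.1": ["alcohol dehydrogenase"],
    "EC 2.7.1.1": ["hexokinase"],
}

if __name__ == "__main__":
    for record in build_records(EXAMPLE_FILES, EXAMPLE_HITS, EXAMPLE_DICTIONARY):
        print(record)
```

Sequences without a hit keep an empty function list, so they remain in the record set and can still be retrieved by identifier.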

 

References:

1.      Selkov,E.,Jr, Grechkin,Y., Mikhailova,N. and Selkov,E. (1998) MPW: the Metabolic Pathways Database. Nucleic Acids Res., 26, 43–45.

2.     Overbeek,R., Larsen,N., Pusch,G.D., D’Souza,M., Selkov,E.,Jr, Kyrpides,N., Fonstein,M., Maltsev,N. and Selkov,E. (2000) WIT: integrated system for high-throughput genome sequence analysis and metabolic reconstruction. Nucleic Acids Res., 28, 123–125.

3.     Gotoh,O. (1982) An improved algorithm for matching biological sequences. J. Mol. Biol., 162, 705–708.

4.     Smith,T.F. and Waterman,M.S. (1981) Identification of common molecular subsequences. J. Mol. Biol., 147, 195–197.

5.     Bairoch,A. (1999) The ENZYME data bank in 1999. Nucleic Acids Res., 27, 310–311.

Download URL: http://nar.oupjournals.org/cgi/content/full/30/21/4761