Abstract:
Expressed sequence tag (EST) projects generate large files of redundant,
un-annotated, single-pass sequence reads from cDNA libraries, with virtually
no biological content attached. DNA sequence data are usually processed and
annotated with commercially available software that reads a single input file
and produces a single output file. Such software often cannot handle
large-scale sequence data without manual intervention by the user.
PipeOnline (POL) is a series of script-linked programs that process multiple
raw DNA sequence files and produce a database of records. It is an approach to
determining metabolic and biological function from large-scale DNA sequence
data. The system also annotates the input DNA sequences with functions
estimated using the Metabolic Pathways Database as a functional dictionary,
which completes the POL records. These POL records can be
retrieved through a series of specific queries or through a comprehensive
gene-function browser at http://stress-genomics.org.
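The batch idea described above (many raw sequence files in, one set of annotated records out) can be illustrated with a minimal sketch. This is not PipeOnline's actual code: the function names `read_fasta` and `build_records`, the record fields, and the `function_lookup` dictionary (a stand-in for the results of similarity searching against a functional dictionary) are all assumptions made for illustration, and the input is assumed to be FASTA-formatted text.

```python
from pathlib import Path


def read_fasta(path):
    """Yield (sequence_id, sequence) pairs from one FASTA file."""
    seq_id, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if seq_id is not None:
                    yield seq_id, "".join(chunks)
                header = line[1:].split()
                seq_id, chunks = (header[0] if header else ""), []
            else:
                chunks.append(line.upper())
    if seq_id is not None:
        yield seq_id, "".join(chunks)


def build_records(fasta_paths, function_lookup):
    """Process several sequence files in one pass and return a list of records.

    function_lookup is a hypothetical dict mapping sequence IDs to an
    estimated function; sequences without an entry are marked "unassigned".
    """
    records = []
    for path in fasta_paths:
        for seq_id, sequence in read_fasta(path):
            records.append({
                "source_file": Path(path).name,
                "id": seq_id,
                "length": len(sequence),
                "function": function_lookup.get(seq_id, "unassigned"),
            })
    return records
```

Calling `build_records(["lib1.fasta", "lib2.fasta"], lookup)` would return one record per sequence across both files, with no manual step between files. That is the contrast with single-input, single-output tools drawn in the abstract.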
References:
1. Selkov,E.,Jr, Grechkin,Y., Mikhailova,N. and Selkov,E. (1998) MPW: the Metabolic Pathways Database. Nucleic Acids Res., 26, 43–45.
2. Overbeek,R., Larsen,N., Pusch,G.D., D’Souza,M., Selkov,E.,Jr, Kyrpides,N., Fonstein,M., Maltsev,N. and Selkov,E. (2000) WIT: integrated system for high-throughput genome sequence analysis and metabolic reconstruction. Nucleic Acids Res., 28, 123–125.
3. Gotoh,O. (1982) An improved algorithm for matching biological sequences. J. Mol. Biol., 162, 705–708.
4. Smith,T.F. and Waterman,M.S. (1981) Identification of common molecular subsequences. J. Mol. Biol., 147, 195–197.
5. Bairoch,A. (1999) The ENZYME data bank in 1999. Nucleic Acids Res., 27, 310–311.
Download URL: http://nar.oupjournals.org/cgi/content/full/30/21/4761