IB 67110 (2 credits): Advanced Bioinformatics
Course home page: http://gel.ym.edu.tw/~chc/ab2004.html
Lecture room: Traditional Medicine Building B (傳醫乙棟), room 713
Time: Thursdays, 10:10am-12:00pm
Office hours by appointment
Course organizer: Ueng-Cheng Yang (楊永正)
Office: Traditional Medicine Building B (傳醫乙棟), room 712, Tel: (02) 28267128, Email: yang@ym.edu.tw
Course organizer: Chuan-Hsiung Chang (張傳雄)
Office: IG room 555, Tel: (02) 28267316, Email: cchang@ym.edu.tw
An introduction to the fundamental theory and practice of bioinformatics.
Goals for the course: The course will familiarize students with the
fundamental principles of bioinformatics. By the end of the course, students
will have a working knowledge of a variety of topics important in
bioinformatics, and a grasp of the underlying principles adequate for them to
evaluate and use new bioinformatics methods as they arise.
Course Requirements:
Prerequisites: Familiarity with the concepts of bioinformatics.
Although there are no formal course prerequisites, you should have taken at
least one upper-division undergraduate course (or higher) covering
bioinformatics skills.
Students are required to read several assigned articles in advance.
Additional reading material may also be assigned by each instructor.
Reference books:
(1) “Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology”. Dan Gusfield, Cambridge University Press.
(2) “Introduction to Computational Molecular Biology”. Setubal, Meidanis, PWS Publishing.
(3) “Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids”. Durbin, Eddy, Krogh, Mitchison, Cambridge University Press.
The assigned reading materials will be available in Acrobat (pdf) format.
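Intended audience: PhD program, Institute of Bioinformatics
Instructors in charge: Ueng-Cheng Yang (楊永正), Chuan-Hsiung Chang (張傳雄)
Time: Thursdays, 10:10am-12:00pm
Location: Traditional Medicine Building B (傳醫大樓乙棟), 7th floor, room 713
Contact telephone: (02) 28267128; Chuan-Hsiung Chang: (02) 28267316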
Course schedule
Week: 1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17
Date: 02/23, 03/02, 03/09, 03/16, 03/23, 03/30, 04/13, 04/20, 04/27, 05/04, 05/11, 05/18, 05/25, 06/01, 06/08, 06/15, 06/22
Discussion topic: Course Introduction & Overview; First Section Topic: Sequence Comparison; Dot Plot; Dynamic Programming (Global & Local); Scoring Matrices; Multiple Sequence Alignment; Second Section Topic: Pathway Analysis; Midterm Report-1; Midterm Report-2; Midterm Report-3; TBA; TBA; TBA; TBA; TBA; Final Report
Hours: 2 per session
Instructor: Chuan-Hsiung Chang (張傳雄) for the first 6 sessions; then Ueng-Cheng Yang (楊永正) for 2, Chuan-Hsiung Chang for 4, and Ueng-Cheng Yang for 6
Reading assignments: Articles # 01-04; Articles # 05-13; Articles # 14-17; Articles # 18-29; Articles # 30-37; Articles # 38-50
Midterm: There will be an in-class presentation report on May 11, 2006, worth fifty points.
Final: There will be an in-class presentation report on June 22, 2006, worth fifty points.
Dot Plot
1. Gibbs, A. J. & McIntyre, G. A. (1970).
The diagram, a method for comparing sequences. Its use with amino
acid and nucleotide sequences.
Eur. J. Biochem. 16, 1-11.
[pdf]
2. Sonnhammer EL, Durbin R. (1995).
A dot-matrix program with dynamic threshold control suited for
genomic DNA and protein sequence analysis.
Gene. 167, GC1-10.
[pdf]
3. Junier T, Pagni M. (2000).
Dotlet: diagonal plots in a web browser.
Bioinformatics. 16, 178-179.
[pdf]
4. Huang Y, Zhang L. (2004).
Rapid and sensitive dot-matrix methods for genome analysis.
Bioinformatics. 20, 460-466.
[pdf]
Dynamic Programming
5. Eddy SR. (2004).
What is dynamic programming?
Nat Biotechnol. 22, 909-910.
[pdf]
Global Alignment:
6. Needleman, S. B. & Wunsch, C. D. (1970).
A general method applicable to the search for similarities in the
amino acid sequence of two proteins.
J. Mol. Biol. 48, 443-453.
[pdf]
Local Alignment:
7. Smith, T. F. & Waterman, M. S. (1981).
Identification of common molecular subsequences.
J. Mol. Biol. 147, 195-197.
[pdf]
8. Waterman, M. S. (1983).
Sequence alignments in the neighborhood of the optimum with general
application to dynamic programming.
Proc. Natl. Acad. Sci. USA, 80, 3123-3124.
[pdf]
Statistical Significance of Sequence Alignments & others
9. Gotoh, O. (1982).
An improved algorithm for matching biological sequences.
J. Mol. Biol. 162, 705-708.
[pdf]
10. Waterman, M. (1994).
Estimating statistical significance of sequence alignments.
Phil. Trans. R. Soc. Lond. B 344, 383-390.
[pdf]
11. Altschul, S.F. & Gish, W. (1996).
Local alignment statistics.
Meth. Enzymol. 266, 460-480.
[pdf]
12. Pearson, W. R. (1998).
Empirical statistical estimates for sequence similarity
searches.
J Mol Biol. 276, 71-84.
[pdf]
13. W. R. Pearson and T. C. Wood. (2001).
"Statistical significance in biological sequence comparison"
in Handbook of Statistical Genetics, D. J. Balding, M. Bishop, and
C. Cannings eds.
London: Wiley, pp. 39-65.
[pdf]
Substitution Matrices
14. Henikoff, S. & Henikoff, J. G. (1993).
Performance evaluation of amino acid substitution matrices.
Proteins: Struct., Funct., Genet. 17, 49-61.
[pdf]
PAM:
15. Dayhoff, M. O., Schwartz, R. M. & Orcutt, B. C. (1978).
A model of evolutionary change in proteins. [pdf]
Matrices for detecting distant relationships. [pdf]
In Atlas of Protein Sequence and Structure (Dayhoff, M. O., ed.), 5, 345-358.
National Biomedical Research Foundation, Washington, DC.
BLOSUM:
16. Henikoff, S. & Henikoff, J. G. (1992).
Amino acid substitution matrices from protein blocks.
Proc. Natl. Acad. Sci. USA, 89, 10915-10919.
[pdf]
17. Eddy SR. (2004).
Where did the BLOSUM62 alignment score matrix come from?
Nat Biotechnol. 22, 1035-1036.
[pdf]
FASTA
18. Lipman DJ, Pearson WR. (1985).
Rapid and sensitive protein similarity searches.
Science 227, 1435-1441.
[pdf]
19. Pearson WR, Lipman DJ. (1988).
Improved tools for biological sequence comparison.
Proc Natl Acad Sci U S A. 85, 2444-2448.
[pdf]
20. Pearson WR. (1995).
Comparison of methods for searching protein sequence databases.
Protein Sci. 4, 1145-1160.
[pdf]
21. Pearson WR. (1997).
Identifying distantly related protein sequences.
Comput. Appl. Biosci. 13, 325-332.
[pdf]
22. Pearson WR. (2000).
Flexible sequence similarity searching with the FASTA3 program package.
Methods Mol Biol. 132, 185-219.
[pdf]
23. Smoot ME, Guerlain SA, Pearson WR. (2004).
Visualization of near-optimal alignments.
Bioinformatics. 20, 953-958.
[pdf]
BLAST
24. Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. (1990).
Basic local alignment search tool.
J. Mol. Biol. 215, 403-410.
[pdf]
25. Shpaer EG, Robinson M, Yee D, Candlin JD, Mines R,
Hunkapiller T. (1996).
Sensitivity and selectivity in protein similarity searches:
a comparison of Smith-Waterman in hardware to BLAST and FASTA.
Genomics. 38, 179-191.
[pdf]
26. Altschul, S.; Madden, T.; Schaffer, A.; Zhang, J.;
Zhang, Z.; Miller, W.; and Lipman, D. (1997).
Gapped BLAST and PSI-BLAST: a new generation of protein database
search programs.
Nucleic Acids Res. 25, 3389-3402.
[pdf]
27. Altschul, S.F. & Koonin, E.V. (1998).
Iterated profile searches with PSI-BLAST - a tool for discovery in
protein databases.
Trends Biochem. Sci. 23, 444-447.
[pdf]
28. Zhang Z, Schaffer AA, Miller W, Madden TL, Lipman DJ, Koonin EV,
Altschul SF. (1998).
Protein sequence similarity searches using patterns as
seeds. (PHI-BLAST)
Nucleic Acids Res. 26, 3986-3990.
[pdf]
29. A. Pearson, H. Peri, O. Jabado, O. Wood. (2000).
eBLAST - Building a Better BLAST.
[pdf]
Profile Analysis
30. Gribskov, M., McLachlan, A. D. & Eisenberg, D. (1987).
Profile analysis: Detection of distantly related proteins.
Proc. Natl. Acad. Sci. USA, 84, 4355-4358.
[pdf]
Multiple Sequence Alignment
31. Lipman, D. J., Altschul, S. F. & Kececioglu, J. (1989).
A tool for multiple sequence alignment.
Proc. Natl. Acad. Sci. USA, 86, 4412-4415.
[pdf]
32. Thompson JD, Higgins DG, Gibson TJ. (1994).
CLUSTAL W: improving the sensitivity of progressive multiple
sequence alignment through sequence weighting, position-specific
gap penalties and weight matrix choice.
Nucleic Acids Res. 22, 4673-4680.
[pdf]
33. Chenna R, Sugawara H, Koike T, Lopez R, Gibson TJ, Higgins DG,
Thompson JD. (2003).
Multiple sequence alignment with the Clustal series of programs.
Nucleic Acids Res. 31, 3497-3500.
[pdf]
34. Brudno M, Do CB, Cooper GM, Kim MF, Davydov E, Green ED,
Sidow A, Batzoglou S; NISC Comparative Sequencing Program. (2003).
LAGAN and Multi-LAGAN: efficient tools for large-scale
multiple alignment of genomic DNA.
Genome Res. 13, 721-731.
[pdf]
35. Sammeth M, Morgenstern B, Stoye J. (2003).
Divide-and-conquer multiple alignment with segment-based
constraints.
Bioinformatics. 19 Suppl 2, II189-II195.
[pdf]
36. Edgar RC. (2004).
MUSCLE: multiple sequence alignment with high accuracy and
high throughput.
Nucleic Acids Res. 32, 1792-1797.
[pdf]
37. Edgar RC. (2004).
MUSCLE: a multiple sequence alignment method with reduced
time and space complexity.
BMC Bioinformatics. 5, 113-121.
[pdf]
Suffix Tree & MUMmer
38. Delcher AL, Kasif S, Fleischmann RD, Peterson J, White O, Salzberg
SL. (1999).
Alignment of whole genomes.
Nucleic Acids Res. 27, 2369-2376.
[pdf]
39. Delcher AL, Phillippy A, Carlton J, Salzberg SL. (2002).
Fast algorithms for large-scale genome alignment and comparison.
Nucleic Acids Res. 30, 2478-2483.
[pdf]
40. Kurtz S, Phillippy A, Delcher AL, Smoot M,
Shumway M, Antonescu C, Salzberg SL. (2004).
Versatile and open software for comparing large genomes.
Genome Biol. 5, R12.
[pdf]
41. Wong PW, Lam TW, Lu N, Ting HF, Yiu SM. (2004).
An efficient algorithm for optimizing whole genome alignment with
noise.
Bioinformatics. [Epub ahead of print].
[pdf]
PatternHunter, PipMaker, zPicture, VISTA, MAVID, & Mauve
42. Ma B, Tromp J, Li M. (2002).
PatternHunter: faster and more sensitive homology search.
Bioinformatics. 18, 440-445.
[pdf]
43. Li M, Ma B, Kisman D, Tromp J. (2004).
PatternHunter II: highly sensitive and fast homology search.
J Bioinform Comput Biol. 2, 417-439.
[pdf]
44. Schwartz S, Zhang Z, Frazer KA, Smit A, Riemer C, Bouck J, Gibbs R,
Hardison R, Miller W. (2000).
PipMaker--a web server for aligning two genomic DNA sequences.
Genome Res. 10, 577-586.
[pdf]
45. Ovcharenko I, Loots GG, Hardison RC, Miller W, Stubbs L. (2004).
zPicture: dynamic alignment and visualization tool for analyzing
conservation profiles.
Genome Res. 14, 472-477.
[pdf]
46. Shah N, Couronne O, Pennacchio LA, Brudno M, Batzoglou S,
Bethel EW, Rubin EM, Hamann B, Dubchak I. (2004).
Phylo-VISTA: interactive visualization of multiple DNA
sequence alignments.
Bioinformatics. 20, 636-643.
[pdf]
47. Frazer KA, Pachter L, Poliakov A, Rubin EM, Dubchak I. (2004).
VISTA: computational tools for comparative genomics.
Nucleic Acids Res. 32(Web Server issue), W273-279.
[pdf]
48. Bray N, Pachter L. (2003).
MAVID multiple alignment server.
Nucleic Acids Res. 31, 3525-3526.
[pdf]
49. Bray N, Pachter L. (2004).
MAVID: constrained ancestral alignment of multiple sequences.
Genome Res. 14, 693-699.
[pdf]
50. Darling AC, Mau B, Blattner FR, Perna NT. (2004).
Mauve: multiple alignment of conserved genomic sequence with
rearrangements.
Genome Res. 14, 1394-1403.
[pdf]
* PATHWAY ANALYSIS*
Local Resources
Key Bioinfo tools and sites