CS598 SS Probabilistic Methods for Biological Sequence Analysis (Fall 2005)

Instructor: Saurabh Sinha

Schedule (tentative)

Crash course for project

  1. (8/26) Basic molecular biology: biological processes and sequence features, the genome and its annotation. Powerpoint Slides
  2. (8/31) More detailed molecular biology. The importance of gene regulation. Fruitfly segmentation. Binding sites and motifs. Modules and their composition. Tandem repeats. Bioinformatics goals for this system: motif finding (supervised, ab initio), module finding (supervised, ab initio), regulatory pathways and networks. Powerpoint Slides
  3. (9/2) Module finding. Hidden Markov models (DEKM 3, BB 8.2). Single-species Stubb and E-M. Stubb posterior score as a measure of binding site content. Powerpoint Slides
  4. (9/7) Evolution and tree of life. Comparative genomics. Alignments. Multi-species motif finding (alignment-free). Parsimony and its probabilistic interpretation. (Footprinter, and probability calculations.) Powerpoint Slides

Sequence Alignment

  1. (9/9) Sequence similarity, dynamic programming (local alignment and the importance of expected negative score, DEKM 2.3 Smith-Waterman), affine gap penalty (DEKM 2.4). Paper presentation:
  2. (9/14) Quadratic time complexity of pairwise alignment, and the need for something faster. Seed-based local alignments. BLAST. BLAST statistics (DEKM 2.5, 2.7): extreme value distribution.
  3. (9/16) Multiple alignment (DEKM 6.5). Algorithms: sum of pairs, progressive alignment, DIALIGN. Profile HMMs (proteins) and Pfam domains.
  4. (9/21) Paper presentation:
  5. (9/23) Paper presentation:
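
As a study aid for the 9/9 topic, here is a minimal sketch of Smith-Waterman local alignment scoring. The scoring values (match +1, mismatch -1, linear gap -2), the uniform base frequencies, and the function names are illustrative assumptions, not the course's or DEKM's exact parameters. It uses a linear gap penalty rather than the affine one from DEKM 2.4. The helper also shows why the expected score of a random aligned pair should be negative: otherwise long random alignments would accumulate positive score.

```python
def expected_pair_score(match=1, mismatch=-1, alphabet_size=4):
    """Expected score of one aligned pair of random, uniformly drawn letters."""
    p_match = 1.0 / alphabet_size
    return p_match * match + (1.0 - p_match) * mismatch


def smith_waterman_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best local alignment score of strings a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)  # previous row of the dynamic programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # The 0 option lets a local alignment restart anywhere.
            curr[j] = max(0, prev[j - 1] + s, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

With these assumed scores, the expected score per random pair is 0.25 * 1 + 0.75 * (-1) = -0.5, so only genuinely similar regions reach a high local score.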

Motif finding

  1. (9/28) Recap transcriptional regulation and binding sites. Overrepresentation of binding site motifs. Paper presentation
  2. (9/30) Weight matrix, relative entropy. Paper presentation:
  3. (10/5) Paper presentation:
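
For the 9/30 topic, the relative entropy of a weight matrix can be sketched as below. This is an illustrative example only: it assumes a DNA alphabet, a uniform background unless one is supplied, and hypothetical function names. For each column it sums p * log2(p / q) over bases, where p is the column frequency and q the background frequency.

```python
import math
from collections import Counter


def position_frequencies(sites, alphabet="ACGT", pseudocount=0.0):
    """Column-wise base frequencies from equal-length aligned sites."""
    freqs = []
    total = len(sites) + pseudocount * len(alphabet)
    for k in range(len(sites[0])):
        counts = Counter(site[k] for site in sites)
        freqs.append({x: (counts[x] + pseudocount) / total for x in alphabet})
    return freqs


def relative_entropy(freqs, background=None):
    """Sum over columns of sum_x p(x) * log2(p(x) / q(x)), in bits."""
    total = 0.0
    for column in freqs:
        for base, p in column.items():
            q = background[base] if background else 1.0 / len(column)
            if p > 0:  # terms with p = 0 contribute nothing
                total += p * math.log2(p / q)
    return total
```

A fully conserved column contributes 2 bits against a uniform background, and a column that matches the background contributes 0.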

Bayesian inference

  1. (10/7) Bayesian approach to local alignment statistics. Bayesian Inference (BB 2.3): priors, Gaussian, Gamma, Dirichlet, MAP & ML as the first level of Bayesian inference. Bayesian example (BB 3.1): the single die model with sequence data or with count data.
  2. (10/12) (Combined with the 10/14 lecture.)
  3. (10/14) The Bayesian approach to motif finding: Expectation-Maximization. MEME and Stubb/PhyME. The statistical mechanics connection (BB 3.2.1 - 3.2.4, 4.4).

Module Detection

  1. (10/19) Hidden Markov Models for module detection. Paper presentation:
  2. (10/21) Paper presentation:

Evolution

  1. (10/26) Evolution models (DEKM 8.1 - 8.3), calculating likelihood of alignment, reversibility, Metropolis algorithm for phylogenetic tree construction (DEKM 8.4), evolutionary models with gaps (DEKM 8.5).
  2. (10/28) Paper presentation:
  3. (11/2) Evolutionary events - large repeat families, minisatellites, microsatellites. Gene duplications and pseudogenes. Tandem repeat detection (with statistics). Applications - sequence turnover. Implications for probabilistic sequence analysis algorithms. Powerpoint slides
  4. (11/4) Paper presentation: (by instructor) Powerpoint slides

Evolution and Motif finding

  1. (11/9) Paper presentation: Powerpoint slides

Population Genetics

  1. (11/11) (Combined with the 11/16 lecture.)
  2. (11/16) Wright-Fisher model: random drift alone, with selection only, and with mutation only. Coalescent theory. Neutral sequence.
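
For the Wright-Fisher topic, a minimal simulation of pure random drift (no selection, no mutation) might look like the sketch below. The function name, parameters, and the haploid population of fixed size are assumptions made for illustration. Each generation, every one of the n_copies gene copies picks its parent allele at random from the previous generation, which amounts to binomial sampling.

```python
import random


def wright_fisher_drift(n_copies, initial_count, generations, seed=0):
    """Allele-count trajectory under neutral Wright-Fisher drift."""
    rng = random.Random(seed)  # seeded for reproducible runs
    count = initial_count
    trajectory = [count]
    for _ in range(generations):
        p = count / n_copies
        # Binomial(n_copies, p) draw: each copy inherits the allele with prob. p.
        count = sum(1 for _ in range(n_copies) if rng.random() < p)
        trajectory.append(count)
    return trajectory
```

Counts of 0 and n_copies are absorbing states (loss and fixation), since without mutation a lost allele cannot reappear.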

Project presentations

  1. (11/18) 1. Rich Leduc 2. Jaebum Kim, Yoonkyong Lee, Younhee Ko.
  2. (11/30) Lecture on Stochastic Grammars.
  3. (12/2) 1. Kranthi Varala, Ajith Harish, ChulYun Kim. 2. Neelay Shah, Andra Ivan, Ben Liebald.
  4. (12/7) 1. Qiaozhu Mei, Hong Cheng. 2. Aditya Ramani.
  5. (12/9) 1. Xin He, Bin Tan, Qian Yang. 2. Tian Xia, Chen Chen.