CS598 SS Probabilistic Methods for Biological Sequence Analysis (Spring 2007)

Instructor: Saurabh Sinha


Schedule (tentative)

Crash course for project

  1. (01/16) Basic molecular biology: biological processes and sequence features, the genome and its annotation. PowerPoint slides
  2. (01/18) More detailed molecular biology. The importance of gene regulation. Fruit fly segmentation. Binding sites and motifs. Modules and their composition. Tandem repeats. Bioinformatics goals for this system: motif finding (supervised, ab initio), module finding (supervised, ab initio), regulatory pathways and networks. PowerPoint slides
  3. (01/23) Module finding. Hidden Markov models (DEKM 3, BB 8.2). Expectation Maximization (DEKM 11.6). PowerPoint slides
  4. (01/25) Evolution and tree of life. Comparative genomics. Alignments. Multi-species motif finding (alignment-free). Parsimony and its probabilistic interpretation. (FootPrinter, and probability calculations.) PowerPoint slides

Sequence Alignment

  1. (01/30) Sequence similarity, dynamic programming (local alignment and the importance of expected negative score, DEKM 2.3 Smith-Waterman), affine gap penalty (DEKM 2.4). Paper presentation:
  2. (02/01) Quadratic time complexity of pairwise alignment, and the need for something faster. Seed-based local alignments. BLAST. BLAST statistics (DEKM 2.5, 2.7): extreme value distribution.
  3. (02/06) The Bayesian take on pairwise alignment. (DEKM 2.6)
  4. (02/08) Multiple alignment (DEKM 6.5). Algorithms: sum of pairs, progressive alignment, DIALIGN. Profile HMMs (proteins) and Pfam domains.
  5. (02/15) Paper presentation (by Hareesh Gadde):
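
As a rough illustration of the local alignment topic in item 1 (not part of the course materials), the Smith-Waterman recurrence can be sketched as follows. The scoring values (match +2, mismatch -1, linear gap -2) are hypothetical choices made for this example, and the affine gap penalty of DEKM 2.4 is not implemented here.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score of strings a and b.

    Uses a linear gap penalty; all scoring parameters are illustrative
    assumptions, not values taken from the course.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,                    # start a new local alignment here
                H[i - 1][j - 1] + s,  # align a[i-1] with b[j-1]
                H[i - 1][j] + gap,    # gap in b
                H[i][j - 1] + gap,    # gap in a
            )
            best = max(best, H[i][j])
    return best
```

The `0` option in the maximum is what makes the alignment local. It only helps if random matches score negatively on average, which is the "expected negative score" condition mentioned in item 1.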

Motif finding

  1. (02/20) Weight matrix, relative entropy. Paper presentation (Manish Agrawal):
  2. (02/22) Overrepresentation of binding site motifs. Paper presentation
  3. (02/27) Paper presentation (Lyndsy Kron):
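
To make the weight matrix and relative entropy topic of item 1 concrete, here is a minimal sketch (not course material) that scores a set of aligned binding sites against a background distribution. The function name, the uniform background, and the absence of pseudocounts are assumptions made for this example.

```python
import math
from collections import Counter

def relative_entropy(sites, background=None):
    """Relative entropy (in bits) of a motif built from equal-length sites.

    Sums f * log2(f / q) over every column and base, where f is the
    observed frequency in that column and q the background frequency.
    No pseudocounts are added; unobserved bases contribute zero.
    """
    if background is None:
        # Assumed uniform background over the DNA alphabet
        background = {base: 0.25 for base in "ACGT"}
    n = len(sites)
    total = 0.0
    for column in zip(*sites):  # iterate over alignment columns
        for base, count in Counter(column).items():
            f = count / n
            total += f * math.log2(f / background[base])
    return total
```

A perfectly conserved column contributes 2 bits under the uniform background, and a column that matches the background contributes 0.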

Bayesian inference

  1. (03/01) Bayesian Inference (BB 2.3): priors, Gaussian, Gamma, Dirichlet, MAP & ML as the first level of Bayesian inference. Lagrange multipliers. Bayesian example (BB 3.1): the single die model with sequence data or with count data.
  2. (03/06) (Combined with the 03/08 lecture.)
  3. (03/08) The Bayesian approach to motif finding: Expectation-Maximization. MEME and Stubb/PhyME. The statistical mechanics connection (BB 3.2.1 - 3.2.4, 4.4).

Module Detection

  1. (03/13) Hidden Markov Models for module detection. Paper presentation (Wen-Ting Lin):
  2. (03/15) Mid-term examination.
  3. (03/27) Paper presentation (by instructor):

Evolution

  1. (03/29) Evolution models (DEKM 8.1 - 8.3), calculating likelihood of alignment, reversibility, Metropolis algorithm for phylogenetic tree construction (DEKM 8.4), evolutionary models with gaps (DEKM 8.5).
  2. (04/03) Paper presentation (Josh Smith):
  3. (04/05) Evolutionary events: large repeat families, minisatellites, microsatellites. Gene duplications and pseudogenes. Tandem repeat detection (with statistics). Applications: sequence turnover. Implications for probabilistic sequence analysis algorithms. PowerPoint slides
  4. (04/10) Paper presentation (by instructor):
  5. (04/12) To be decided.
  6. (04/17) Paper presentation (by instructor): PowerPoint slides

Evolution and Motif finding

  1. (04/19) Paper presentation (by instructor): PowerPoint slides

Population Genetics

  1. (04/24) (Combined with the 04/26 lecture.)
  2. (04/26) Wright-Fisher model: random drift, the model with selection only, and the model with mutation only. Coalescent theory. Neutral sequence.
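
The random drift part of the Wright-Fisher model can be illustrated with a short simulation (a sketch only, not course material). It assumes a neutral allele with no selection and no mutation; the function and parameter names are hypothetical.

```python
import random

def wright_fisher(n_copies, start_count, generations, seed=0):
    """Simulate neutral drift of one allele in a Wright-Fisher population.

    n_copies is the fixed number of gene copies per generation (2N for
    a diploid population). Each new generation is drawn by sampling
    n_copies times, with replacement, from the current allele frequency.
    Returns the allele count at every generation, including the start.
    """
    rng = random.Random(seed)  # seeded so runs are reproducible
    count = start_count
    trajectory = [count]
    for _ in range(generations):
        p = count / n_copies
        # Binomial(n_copies, p) draw as a sum of Bernoulli trials
        count = sum(1 for _ in range(n_copies) if rng.random() < p)
        trajectory.append(count)
    return trajectory
```

Counts of 0 and `n_copies` are absorbing: once the allele is lost or fixed, drift alone cannot change it.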

Project presentations

  1. (05/01)