===========CS 461b/661b: Bioinformatics Tools and Applications=========

TIME: Monday 3:30-5:30 pm, Wednesday 2:30-3:30 pm

LOCATION: MC320. The Wednesday class may be moved to a computer lab later on.

INSTRUCTOR: Bin Ma

OFFICE HOURS: Tuesday 4-5 pm, Wednesday 4-5 pm

TA: TBA

EVALUATION:
   * Two assignments with minor programming work and report writing (10 marks each)
   * One project with major programming, report writing, and a presentation (30 marks)
   * One final exam (50 marks)

DESCRIPTION: This course provides an introduction to several common types of bioinformatics software tools used in biology research. For each software tool studied in the course, the necessary biology background, the computational methods (algorithms), and the user interface will be introduced. Students will get hands-on experience with several software tools and gain some experience in developing bioinformatics software through a course project.

PREREQUISITES: Computer Science 331a/b and 340a/b; Biochemistry 280a is recommended.
   * You must have either the requisites for this course or written special permission from your Dean to enroll in this course.

TEXTBOOK: (Recommended but not required) Computational Molecular Biology: An Algorithmic Approach, by Pavel A. Pevzner

TENTATIVE TOPICS:

1. Whole-genome shotgun sequencing
   a. restriction enzymes
   b. clones
   c. sequencing machines
   d. genome assembly
   e. DIY: genome assembly

2. PCR and primer design
   a. PCR
   b. PCR primer design
   c. DIY: design a pair of primers for a gene

3. Gene prediction
   a. gene-related diseases and gene tests
   b. codons and codon bias
   c. prokaryotic gene structure
   d. HMM gene prediction
   e. eukaryotic gene structure
   f. DIY: find a gene using GenScan

4. NCBI Entrez
   a. data available at NCBI
   b. DIY: get some data from NCBI

5. NCBI BLAST
   a. basic idea of BLAST
   b. BLOSUM matrix and E-value
   c. DIY: download and run BLAST

6. Spaced seeds and PatternHunter
   a. spaced seeds and variants
   b. DIY: compare the sensitivity of spaced seeds and consecutive seeds

7. Mass spectrometry
   a. instruments
   b. isotope peaks and complexity of a spectrum
   c. mass fingerprint and MOWSE score
   d. DIY: predict the isotopes of a molecule
   e. DIY: deconvolute a spectrum
   f. DIY: use mass spec data to identify a protein

8. Tandem mass spectrometry
   a. instruments
   b. data
   c. de novo sequencing vs. database search
   d. DIY: use PEAKS to identify peptides and proteins

9. SPIDER
   a. SPIDER
   b. DIY: use SPIDER to identify proteins

10. LC-MS/MS
   a. liquid chromatography
   b. retention time prediction
   c. DIY: predict the retention time of peptides

11. Protein quantitation
   a. ICAT
   b. iTRAQ
   c. label-free
   d. DIY: analyze a spectrum to compute the quantity of a peptide

12. Next-generation sequencing
   a. Solexa
   b. 454
   c. direct read
   d. direct read + hybridization

PROJECTS TO CHOOSE FROM:

1. Prokaryote gene prediction program
   * the software should use start codons, promoters, and codon bias
   * compare its performance with other software; write a report and analysis
   * suggest future directions

2. Retention time prediction program
   * collect LC-MS/MS data from the Internet
   * use PEAKS to identify the confident peptides
   * use those peptides as training data to learn a hydrophobicity value for each amino acid
   * compare the predicted and experimental retention times; write a report
   * suggest future directions

3. Genome assembler
   * accept many sequences
   * assemble them into one long genome

4. Motif finding
   * accept many protein sequences
   * find motifs

5. Homology-based gene prediction program
   * the software uses BLAST to search for homologs in an EST database
   * report the matched ORFs as potential genes

6. Propose your own project (talk to me before you start)
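
To give a feel for the scale of project 3 (genome assembler), here is a minimal sketch of one possible starting point, greedy overlap merging. It is only an illustration and not a required or recommended design. It assumes error-free reads from a single strand and a genome without repeats. The names `overlap` and `greedy_assemble` and the `min_len` parameter are hypothetical and not part of the course materials.

```python
def overlap(a, b, min_len=3):
    """Return the length of the longest suffix of `a` that is a prefix of `b`.

    Overlaps shorter than `min_len` are ignored and reported as 0.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap.

    Returns the list of contigs left when no overlap of at least
    `min_len` remains (a single contig in the ideal case).
    """
    # Drop exact duplicates (keeping order) and reads contained in other reads.
    reads = list(dict.fromkeys(reads))
    reads = [r for r in reads if not any(r != s and r in s for s in reads)]

    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to merge
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


if __name__ == "__main__":
    print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTTA"]))
```

With the three toy reads above, the sketch merges everything into the single contig `ATGGCGTACCTTA`. Real sequencing data has errors, reads from both strands, and repeats, so a project submission would need to go well beyond this.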