Description of course project
The course project allows the student to do some independent study in the area of Bioinformatics. The list of possible project topics, each with 2-3 starting references, is below. Please feel free to add papers to this list with my approval.
The project work will be assessed through:
- a 3-4 page written report (single spacing, 11-point font), submitted no later than December 7,
- a 15-minute oral presentation on a date TBA.
The project assessment will be based on the following criteria:
- written report: content (coverage, depth of understanding reflected) - 10%
- written report: quality of writing (structure, syntax, proper citations, etc.) - 5%
- oral presentation (clarity and general understanding) - 10%
Unless you are already working on a problem with me, you should pick one area and read 2-3 papers in that area. Your presentation and report should be based on these papers.
Papers
[These are all available electronically].
- DNA and machine learning. Robert (Alzheimer's detection), Saad (cancer detection)
- Machine learning and genome annotation: a match meant to be? Kevin Y Yip, Chao Cheng and Mark Gerstein, link.
- Unsupervised pattern discovery in human chromatin structure through genomic segmentation, Hoffman MM, Buske OJ, Wang J, Weng Z, Bilmes JA, Noble WS, link.
- Machine Learning Concepts and Tools for Statistical Genomics, link.
- DNA and information retrieval.
- A Term Association Approach for Genomics Information Retrieval, link.
- Information Retrieval meets Gene Analysis, link.
- Omic Data Modelling for Information Retrieval, link.
- Bayesian networks and their applications in Bioinformatics. Jessica
- A Primer on Learning in Bayesian Networks for Computational Biology, link.
- A Bayesian network model for protein fold and remote homologue recognition, link.
- BioBayesNet: a web server for feature extraction and Bayesian network modeling of biological sequence data, link.
- RNA, DNA and Protein in Computer Science
- Quake: quality-aware detection and correction of sequencing errors David R Kelley, Michael C Schatz, Steven L Salzberg link
- DECOD: fast and accurate discriminative DNA motif finding, Huggins P, Zhong S, Shiff I, Beckerman R, Laptenko, Prives, Schulz MH, Simon I, Bar-Joseph Z, link.
- Probabilistic error correction for RNA sequencing, link.
- String Algorithms; string similarities and matching. Rafay
- Fast Algorithms for Top-k Approximate String Matching, link.
- A Fast Algorithm for Approximate String Matching on Gene Sequences, link.
- Others
- A beginner's guide to eukaryotic genome annotation, Mark Yandell, Daniel Ence, link.
- Information retrieval from biological databases, Andreas D. Baxevanis, link.
- Bayesian methods in bioinformatics and computational systems biology, link.
- Clustering Algorithms: Dong
- Repeat detection: Jun Lin
- Cancer and Systems Biology: Jonathan