FSC1006, firstname.lastname@example.org, ext. 7030
Introduction to human microbiome data analysis (3 units)
Humans are colonized by many microorganisms. A long-cited estimate holds that the average human body harbors ten times as many microbial cells as human cells, although more recent estimates put the ratio closer to one to one. The human microbiome refers specifically to the collective genomes of those resident microorganisms. Human-associated microbes form complex and dynamic ecological communities that play a crucial role in our health and well-being. Because the microbiome can be altered, it offers a promising target for microbiome-based therapies, such as probiotics, prebiotics, and fecal microbiota transplantation, for treating diseases associated with disrupted microbiota. To improve understanding of the microbial flora involved in human health and disease, the National Institutes of Health launched the Human Microbiome Project (HMP). In this project, students will explore the HMP datasets and the computational tools and software used to analyze them. This course aims to provide a guided study for those interested in the concepts of the human microbiome and in basic data analysis skills.
Machine-learning disease prediction model based on human gut microbiome (3/6 units)
Based on a lower-dimensional representation of gut microbiota compositions, we plan to build an interpretable machine-learning model for disease prediction. There is often a tradeoff between accuracy and intelligibility in prediction models and classifiers. On one hand, accurate models such as boosted trees, random forests, and neural networks are usually not intelligible and work like a “black box”. On the other hand, intelligible models such as logistic regression, naive Bayes, and single decision trees are often less accurate. This tradeoff can limit the accuracy of the models that are usable in mission-critical applications such as health care, where being able to understand, validate, edit, and trust a learned model is important. In this project, students will systematically compare different machine-learning models, gain a better understanding of their intrinsic properties, and look for a balance point that offers both high accuracy and interpretability.
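As a rough illustration of the kind of comparison described above, the following minimal sketch scores one intelligible model (logistic regression) and one "black box" model (a random forest) with the same cross-validation setup. It assumes Python with NumPy and scikit-learn, which the project description does not specify. The data are synthetic, and the function names, the number of taxa, and the labeling rule are all hypothetical; nothing here comes from HMP or from the project's actual pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def make_synthetic_abundances(n_samples=200, n_taxa=20, seed=0):
    """Return (X, y): made-up relative abundances and binary labels."""
    rng = np.random.default_rng(seed)
    # Dirichlet draws are non-negative and each row sums to 1,
    # which mimics relative-abundance profiles.
    X = rng.dirichlet(np.ones(n_taxa), size=n_samples)
    # Hypothetical rule: the label depends on taxon 0 versus taxon 1, plus noise.
    signal = X[:, 0] - X[:, 1] + rng.normal(0.0, 0.02, size=n_samples)
    y = (signal > 0).astype(int)
    return X, y


def compare_models(X, y, cv=5):
    """Return the mean cross-validated accuracy for each model."""
    models = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    }
    scores = {}
    for name, model in models.items():
        fold_scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
        scores[name] = float(fold_scores.mean())
    return scores


X, y = make_synthetic_abundances()
scores = compare_models(X, y)
for name, accuracy in scores.items():
    print(f"{name}: mean CV accuracy = {accuracy:.3f}")
```

Accuracy is only half of the comparison. After fitting, the logistic regression exposes one signed weight per feature through its coef_ attribute, which can be read directly, whereas the random forest offers only aggregate scores through feature_importances_. How well either model performs on real microbiome data would need to be checked on the actual datasets.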