2019-2020 Final Year Project Titles
Dr. Liang Tian
FSC1006, firstname.lastname@example.org, ext. 7030
Introduction to human microbiome data analysis (3 units)
Humans are colonized by many microorganisms. A long-cited estimate holds that the average human body is inhabited by ten times as many non-human cells as human cells, although more recent estimates put the ratio closer to one to one. The human microbiome refers specifically to the collective genomes of those resident microorganisms. Human-associated microbes form complex and dynamic ecological communities that play a crucial role in determining our health and well-being. Because the microbiome can be altered, it offers a promising target for microbiome-based therapies, such as ingesting probiotics or prebiotics and fecal microbiota transplantation, for treating diseases associated with disrupted microbiota. To improve understanding of the microbial flora involved in human health and disease, the National Institutes of Health launched the Human Microbiome Project (HMP). In this project, students will explore the human microbiome datasets from the HMP and the computational tools and software used to analyze them. The project aims to provide a guided study for those interested in the concepts of the human microbiome and in basic data analysis skills.
Machine-learning disease prediction model based on human gut microbiomics (3/6 units)
Based on a lower-dimensional representation of gut microbiota composition, we plan to build an interpretable machine-learning model for disease prediction. There is often a tradeoff between accuracy and intelligibility in prediction models and classifiers. On one hand, accurate models such as boosted trees, random forests, and neural networks are usually not intelligible and work like a “black box”. On the other hand, intelligible models such as logistic regression, naive Bayes, and single decision trees often have lower accuracy. This tradeoff can limit the accuracy of the models that can be applied in mission-critical applications such as health care, where being able to understand, validate, edit, and trust a learned model is important. In this project, students will systematically compare different machine-learning models, gain a better understanding of their intrinsic nature, and identify a balance point that offers both high accuracy and interpretability.
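As a rough illustration of what such a systematic comparison might look like, the sketch below scores several of the model families named above with 5-fold cross-validation. It assumes scikit-learn is available and uses a synthetic dataset from `make_classification` as a stand-in for a low-dimensional microbiome representation; the sample size, feature count, hyperparameters, and the choice of AUC as the metric are all illustrative assumptions, not part of the project specification.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in (assumption): 300 samples, 10 features playing the role
# of a low-dimensional representation of gut microbiota composition.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)

# Intelligible models first, then the "black box" ensembles.
models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "naive Bayes": GaussianNB(),
    "single decision tree": DecisionTreeClassifier(max_depth=3, random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "boosted trees": GradientBoostingClassifier(random_state=0),
}

# Mean area under the ROC curve over 5 cross-validation folds per model.
scores = {}
for name, model in models.items():
    fold_scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    scores[name] = fold_scores.mean()

# Report models from highest to lowest mean AUC.
for name, auc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:22s} mean AUC = {auc:.3f}")
```

On real data, the ranking and the size of any gap between intelligible and black-box models would depend on the dataset, so the synthetic scores here should not be read as evidence for either side of the tradeoff.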
Universal assembly principles of human microbial community (3/6 units)
A variety of microbial communities exist throughout the human body and play a crucial role in determining our health and well-being. Surveys of the human microbiome have revealed that the taxonomic composition of the microbial community is highly diverse and personalized, while its functional capacity is strikingly conserved across individuals. This observation implies that a universal organizational principle exists in microbial community assembly. Although this organizational principle is thought to underlie the stability and resilience of the human microbiome, its origin is elusive. This project aims to identify the organizational principle in the human microbial community by characterizing its genomic content network, a bipartite network that links microbes to the genes in their genomes. Students will use a semi-automated scheme that decomposes a large genomic content network into subnetworks to facilitate systematic analysis of the network structure and its assembly rules. An interpretable genome evolution model will be built to fit the real data and to explain how selective pressures result in the observed microbial community assembly. The project aims to provide a mechanistic understanding of human microbial community assembly, which is critical for developing function-based diagnostics and therapeutics.
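To make the idea of a genomic content network concrete, here is a minimal sketch that builds a bipartite microbe-gene network from toy data and splits it into subnetworks. The microbe and gene names are hypothetical, and connected components are used only as a simple stand-in for decomposition; the semi-automated scheme used in the project is not specified here and may work quite differently.

```python
from collections import defaultdict, deque

# Hypothetical toy data: each microbe maps to the set of genes in its genome.
genomes = {
    "microbe_A": {"g1", "g2", "g3"},
    "microbe_B": {"g2", "g3", "g4"},
    "microbe_C": {"g5", "g6"},
    "microbe_D": {"g6", "g7"},
}


def build_bipartite(genomes):
    """Return an adjacency map over nodes tagged ('microbe', x) or ('gene', x)."""
    adj = defaultdict(set)
    for microbe, genes in genomes.items():
        for gene in genes:
            adj[("microbe", microbe)].add(("gene", gene))
            adj[("gene", gene)].add(("microbe", microbe))
    return adj


def connected_components(adj):
    """Split the network into connected subnetworks with breadth-first search."""
    seen, components = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


adj = build_bipartite(genomes)

# Gene degree: the number of genomes in which each gene appears.
gene_degree = {
    name: len(neighbours)
    for (kind, name), neighbours in adj.items()
    if kind == "gene"
}
components = connected_components(adj)

print("gene degrees:", dict(sorted(gene_degree.items())))
for i, component in enumerate(components, start=1):
    microbes = sorted(name for kind, name in component if kind == "microbe")
    genes = sorted(name for kind, name in component if kind == "gene")
    print(f"subnetwork {i}: microbes={microbes} genes={genes}")
```

In this toy example the microbes that share genes end up in the same subnetwork, which is the kind of structure a degree distribution or decomposition analysis would then examine on real genomic data.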