Predicting Functional Phosphosites with MAVE
Protein phosphorylation is a class of post-translational modification (PTM) that can control protein function and signaling pathways. However, identifying functional phosphosites remains challenging. Traditional experiments that engineer phospho-mimetic or phospho-inhibitory mutations are low-throughput and time-consuming. True functional labels are still missing for the majority of detectable phosphosites. Recent advances in multiplexed assays of variant effect (MAVEs) enable biologists to simultaneously analyze multiple functional variants in a single assay. These MAVE datasets, coupled with cutting-edge machine learning (ML) models, present an exciting opportunity for identifying functional phosphosites.
In this project, we will (1) use existing MAVE datasets to provide functional annotations of all known phosphosites, (2) build machine learning models to predict functional phosphosites based on MAVE labels and residue labels, and (3) validate the ML models using new MAVE data.
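As a rough illustration of step (1), the sketch below shows one way MAVE scores might be turned into per-site labels. It is not the project's actual pipeline. The record layout, the protein name, the scores, the 0.5 threshold, and the rule of labeling a site by its phospho-inhibitory substitution (S/T to A, Y to F) are all assumptions made for this example.

```python
# Phospho-inhibitory (phospho-null) substitutions assumed for this sketch.
PHOSPHO_NULL = {"S": "A", "T": "A", "Y": "F"}

# Hypothetical MAVE records: (protein, position, wild-type, mutant, score).
# Scores are invented; here ~1.0 is wild-type-like and ~0.0 is loss of function.
MAVE_RECORDS = [
    ("PROT1", 15, "S", "A", 0.20),
    ("PROT1", 15, "S", "D", 0.95),
    ("PROT1", 42, "T", "A", 0.98),
    ("PROT1", 42, "T", "E", 1.02),
    ("PROT1", 77, "Y", "F", 0.35),
]

# Hypothetical known phosphosites as (protein, position).
PHOSPHOSITES = [("PROT1", 15), ("PROT1", 42), ("PROT1", 77), ("PROT1", 101)]


def annotate_phosphosites(records, sites, threshold=0.5):
    """Label each site from the score of its phospho-null substitution.

    The threshold is an arbitrary assumption and would need calibration
    against each assay's own score distribution.
    """
    # Keep only the phospho-null variant score observed at each position.
    null_scores = {}
    for protein, pos, wt, mut, score in records:
        if PHOSPHO_NULL.get(wt) == mut:
            null_scores[(protein, pos)] = score

    labels = {}
    for site in sites:
        score = null_scores.get(site)
        if score is None:
            labels[site] = "unlabeled"   # no MAVE coverage for this site
        elif score < threshold:
            labels[site] = "functional"  # blocking phosphorylation disrupts function
        else:
            labels[site] = "neutral"
    return labels


labels = annotate_phosphosites(MAVE_RECORDS, PHOSPHOSITES)
for site, label in labels.items():
    print(site, label)
```

Sites labeled this way could serve as training targets for step (2), with the unlabeled sites left for the model to predict.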
Machine Learning, Method development
Avg. Hours / Week
Kuan Huang, PhD
Publication if successful?
Trainee authorship criteria (if applicable)
Generation of meaningful and reproducible results that go into the final manuscript, including code (models) and figures.
Preference given to those who can also work with variant files (see relevant skill assessment here: https://github.com/Bioinformatics-Research-Network/skill-assessments/tree/main/Working%20with%20Genomic%20Variant%20Files)