
Predicting Functional Phosphosites with MAVE




Protein phosphorylation is a class of post-translational modifications (PTMs) that can control protein function and signaling pathways. However, identifying functional phosphosites remains challenging. Traditional experiments that engineer phospho-mimetic or phospho-inhibitory mutations are low-throughput and time-consuming, so most detectable phosphosites still lack true functional labels. Recent advances in multiplexed assays of variant effect (MAVEs) enable biologists to analyze multiple functional variants simultaneously in a single assay. These MAVE datasets, coupled with cutting-edge machine learning (ML) models, present an exciting opportunity for identifying functional phosphosites.
In this project, we will (1) use existing MAVE datasets to provide functional annotations of all known phosphosites, (2) build machine learning models to predict functional phosphosites based on MAVE labels and residue labels, and (3) validate the ML models using new MAVE data.
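
To make step (1) concrete, here is a minimal sketch of how a functional label might be derived for a phosphosite from MAVE scores. Everything in it is an assumption for illustration, not the project's actual method: the function name `label_phosphosite`, the score convention (wild-type-like variants near 0, loss-of-function variants negative), the `threshold` of -1.0, the choice of phospho-null (S/T to A, Y to F) and phospho-mimetic (to D or E) substitutions, and the toy scores.

```python
from typing import Dict, Optional, Tuple

# Hypothetical substitution conventions (assumptions, not from the project):
# phospho-null removes the phosphorylatable hydroxyl, phospho-mimetic adds a
# negative charge.
PHOSPHO_NULL = {"S": "A", "T": "A", "Y": "F"}
PHOSPHO_MIMETIC = ("D", "E")

# MAVE scores keyed by (protein position, mutant amino acid).
MaveScores = Dict[Tuple[int, str], float]


def label_phosphosite(
    wt_residue: str,
    position: int,
    mave_scores: MaveScores,
    threshold: float = -1.0,  # assumed cutoff; would need calibration per assay
) -> Optional[bool]:
    """Label a site functional if any phospho-null or phospho-mimetic variant
    scores at or below the threshold. Return None if no such variant was
    measured, so unmeasured sites are not mistaken for non-functional ones."""
    if wt_residue not in PHOSPHO_NULL:
        raise ValueError(f"{wt_residue!r} is not a phosphorylatable residue (S/T/Y)")

    candidates = (PHOSPHO_NULL[wt_residue],) + PHOSPHO_MIMETIC
    scores = [
        mave_scores[(position, aa)]
        for aa in candidates
        if (position, aa) in mave_scores
    ]
    if not scores:
        return None
    return min(scores) <= threshold


# Toy data (made up for illustration only).
mave_scores = {
    (15, "A"): -2.3,
    (15, "D"): -0.1,
    (42, "A"): 0.05,
    (42, "E"): -0.2,
}
sites = [("S", 15), ("T", 42), ("Y", 77)]
labels = {pos: label_phosphosite(res, pos, mave_scores) for res, pos in sites}
print(labels)
```

Returning `None` for unmeasured sites keeps missing data separate from negative labels, which matters when the labels are later used to train the models in step (2).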


Machine Learning, Method development

Project Stage


Avg. Hours / Week


Project Provider

Kuan Huang, PhD

Commitment (Months)


Spots Open


Project Lead

Megan Wojciechowicz

Publication if successful?


Trainee authorship criteria (if applicable)

Generation of meaningful and reproducible results that go into the final manuscript, including code (models) and figures.

Preference given to those who can also work with variant files (see relevant skill assessment here:

Required Skill Assessments

Machine Learning in Python

Python for Data Science

Python Programming
