Phospho-TPP: a peptide-level dataset

1 Introduction

This page discusses the analyses of the peptide-level dataset, called the Phospho-TPP dataset in Le Sueur, Rattray, and Savitski (2024). The dataset comes from Potel et al. (2021).

In Le Sueur, Rattray, and Savitski (2024), we performed two types of analyses on this dataset: the first uses a three-level hierarchical Gaussian process (HGP) model, and the second uses a four-level HGP model.

This page only documents the parameter choices made in the parameters.txt file. The results are not discussed here (see Le Sueur, Rattray, and Savitski (2024) for a discussion of the results).

2 Data loading and preprocessing

The dataset can be downloaded from Zenodo DOI.

Data are in folders PhosphoTPP/data and PhosphoTPP/prerun.

The data are preprocessed in the following file of the GitLab repository: gpmelt/Analysis/ATP2019/PhosphoTPP_DataPreparation.qmd.

3 The complete parameters.txt file for the three-level HGP model

"Scaling" : "mean",
"modelType" : "3Levels_TwoLengthscales_FixedLevels1and2_FreeLevel3",
"lengthscale_prior": None,
"lengthscale_MinConstraint" : "max",
"mean" : "gpytorch.means.ZeroMean()",
"control_condition": "Non-phospho",
"training_iterations": 700,
"LearningRate" : 0.1,
"Amsgrad" : True,
"n_PredictionsPoints" : 50,
"PlotSave" : True,
"prediction_type" : "predicted_functions",
"GPMelt_statistic" : "ID-wise"

Notes on the parameters:

"Scaling": We use mean scaling on this dataset, in which about half of the melting curves are non-sigmoidal.

"modelType": We use two lengthscales, as peptide-level replicates present fast variations. We also allow each replicate to have a different output-scale, to capture the larger variations that originate from the higher level of noise in peptide-level TPP-TR datasets.

"lengthscale_MinConstraint": Because peptide-level observations are noisier, we favor larger lengthscales to obtain smoother melting curves.

"GPMelt_statistic": A final size of \(S = 10^4\) has been chosen. Note: we also present a group-wise analysis in Le Sueur, Rattray, and Savitski (2024); see Supporting Information B, paragraph “The choice of the null distribution approximation: an example”, and FigS in S1 file.
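As a quick sanity check on a parameter listing like the one in this section, the key/value lines can be read into a Python dictionary. The sketch below is only an illustration. It assumes the lines form the body of a Python dict literal (they use `None` and `True`), and the helper `parse_parameters` is hypothetical. How GPMelt itself reads parameters.txt is not described on this page.

```python
import ast

# Key/value lines copied from the three-level section of this page.
PARAMETERS_TEXT = '''
"Scaling" : "mean",
"modelType" : "3Levels_TwoLengthscales_FixedLevels1and2_FreeLevel3",
"lengthscale_prior": None,
"lengthscale_MinConstraint" : "max",
"mean" : "gpytorch.means.ZeroMean()",
"control_condition": "Non-phospho",
"training_iterations": 700,
"LearningRate" : 0.1,
"Amsgrad" : True,
"n_PredictionsPoints" : 50,
"PlotSave" : True,
"prediction_type" : "predicted_functions",
"GPMelt_statistic" : "ID-wise"
'''


def parse_parameters(text: str) -> dict:
    """Hypothetical helper: treat the text as the body of a dict literal.

    ast.literal_eval only accepts Python literals, so no code is executed.
    """
    return ast.literal_eval("{" + text + "}")


params = parse_parameters(PARAMETERS_TEXT)

# Print each parameter with the Python type it was parsed as.
for name, value in params.items():
    print(f"{name}: {value!r} ({type(value).__name__})")
```

Note that the "mean" entry stays a plain string here; the sketch does not attempt to turn it into a GPyTorch object.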

4 The complete parameters.txt file for the four-level HGP model

Note: The code describing the preprocessing of the data for this analysis has not been added to the GitLab repository yet.

"Scaling" : "mean",
"modelType" : "4Levels_TwoLengthscales_FixedLevels1and2and3_FreeLevel4",
"lengthscale_prior": None,
"lengthscale_MinConstraint" : "max",
"mean" : "gpytorch.means.ZeroMean()",
"control_condition": "Control",
"training_iterations": 500,
"LearningRate" : 0.05,
"Amsgrad" : False,
"n_PredictionsPoints" : 50,
"PlotSave" : True,
"prediction_type" : "predicted_functions",
"GPMelt_statistic" : "ID-wise"

Notes on the parameters:

"modelType": We use two lengthscales, as peptide-level replicates present fast variations. We also allow each replicate to have a different output-scale, to capture the larger variations that originate from the higher level of noise in peptide-level TPP-TR datasets.

"control_condition": We compared each peptide to every other peptide. Because this all-against-all comparison is not implemented yet, we defined the dataset such that each peptide appears once as Control, with all other peptides as treatment conditions.

"GPMelt_statistic": A final size of \(S = 10^4\) has been chosen.
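The workaround used for the control condition, in which each peptide serves once as Control and all other peptides are treatment conditions, can be sketched as follows. This is a minimal illustration under stated assumptions: the function `all_against_all`, the dictionary keys, and the peptide names are hypothetical, and the actual preprocessing code for this analysis is not yet available in the repository.

```python
def all_against_all(peptides):
    """Build one comparison set per peptide.

    In each set, one peptide plays the role of the control and every
    other peptide is listed as a treatment condition.
    """
    designs = []
    for control in peptides:
        treatments = [p for p in peptides if p != control]
        designs.append({"Control": control, "treatments": treatments})
    return designs


# Hypothetical peptide identifiers, for illustration only.
example_peptides = ["pep_A", "pep_B", "pep_C"]
example_designs = all_against_all(example_peptides)

for design in example_designs:
    print(design["Control"], "vs", design["treatments"])
```

With n peptides, this layout yields n comparison sets and n(n-1) control/treatment pairs, so each unordered pair of peptides appears twice, once in each direction.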

References

Le Sueur, Cecile, Magnus Rattray, and Mikhail Savitski. 2024. “GPMelt: A Hierarchical Gaussian Process Framework to Explore the Dark Meltome of Thermal Proteome Profiling Experiments.” PLOS Computational Biology 20 (9): e1011632.
Potel, Clément M, Nils Kurzawa, Isabelle Becher, Athanasios Typas, André Mateus, and Mikhail M Savitski. 2021. “Impact of Phosphorylation on Thermal Stability of Proteins.” Nature Methods 18 (7): 757–59.