1 The parameters.txt file

In this step, we create the parameters.txt file, which should be saved with the data.

This file contains the parameters for the model specification and fitting.

Below, we point to the important pages to consult in order to understand these parameters and their effects on the model specification and fits. Do not hesitate to play around with these parameters to better understand them!

2 A complete example of the parameters.txt file

To start, we present below an example of the file:

{
"Scaling" : "Scaling_ToDefine",
"modelType" : "Model_ToDefine",
"lengthscale_prior": None, 
"lengthscale_MinConstraint" : "lengthscale_Constraint_ToDefine", 
"mean" : "gpytorch.means.ZeroMean()", 
"control_condition": "ControlCondition_ToDefine", 
"training_iterations": 500 , 
"LearningRate" : 0.1,
"Amsgrad" : False, 
"n_PredictionsPoints" : 50, 
"PlotSave" : True, 
"prediction_type" : "predicted_functions",
"GPMelt_statistic" : "Statistic_ToDefine"
}

Note:

  1. These parameters cannot be left blank!
  2. The following values in the example above do not correspond to default values of the parameters and should be consciously chosen by the user: Scaling, modelType, lengthscale_MinConstraint, control_condition, GPMelt_statistic.
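Note that the example above uses Python literals (None, True, False) rather than JSON keywords (null, true, false), so it parses as a Python dict literal but not as JSON. A minimal sketch of reading such a file with the standard library (this snippet is illustrative, not part of GPMelt):

```python
import ast

# Parameters written in the Python-literal style shown above
# (None/True/False instead of JSON's null/true/false).
params_text = """{
"Scaling" : "Scaling_ToDefine",
"lengthscale_prior": None,
"Amsgrad" : False,
"PlotSave" : True,
"training_iterations": 500
}"""

# ast.literal_eval safely evaluates Python literals, unlike eval().
params = ast.literal_eval(params_text)
print(params["training_iterations"])  # 500
print(params["lengthscale_prior"])    # None
```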

We now go step by step to explain this parameters.txt file.

4 Parameters for the model estimation

Following the GPyTorch (Gardner et al. 2018) routine, we use type II MLE to train the hyperparameters of the full HGP model \(\mathcal{M}_1\). Some parameters of this algorithm can be tuned; see here, section 5.

4.1 Number of iterations

'training_iterations': 500

A larger number of iterations might be required if the model is more complex (e.g. a large number of conditions).

4.2 Learning rate

Also see the GPyTorch documentation for more information.

'LearningRate': 0.1

Can be adjusted if needed.

4.3 Whether to use the AMSGrad variant of the Adam algorithm

Also see the Adam documentation for more information.

'Amsgrad' : False

Can be set to True or False.

4.4 Number of points at which to predict the posterior mean and 95% confidence regions

See here, section 8, for a visualisation of how this number of points affects the prediction.

'n_PredictionsPoints' : 50

Can be adjusted if needed.

4.5 Application to ATP 2019

In (Le Sueur, Rattray, and Savitski 2024), the following values have been selected:

'training_iterations': 500,
'LearningRate': 0.1,
'Amsgrad': False,
'n_PredictionsPoints': 50

5 Parameters for the plots

5.1 Type of predictions for the fits plots

"prediction_type" : "prediction_type_ToDefine"

Can take the values predicted_functions or predicted_observations.

We refer to the GPyTorch documentation about GP regression:

  • predicted_functions : returns the model posterior distribution \(p(f^* \mid x^*, X, y)\) for training data \(X, y\). This posterior is the distribution over the function we are trying to model, and thus quantifies our model uncertainty.
  • predicted_observations : returns the posterior predictive distribution \(p(y^* \mid x^*, X, y)\), i.e. the probability distribution over the predicted output values; here the prediction is over the observed values at the test points.

5.2 Should the set of plots generated for each ID (monitoring convergence, depicting the fits and the covariance matrices of the full and joint models) be saved?

'PlotSave' : True

Can be changed to False if needed.

5.3 Application to ATP 2019

In (Le Sueur, Rattray, and Savitski 2024), the following values have been selected:

"prediction_type": "predicted_functions",
"PlotSave": True

7 The complete parameters.txt file for ATP 2019

{
"Scaling" : None,
"modelType" : "3Levels_OneLengthscale_FixedLevels1and2and3",
"lengthscale_prior": None, 
"lengthscale_MinConstraint" : "min", 
"mean" : "gpytorch.means.ZeroMean()", 
"control_condition": "Vehicle", 
"training_iterations": 500 , 
"LearningRate" : 0.1,
"Amsgrad" : False, 
"n_PredictionsPoints" : 50, 
"PlotSave" : True, 
"prediction_type" : "predicted_functions",
"GPMelt_statistic" : "dataset-wise"
}

7.1 Save the parameters.txt file

The updated parameters.txt file should be saved in the folder Nextflow/dummy_data/ATP2019, using the name parameters.txt.
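If you prefer to create the file programmatically rather than by hand, a plain text write suffices (a sketch; the path and file name follow the instruction above, and the content is the ATP 2019 example from section 7):

```python
from pathlib import Path

# The ATP 2019 parameters, verbatim from the complete example above.
params_text = """{
"Scaling" : None,
"modelType" : "3Levels_OneLengthscale_FixedLevels1and2and3",
"lengthscale_prior": None,
"lengthscale_MinConstraint" : "min",
"mean" : "gpytorch.means.ZeroMean()",
"control_condition": "Vehicle",
"training_iterations": 500,
"LearningRate" : 0.1,
"Amsgrad" : False,
"n_PredictionsPoints" : 50,
"PlotSave" : True,
"prediction_type" : "predicted_functions",
"GPMelt_statistic" : "dataset-wise"
}"""

# Write to the expected location; adjust the path to your own setup.
out_dir = Path("Nextflow/dummy_data/ATP2019")
out_dir.mkdir(parents=True, exist_ok=True)
(out_dir / "parameters.txt").write_text(params_text)
```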

References

Gardner, Jacob, Geoff Pleiss, Kilian Q. Weinberger, David Bindel, and Andrew G. Wilson. 2018. “GPyTorch: Blackbox Matrix-Matrix Gaussian Process Inference with GPU Acceleration.” Advances in Neural Information Processing Systems 31.
Le Sueur, Cecile, Magnus Rattray, and Mikhail Savitski. 2024. “GPMelt: A Hierarchical Gaussian Process Framework to Explore the Dark Meltome of Thermal Proteome Profiling Experiments.” PLOS Computational Biology 20 (9): e1011632.