Hierarchical models: concepts

1 Introduction

Although originally developed to model TPP-TR datasets, GPMelt is broadly applicable to any time-series dataset with replicates and conditions. For this reason, the naming conventions used in GPMelt are deliberately general.

The GPMelt framework is based on the idea of translating the experimental design parameters (e.g. those of a biological protocol) into a hierarchy. By experimental design parameters, we mean information such as replicates, conditions, and batches.

To translate these experimental design parameters into a hierarchy, the user needs to identify each level of the hierarchy, with Level_1 being the top of the hierarchy and Level_L the bottom level, where \(L\) is the number of levels in the hierarchy (e.g. \(L=3\) for a three-level hierarchical model, \(L=4\) for a four-level model).
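
As a rough illustration of what identifying the levels could look like in practice, the sketch below annotates each observation of a toy three-level design with one label per level. The column names `Level_1` to `Level_3`, the labels, the field `x`, and the helper `level_columns` are illustrative assumptions, not GPMelt's documented input format.

```python
# Toy annotation of observations with hierarchy levels (hypothetical labels).
# Each row is one measurement; Level_1 is the top of the hierarchy and
# Level_3 the bottom, so L = 3 in this example.
observations = [
    {"Level_1": "protein_A", "Level_2": "control", "Level_3": "rep_1", "x": 37.0},
    {"Level_1": "protein_A", "Level_2": "control", "Level_3": "rep_2", "x": 37.0},
    {"Level_1": "protein_A", "Level_2": "treated", "Level_3": "rep_1", "x": 37.0},
    {"Level_1": "protein_A", "Level_2": "treated", "Level_3": "rep_2", "x": 37.0},
]


def level_columns(rows):
    """Return the level columns ordered from top (Level_1) to bottom (Level_L)."""
    names = {key for row in rows for key in row if key.startswith("Level_")}
    return sorted(names, key=lambda name: int(name.split("_")[1]))


levels = level_columns(observations)
L = len(levels)  # number of levels in the hierarchy
print(levels, L)
```

Sorting on the numeric suffix rather than on the string keeps the order correct if a design ever needs ten or more levels.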

2 The example of a simple TPP-TR experimental design

We illustrate in Video 1 how to translate a standard TPP-TR experimental design into levels of the hierarchy.

Video 1: A visual explanation of how to translate a standard TPP-TR experiment into a hierarchical model.

It is important to understand that the hierarchy encodes prior knowledge about which observations are expected to be similar. Returning to the example from Video 1:

  • The bottom level of the hierarchy (Level_3) corresponds to the individual replicates. These are the leaves of the hierarchy: observations measured within a single replicate (e.g., at different temperatures for TPP-TR or at different time points for a time-series dataset) are expected to be strongly correlated. The leaves of the hierarchy define the minimal unit to be modeled: here, each modeled curve is a replicate.
  • Observations from replicates within the same condition are expected to be similar, and this similarity can be captured as correlations between replicates. Thus, the second level of the hierarchy (Level_2) corresponds to the conditions.
  • Finally, observations from replicates across different conditions but within the same ID (here, the same protein) may share more similarities than those from different IDs. Typically, if the treatment condition(s) have no effect, we would expect the melting curves across all conditions to be similar and therefore strongly correlated. For this reason, the top level of the hierarchy (Level_1) is chosen to be the protein \(P\).
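
The nested similarities described in this list can be made concrete with a deliberately simplified sketch. It assumes a toy additive model in which every level that two replicates share contributes a fixed amount of covariance. The variances, the labels, the functions `shared_levels` and `toy_covariance`, and the choice to treat different proteins as independent are all assumptions made for illustration; the sketch ignores the dependence on temperature and is not GPMelt's actual model.

```python
# Each replicate (leaf) is identified by its path from the top of the hierarchy:
# (Level_1 = protein, Level_2 = condition, Level_3 = replicate).
def shared_levels(path_a, path_b):
    """Count how many levels two leaves share, starting from Level_1."""
    shared = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break  # once the paths diverge, lower levels are not shared
        shared += 1
    return shared


# Hypothetical variance contributed by Level_1, Level_2, and Level_3.
LEVEL_VARIANCE = (1.0, 0.5, 0.25)


def toy_covariance(path_a, path_b):
    """Sum the variance of every level the two leaves have in common."""
    return sum(LEVEL_VARIANCE[: shared_levels(path_a, path_b)])


reference = ("P", "control", "rep_1")
same_replicate = toy_covariance(reference, ("P", "control", "rep_1"))
same_condition = toy_covariance(reference, ("P", "control", "rep_2"))
same_protein = toy_covariance(reference, ("P", "treated", "rep_1"))
other_protein = toy_covariance(reference, ("Q", "control", "rep_1"))

print(same_replicate, same_condition, same_protein, other_protein)
```

In this toy model the covariance decreases as the two replicates share fewer levels, which is the ordering of similarities that the hierarchy is meant to express.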

3 A hierarchy = a modeling assumption

Important: Video 1 describes one possible translation of the data into a hierarchy. If you have different hypotheses about the data, the levels of the hierarchy should be adjusted accordingly.

  • For example, if you aim to compare the melting curves across proteins within a given condition, you would define Level_2 as the proteins and Level_1 as the conditions. This approach assumes that most proteins share similar melting behaviour within a condition, and this assumption would be reflected in the hierarchy.
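
To show that changing the hypothesis only changes which design parameter is assigned to which level, the sketch below maps the same record onto two different hierarchies. The function `assign_levels`, the field names `protein`, `condition`, and `replicate`, and the labels are hypothetical and serve only as an illustration.

```python
def assign_levels(record, order):
    """Map the design parameters of one record onto Level_1..Level_L.

    `order` lists the parameter names from the top of the hierarchy
    to the bottom.
    """
    return {f"Level_{i}": record[name] for i, name in enumerate(order, start=1)}


# One replicate of one protein in one condition (hypothetical labels).
record = {"protein": "P", "condition": "treated", "replicate": "rep_2"}

# Hierarchy from Video 1: protein > condition > replicate.
default_hierarchy = assign_levels(record, ["protein", "condition", "replicate"])

# Alternative hypothesis: condition > protein > replicate.
per_condition_hierarchy = assign_levels(record, ["condition", "protein", "replicate"])

print(default_hierarchy)
print(per_condition_hierarchy)
```

The data are identical in both cases; only the modeling assumption about which observations are expected to be similar differs.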