Modeling changes in cell populations

Martin Modrák

Charles University, Second Faculty of Medicine

Motivating example

Motivating example II

  • peripheral blood T‑cells
    • separately CD4+ and CD8+
  • children newly diagnosed with type 1 diabetes (30)
    • also 1 year after
  • healthy donors, roughly matched (13)

Cell populations

  • taken as given
  • cell populations identified (globally) for all samples
    • Louvain method, hierarchical structure
    • 3 levels, although 1st somewhat degenerate

Cell populations

Assume the populations and their structure are correctly identified

Compositional data

The number of cells is fixed by design

We only learn the relative abundance

Also a potential problem for RNA-seq
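
A minimal sketch of why a fixed cell count only reveals relative abundance. The population names and counts are made up for illustration and are not taken from the study. Only population A changes in absolute terms, yet B and C appear to decrease once everything is expressed as a proportion.

```python
import math

# Hypothetical absolute cell counts (illustrative numbers only).
# Only population "A" truly changes: it doubles.
absolute_before = {"A": 1000, "B": 1000, "C": 2000}
absolute_after = {"A": 2000, "B": 1000, "C": 2000}

def proportions(counts):
    """Relative abundance: all we observe when the number of cells is fixed."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}

prop_before = proportions(absolute_before)
prop_after = proportions(absolute_after)

# Log2 fold change in absolute counts vs. in observed proportions
true_log2fc = {k: math.log2(absolute_after[k] / absolute_before[k])
               for k in absolute_before}
observed_log2fc = {k: math.log2(prop_after[k] / prop_before[k])
                   for k in prop_before}

for k in absolute_before:
    print(k, "true:", round(true_log2fc[k], 3),
          "observed:", round(observed_log2fc[k], 3))
```

Every observed log fold change is shifted by the same constant, the log ratio of the two totals, which is exactly the quantity the data cannot identify.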

Hierarchical structure

Must be taken into account

Change w.r.t. total vs. change w.r.t. parent?
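
A small sketch of how the two questions can give different answers. The hierarchy, population names and counts are hypothetical. The parent population doubles as a whole, so each child changes relative to the total but not relative to its parent.

```python
# Hypothetical counts for a two-level hierarchy (illustrative numbers only).
# Parent population "P" has children "P_a" and "P_b"; "other" is everything else.
control = {"P_a": 100, "P_b": 100, "other": 800}
treated = {"P_a": 200, "P_b": 200, "other": 600}
children_of_P = ["P_a", "P_b"]

def fractions(counts, child, siblings):
    """Return (fraction of all cells, fraction of the parent population)."""
    total = sum(counts.values())
    parent = sum(counts[c] for c in siblings)
    return counts[child] / total, counts[child] / parent

total_ctrl, parent_ctrl = fractions(control, "P_a", children_of_P)
total_trt, parent_trt = fractions(treated, "P_a", children_of_P)

change_vs_total = total_trt / total_ctrl     # fold change w.r.t. all cells
change_vs_parent = parent_trt / parent_ctrl  # fold change w.r.t. the parent

print("fold change vs. total: ", change_vs_total)
print("fold change vs. parent:", change_vs_parent)
```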

Compositional structure

Perfect world: measure the total number of cells in a given volume before library prep

As in RNA-seq, this can be ignored if the data are well behaved.

Mixed model - Intuition

Intuition: change in NK cells overall and then separately in each subpopulation.

Decompose variance. Shrink effects.

Mixed model - Single group, 2 levels

\[
\begin{aligned}
\log\mu_{p,s} &= \text{overall_mean}_p + \text{normalization}_s \\
&+ \text{high_level}_{h[p]} + \text{low_level}_p \\
\text{high_level}_i &\sim N(0, \sigma_\text{high}) \\
\text{low_level}_i &\sim N(0, \sigma_\text{low})
\end{aligned}
\]
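
To make the indexing concrete, here is a hedged simulation sketch of the single-group model. All numeric values, the tiny hierarchy `h` and the Poisson observation model are assumptions made for illustration; the slides specify only the linear predictor and the two normal distributions.

```python
import numpy as np

rng = np.random.default_rng(0)

n_samples = 5
# h[p]: index of the high-level (parent) population of low-level population p
h = np.array([0, 0, 0, 1, 1, 2])
n_low, n_high = len(h), h.max() + 1

sigma_high, sigma_low = 1.0, 0.5  # assumed standard deviations

overall_mean = np.full(n_low, 3.0)                      # overall_mean_p (assumed)
normalization = rng.normal(0.0, 0.3, size=n_samples)    # normalization_s (assumed)
high_level = rng.normal(0.0, sigma_high, size=n_high)   # high_level_i ~ N(0, sigma_high)
low_level = rng.normal(0.0, sigma_low, size=n_low)      # low_level_i ~ N(0, sigma_low)

# log mu_{p,s}: rows are populations p, columns are samples s
log_mu = (overall_mean + high_level[h] + low_level)[:, None] + normalization[None, :]

# Assumption: Poisson counts around mu; the observation model is not given in the slides
counts = rng.poisson(np.exp(log_mu))
print(counts)
```

Populations that share a parent share the same `high_level` term, so they differ only through `low_level` (and `overall_mean`, which is held constant in this sketch).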

Mixed model - Group differences

\[
\begin{aligned}
\log\mu_{p,s} &= \text{overall_mean}_p + \text{normalization}_s \\
&+ \text{high_level}_{h[p]} + \text{low_level}_p \\
&+ \text{is_trt}_s \times (\text{overall_trt} + \text{high_level_trt}_{h[p]} + \text{low_level_trt}_p) \\
\begin{pmatrix} \text{high_level}_i \\ \text{high_level_trt}_i \end{pmatrix} &\sim MVN(0, \Sigma_\text{high}) \\
\begin{pmatrix} \text{low_level}_i \\ \text{low_level_trt}_i \end{pmatrix} &\sim MVN(0, \Sigma_\text{low})
\end{aligned}
\]
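
A hedged sketch of how the terms of the group-difference model combine for one control and one treated sample. The covariance matrices, the hierarchy `h` and all numeric values are assumptions chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)

h = np.array([0, 0, 1, 1])  # parent of each low-level population (hypothetical)
n_low, n_high = len(h), 2

# Assumed 2x2 covariance matrices for (baseline, treatment effect) pairs
Sigma_high = np.array([[1.0, 0.3], [0.3, 0.5]])
Sigma_low = np.array([[0.25, 0.0], [0.0, 0.1]])

# Column 0: *_level, column 1: *_level_trt
high = rng.multivariate_normal(np.zeros(2), Sigma_high, size=n_high)
low = rng.multivariate_normal(np.zeros(2), Sigma_low, size=n_low)

overall_mean = np.full(n_low, 3.0)  # assumed
overall_trt = 0.2                   # assumed
normalization = np.zeros(2)         # two samples, no normalization difference
is_trt = np.array([0, 1])           # sample 0: control, sample 1: treated

baseline = overall_mean + high[h, 0] + low[:, 0]
trt_effect = overall_trt + high[h, 1] + low[:, 1]

# log mu_{p,s}: rows are populations, columns are samples
log_mu = baseline[:, None] + normalization[None, :] + is_trt[None, :] * trt_effect[:, None]
print(log_mu)
```

The treated column differs from the control column by exactly `trt_effect`, which is the sum of the overall, high-level and low-level treatment terms.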

Mixed model - Interpretation

One model, multiple questions

Easier in a Bayesian approach
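
One way to read "one model, multiple questions": with posterior draws in hand, any derived quantity can be computed per draw and then summarized. The sketch below uses synthetic random numbers as stand-ins for posterior draws; they are NOT the output of a fitted model, and the hierarchy `h` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n_draws = 1000
h = np.array([0, 0, 1, 1])  # parent of each low-level population (hypothetical)
n_high = 2

# Stand-in "posterior draws" of log(mu), shape (n_draws, n_populations).
# Synthetic numbers for illustration, not a real posterior.
log_mu_ctrl = rng.normal(3.0, 0.1, size=(n_draws, len(h)))
log_mu_trt = log_mu_ctrl + rng.normal([0.5, 0.0, 0.0, 0.0], 0.1, size=(n_draws, len(h)))

def log_fraction_of_parent(log_mu):
    """Log of each population's share of its parent population, per draw."""
    mu = np.exp(log_mu)
    parent = np.stack([mu[:, h == k].sum(axis=1) for k in range(n_high)], axis=1)
    return np.log(mu / parent[:, h])

def log_fraction_of_total(log_mu):
    """Log of each population's share of all cells, per draw."""
    mu = np.exp(log_mu)
    return np.log(mu / mu.sum(axis=1, keepdims=True))

# Two different questions answered from the same draws
change_vs_parent = log_fraction_of_parent(log_mu_trt) - log_fraction_of_parent(log_mu_ctrl)
change_vs_total = log_fraction_of_total(log_mu_trt) - log_fraction_of_total(log_mu_ctrl)

for name, draws in [("vs. parent", change_vs_parent), ("vs. total", change_vs_total)]:
    lo, mid, hi = np.percentile(draws, [2.5, 50, 97.5], axis=0)
    print(name, "median:", np.round(mid, 2),
          "95% interval:", np.round(lo, 2), np.round(hi, 2))
```

Because each quantity is computed draw by draw, its uncertainty comes along for free, without a separate model per question.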

Some results - high level

Some results - low level vs. parent

Some results - low level vs. total

Some results - more low level

Open questions

Continuous cell states vs. discrete populations

E.g. the “proliferating” populations