
Table 1 Common statistical approaches used in metabolomics data analysis

From: Lessons learned from metabolomics in cystic fibrosis

 

| Method | Purpose | Statistical assumptionsᵃ |
| --- | --- | --- |
| A. Methods that analyze each metabolite separately | | |
| Parametric methods | | |
| Paired t test | Compare two groups | Random sampling, normality, paired samples, no major outliers |
| Student t test | Compare two groups | Random sampling, normality, independent samples, equal variances, no major outliers |
| Welch t test | Compare two groups | Random sampling, normality, independent samples, variances may be unequal, no major outliers |
| Linear model | Compare two or more groups, with the option to control for confounders | Random sampling; linearity and additivity; errors are independent, homoscedastic, and normally distributed; no major outliers |
| Nonparametric methods | | |
| Wilcoxon signed rank test | Compare two groups | Random sampling, paired samples, differences between paired samples have a symmetrical distribution |
| Mann-Whitney U test | Compare two groups | Random sampling, independent samples |
| Kruskal-Wallis ANOVA | Compare more than two groups | Random sampling, independent samples |
| B. Methods that analyze all of the metabolites simultaneously | | |
| Unsupervised classification methods | | |
| PCA | Detect major patterns in the data; detect outliers | Linearity |
| Supervised classification methods | | |
| PLS-DA | Find metabolites that best separate two or more study groups | Linearity, no major outliers |
| OPLS-DA | Find metabolites that best separate two or more study groups, with easier result interpretation than PLS-DA | Linearity, no major outliers |

ᵃ The assumption of continuous data is not listed because all metabolomics data are continuous and meet this assumption.
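
As a rough illustration of the "analyze each metabolite separately" approach in part A of the table, the sketch below computes a Welch t statistic (with Welch-Satterthwaite degrees of freedom) and a Mann-Whitney U statistic for each metabolite in turn. It is only a minimal sketch using the Python standard library: the metabolite names, the intensity values, and the helper functions `welch_t` and `mann_whitney_u` are hypothetical and do not come from the article. It stops at the test statistics, since converting them to p-values needs a t-distribution or U-distribution routine from a statistics package.

```python
import math
import statistics


def welch_t(x, y):
    """Return (t, df) for Welch's t test on two independent samples."""
    nx, ny = len(x), len(y)
    # Sample variances (n - 1 denominator), scaled by group size.
    vx_n = statistics.variance(x) / nx
    vy_n = statistics.variance(y) / ny
    t = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(vx_n + vy_n)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (vx_n + vy_n) ** 2 / (vx_n ** 2 / (nx - 1) + vy_n ** 2 / (ny - 1))
    return t, df


def mann_whitney_u(x, y):
    """Return U for x: number of pairs with x_i > y_j, ties counted as 0.5."""
    return sum(
        1.0 if xi > yj else 0.5 if xi == yj else 0.0
        for xi in x
        for yj in y
    )


# Hypothetical intensities: metabolite -> (control group, case group).
data = {
    "metabolite_A": ([1.0, 2.0, 3.0, 4.0], [3.0, 4.0, 5.0, 6.0]),
    "metabolite_B": ([2.0, 2.5, 3.0, 3.5], [2.1, 2.4, 3.1, 3.4]),
}

# One independent test per metabolite, as in part A of the table.
results = {}
for name, (control, case) in data.items():
    t, df = welch_t(control, case)
    u = mann_whitney_u(control, case)
    results[name] = {"t": t, "df": df, "U": u}
    print(f"{name}: Welch t = {t:.3f} (df = {df:.1f}), Mann-Whitney U = {u:.1f}")
```

Because every metabolite is tested on its own, the loop runs one test per metabolite. The parametric statistic relies on the normality assumption listed in the table, whereas the rank-based U statistic does not.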