Fit a Partial Least Squares (PLS) model for regression (R) or discriminant analysis (DA), with automated cross-validated component selection.
Usage
pls(
  X,
  Y,
  center = TRUE,
  scale = "UV",
  cv = list(method = "k-fold_stratified", k = 7, split = 2/3),
  maxPCo = 5,
  plotting = TRUE
)
Arguments
- X
A numeric matrix or data frame. Each row represents an observation, and each column a metabolic variable.
- Y
Response vector or matrix. Must match the number of rows in X.
- center
Logical. Should data be mean-centered? Default is TRUE.
- scale
Character. Scaling method: "None", "UV" (unit variance), or "Pareto".
- cv
Named list specifying cross-validation settings:
  - method
  Cross-validation type: "k-fold", "k-fold_stratified", "MC", or "MC_balanced".
  - split
  Fraction of observations used for training (used in Monte Carlo CV).
  - k
  Number of folds (k-fold methods) or repetitions (Monte Carlo methods).
- maxPCo
Integer. Maximum number of orthogonal components to test.
- plotting
Logical. If TRUE, model summary (e.g. R2X, Q2, AUROC) is plotted. Default is TRUE.
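The scaling options can be illustrated with a short base R sketch. It assumes the conventional definitions (unit variance divides each centered column by its standard deviation, Pareto divides by the square root of the standard deviation); the package's internal implementation may differ. The helper name scale_matrix() and the simulated data are hypothetical and not part of the package.

```r
# Sketch of the three scaling options under their conventional definitions.
# scale_matrix() is a hypothetical helper, not a function of the package.
scale_matrix <- function(X, center = TRUE, scale = c("UV", "Pareto", "None")) {
  scale <- match.arg(scale)
  X <- as.matrix(X)
  if (center) {
    X <- sweep(X, 2, colMeans(X), "-")   # subtract column means
  }
  s <- apply(X, 2, sd)                   # column standard deviations
  divisor <- switch(scale,
    UV     = s,                          # unit variance: divide by sd
    Pareto = sqrt(s),                    # Pareto: divide by sqrt(sd)
    None   = rep(1, ncol(X))             # no scaling
  )
  # Note: constant (zero-variance) columns would divide by zero here.
  sweep(X, 2, divisor, "/")
}

# Simulated data for illustration only
set.seed(1)
X_demo <- matrix(rnorm(40, mean = 10, sd = 3), nrow = 10, ncol = 4)
X_uv  <- scale_matrix(X_demo, scale = "UV")
X_par <- scale_matrix(X_demo, scale = "Pareto")
```

Under these definitions, Pareto scaling shrinks high-variance variables less aggressively than UV scaling, so large signals keep more of their relative weight.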
Value
An object of class PLS_metabom8, an S4 class with scores, loadings, predictions, and validation statistics.
Details
Cross-validation is used to select the optimal number of predictive components based on Q2 or AUROC. The method supports both regression and classification with binary or multi-class responses. Model interpretability is often best with pairwise class comparisons.
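As a rough illustration of the cross-validated statistic mentioned above, the sketch below computes Q2 with plain k-fold cross-validation, assuming the common definition Q2 = 1 - PRESS / TSS. It is not the package's code: lm() stands in for the PLS fit to keep the example in base R, and cv_q2() and the simulated data are hypothetical.

```r
# Cross-validated Q2 = 1 - PRESS / TSS using k-fold CV.
# lm() is a stand-in for the PLS model, used purely for illustration.
cv_q2 <- function(X, y, k = 7) {
  n <- length(y)
  folds <- sample(rep(seq_len(k), length.out = n))  # random fold labels
  dat <- data.frame(y = y, X)
  pred <- numeric(n)
  for (f in seq_len(k)) {
    test <- folds == f
    fit <- lm(y ~ ., data = dat[!test, , drop = FALSE])   # train on k-1 folds
    pred[test] <- predict(fit, newdata = dat[test, , drop = FALSE])
  }
  press <- sum((y - pred)^2)     # prediction error sum of squares
  tss <- sum((y - mean(y))^2)    # total sum of squares
  1 - press / tss
}

# Simulated regression data for illustration only
set.seed(42)
X_sim <- matrix(rnorm(60 * 3), nrow = 60)
y_sim <- as.numeric(X_sim %*% c(2, -1, 0.5) + rnorm(60, sd = 0.5))
q2 <- cv_q2(X_sim, y_sim, k = 7)
```

In a component-selection setting, a statistic like this would be recomputed for models with an increasing number of components, and the search would stop once it no longer improves.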
References
Geladi, P. & Kowalski, B.R. (1986). Partial least squares regression: a tutorial. Analytica Chimica Acta, 185, 1–17.
Examples
data(covid)
X <- covid$X
an <- covid$an
model <- pls(X, Y = an$type)
#> Performing discriminant analysis.
#> Reducing k to 5 due to small group size (min n = 5).
plotscores(model, an = list(Class = an$type, Clinic = an$hospital, id = 1:nrow(an)), pc = c(1, 2))