Computes the difference statistic $$W_j = |Z_j| - |\tilde{Z}_j|$$ where \(Z_j\) and \(\tilde{Z}_j\) measure the importance of the j-th variable and its knockoff, respectively, based on the stability of their selection upon subsampling of the data.

stat.stability_selection(X, X_k, y, fitfun = stabs::lars.lasso, ...)

Arguments

X

n-by-p matrix of original variables.

X_k

n-by-p matrix of knockoff variables.

y

response vector (length n).

fitfun

a function that takes the arguments x and y as above, plus the number of variables q to include in each model. The function must fit the model and return a logical vector indicating which variables were selected (among the q selected variables). The name of the function should be prefixed by 'stabs::'.

...

additional arguments specific to 'stabs' (see Details).

Value

A vector of statistics \(W\) of length p.

Details

This function uses the stabs package to compute variable selection stability. The selection stability of the j-th variable is defined as its probability of being selected upon random subsampling of the data. The default method for selecting variables in each subsampled dataset is stabs::lars.lasso().

For a complete list of the available additional arguments, see stabs::stabsel().
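
The notion of selection stability described above can be illustrated with a small base R sketch. It is not the code used by this function: the selector `select_top_q()` and the helper `selection_frequency()` are hypothetical names introduced here, the selector ranks columns by absolute marginal correlation instead of calling stabs::lars.lasso(), and the "knockoffs" are plain independent noise, so the sketch only shows how selection frequencies over random subsamples turn into the statistic \(W\).

```r
# Illustrative only: selection stability by subsampling, in base R.
# Assumption: a top-q marginal-correlation selector stands in for
# stabs::lars.lasso; this is NOT what stat.stability_selection runs.
set.seed(1)
n <- 100; p <- 8; q <- 4; B <- 200

X   <- matrix(rnorm(n * p), n, p)          # original variables
X_k <- matrix(rnorm(n * p), n, p)          # stand-in for knockoffs (independent noise, not valid knockoffs)
y   <- 2 * X[, 1] - 2 * X[, 2] + rnorm(n)  # only variables 1 and 2 carry signal

# Hypothetical selector: logical vector flagging the q columns
# with the largest absolute correlation with y.
select_top_q <- function(x, y, q) {
  score <- abs(drop(cor(x, y)))
  seq_along(score) %in% order(score, decreasing = TRUE)[seq_len(q)]
}

# Fraction of B random half-samples in which each column is selected.
selection_frequency <- function(x, y, q, B) {
  hits <- replicate(B, {
    idx <- sample(nrow(x), floor(nrow(x) / 2))
    select_top_q(x[idx, , drop = FALSE], y[idx], q)
  })
  rowMeans(hits)
}

# Importance of originals and knockoffs, estimated jointly on [X, X_k].
Z <- selection_frequency(cbind(X, X_k), y, q, B)  # length 2p
W <- abs(Z[1:p]) - abs(Z[(p + 1):(2 * p)])         # W_j = |Z_j| - |Z~_j|
round(W, 2)
```

In this sketch a large positive `W[j]` means the original variable is selected far more often than its counterpart, while values near zero or negative suggest the variable behaves like noise.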

Examples

set.seed(2024)
n=80; p=100; k=10; Ac = 1:k; Ic = (k+1):p
X = generate_X(n=n,p=p)
y <- generate_y(X, p_nn=k, a=3)
Xk = create.shrink_Gaussian(X = X, n_ko = 10)
res1 = knockoff.filter(X, y, Xk, statistic = stat.stability_selection,
                       offset = 1, fdr = 0.1)
res1
#> Call:
#> knockoff.filter(X = X, y = y, Xk = Xk, statistic = stat.stability_selection, 
#>     fdr = 0.1, offset = 1)
#> 
#> Selected variables:
#> integer(0)
#> 
#> Frequency of selected variables from 10 knockoff copys:
#>   [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
#>  [38] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
#>  [75] 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
perf_eval(res1$shat,Ac,Ic)
#> [1] 0 0