agg_BH.Rd
Performs AKO (Aggregation of Multiple Knockoffs)
agg_BH(Ws_mat, fdr = 0.1, offset = 0, gamma = 0.3)
Ws_mat: A matrix of test statistics from multiple knockoff filters, where each row represents one set of test statistics and each column represents a variable.
fdr: A numeric value of the target false discovery rate (FDR) level. Default is \(0.1\).
offset: An integer (0 or 1) specifying the offset in the empirical p-value calculation. Default is \(0\).
gamma: A numeric value for quantile aggregation in the multiple knockoff p-value aggregation. Default is \(0.3\).
A vector shat containing the indices of selected variables after aggregating knockoff results.
For each knockoff run \(b \in [B]\) (row \(b\) of Ws_mat, with statistics \(W^{(b)}\)), calculate the intermediate p-value \(\pi_j^{(b)}\) for all \(j \in [p]\): $$ \pi_j^{(b)} = \begin{cases} \frac{1 + \#\left\{k: W_k^{(b)} \leq -W_j^{(b)} \right\}}{p}, & \text{if } W_j^{(b)} > 0 \\ 1, & \text{if } W_j^{(b)} \leq 0 \end{cases}$$
Aggregate using the quantile aggregation procedure (Meinshausen et al. 2009): $$ \bar{\pi}_j = \min \left\{1, \frac{q_\gamma\left(\left\{\pi_j^{(b)}: b \in [B]\right\}\right)}{\gamma}\right\} $$
Control FDR using Benjamini-Hochberg step-up procedure (BH, Benjamini & Hochberg 1995):
Order p-values: \(\bar{\pi}_{(1)} \leq \bar{\pi}_{(2)} \leq \ldots \leq \bar{\pi}_{(p)}\).
Find: \(\widehat{k}_{BH} = \max \left\{k: \bar{\pi}_{(k)} \leq \frac{k \alpha}{p}\right\}\), where \(\alpha\) is the target FDR level (fdr).
Select: \(\widehat{\mathcal{S}} = \left\{j \in [p]: \bar{\pi}_{j} \leq \bar{\pi}_{\left(\widehat{k}_{BH}\right)}\right\}\).
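The three steps above can be sketched in base R as follows. This is an illustrative sketch, not the package's implementation: all function names ending in `_sketch` are hypothetical, the `offset` argument is assumed to replace the constant 1 in the numerator of the intermediate p-value (the page does not state this explicitly), and \(q_\gamma\) is approximated with R's default `quantile()` interpolation, which may differ from what `agg_BH` uses.

```r
# Hypothetical helpers for illustration only; not part of the package API.

# Step 1: intermediate p-values for one vector W of knockoff statistics.
# Assumption: `offset` stands in for the constant 1 in the formula above.
empirical_pval_sketch <- function(W, offset = 1) {
  p <- length(W)
  vapply(seq_len(p), function(j) {
    if (W[j] > 0) (offset + sum(W <= -W[j])) / p else 1
  }, numeric(1))
}

# Step 2: quantile aggregation of the B p-values of a single variable.
# Assumption: R's default quantile type is an acceptable stand-in for q_gamma.
quantile_aggregate_sketch <- function(pvals, gamma = 0.3) {
  min(1, unname(quantile(pvals, probs = gamma)) / gamma)
}

# Step 3: Benjamini-Hochberg step-up selection at level alpha.
bh_select_sketch <- function(pbar, alpha = 0.1) {
  p <- length(pbar)
  sorted <- sort(pbar)
  passing <- which(sorted <= seq_len(p) * alpha / p)
  if (length(passing) == 0) return(integer(0))
  which(pbar <= sorted[max(passing)])
}

# All three steps applied to a B x p matrix of statistics.
agg_bh_sketch <- function(Ws_mat, fdr = 0.1, offset = 1, gamma = 0.3) {
  # apply() over rows returns a p x B matrix, so transpose back to B x p.
  pi_mat <- t(apply(Ws_mat, 1, empirical_pval_sketch, offset = offset))
  pbar <- apply(pi_mat, 2, quantile_aggregate_sketch, gamma = gamma)
  bh_select_sketch(pbar, alpha = fdr)
}

# Toy input: two identical runs, three strong signals among ten variables.
W <- c(5, 4, 3, -0.5, 0.2, -0.1, 0.3, -0.2, 0.1, -0.3)
Ws <- matrix(rep(W, 2), nrow = 2, byrow = TRUE)
agg_bh_sketch(Ws, fdr = 0.1, offset = 0, gamma = 0.3)
```

Under these assumptions the toy call with `offset = 0` selects variables 1, 2 and 3. With `offset = 1` the smallest attainable intermediate p-value is \(1/p\), so after division by \(\gamma\) nothing passes the BH threshold on such a small example.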
Nguyen TB, Chevalier JA, Thirion B, Arlot S. Aggregation of multiple knockoffs. In: International Conference on Machine Learning. PMLR; 2020. p. 7283–93.
Tian P, Hu Y, Liu Z et al. Grace-AKO: a novel and stable knockoff filter for variable selection incorporating gene network structures. BMC Bioinformatics.
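set.seed(2024)
n <- 80; p <- 100
# Simulate a design matrix and a response
X <- generate_X(n = n, p = p)
y <- generate_y(X, p_nn = 10, a = 3)
# Generate 10 knockoff copies of X
Xk <- create.shrink_Gaussian(X = X, n_ko = 10)
# Run the knockoff filter on each copy
res1 <- knockoff.filter(X, y, Xk, statistic = stat.glmnet_coefdiff,
                        offset = 1, fdr = 0.1)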
res1
#> Call:
#> knockoff.filter(X = X, y = y, Xk = Xk, statistic = stat.glmnet_coefdiff,
#> fdr = 0.1, offset = 1)
#>
#> Selected variables:
#> [1] 1 2 3 4 5 6 7 8 9 10
#>
#> Frequency of selected variables from 10 knockoff copys:
#> [1] 10 10 10 10 10 8 10 10 9 10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
#> [26] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
#> [51] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
#> [76] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
agg_BH(res1$Ws)
#> [1] 1 2 3 4 5 6 7 8 9 10