YKnock_filter.Rd
This function runs the model-Y Knockoffs procedure from start to finish, selecting important responses.
YKnock_filter(
X,
Y,
knockoffs = create_YKnock,
statistic = stat_modelY_coef,
fdr = 0.1,
offset = 1
)
X: n-by-p matrix or data frame of predictors.
Y: n-by-r matrix or data frame of responses.
knockoffs: method used to construct knockoffs for the \(Y\) variables. It must be a function taking an n-by-r matrix as input and returning an n-by-r matrix of knockoff variables. By default, approximate model-Y Gaussian knockoffs are used.
statistic: statistic used to assess variable importance. By default, a lasso statistic with cross-validation is used. See the Details section for more information.
fdr: target false discovery rate (default: 0.1).
offset: either 0 or 1 (default: 1). This is the offset used to compute the rejection threshold on the statistics. The value 1 yields a slightly more conservative procedure ("knockoffs+") that controls the false discovery rate (FDR) according to the usual definition, while an offset of 0 controls a modified FDR.
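For reference, the way the offset enters the rejection threshold can be written out. This page does not print the formula, so the expression below is the standard knockoff-filter form and should be read as an assumption about what YKnock_filter computes. Here \(W_j\) denotes the statistic for response \(j\), \(q\) is the value of fdr, and \(t\) ranges over the nonzero values of \(|W_j|\).

```latex
\[
T = \min\left\{ t > 0 :
  \frac{\text{offset} + \#\{j : W_j \le -t\}}
       {\max\left(1,\ \#\{j : W_j \ge t\}\right)} \le q \right\}
\]
```

Responses with \(W_j \ge T\) are selected, and nothing is selected if no \(t\) satisfies the inequality. Under this form, with offset = 1 and fdr = 0.1 the numerator is at least 1, so the inequality can hold only when at least 10 statistics are at or above \(t\).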
An object of class "YKnock". This object is a list containing at least the following components:
matrix of original responses
matrix of knockoff responses
computed test statistics
computed selection threshold
named vector of selected responses
This function creates the knockoffs, computes the importance statistics, and selects responses. It is the main entry point for the YKnock package.
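The selection step can be sketched in base R. The helper below is hypothetical and is not part of YKnock; it assumes the selection rule follows the standard knockoff filter, where a data-dependent threshold is chosen from the statistics W and responses at or above it are selected. The vector W is made-up illustration data.

```r
# Hypothetical helper (not part of YKnock): data-dependent threshold in the
# style of the standard knockoff filter.
knockoff_threshold_sketch <- function(W, fdr = 0.1, offset = 1) {
  stopifnot(offset %in% c(0, 1))
  # Candidate thresholds: distinct nonzero magnitudes of W, in ascending order
  ts <- sort(unique(abs(W[W != 0])))
  for (t in ts) {
    # Estimated false discovery proportion at threshold t
    fdp_hat <- (offset + sum(W <= -t)) / max(1, sum(W >= t))
    if (fdp_hat <= fdr) return(t)
  }
  Inf  # no threshold meets the target, so nothing is selected
}

# Made-up statistics: 14 positive values and 2 negative values
W <- c(5, 4, 3.5, 3, -0.5, 2.5, 2, -1, 1.5, 0.2, 6, 7, 8, 9, 10, 11)

t_plus <- knockoff_threshold_sketch(W, fdr = 0.2, offset = 1)
selected_plus <- which(W >= t_plus)

t_zero <- knockoff_threshold_sketch(W, fdr = 0.2, offset = 0)
selected_zero <- which(W >= t_zero)
```

On this toy input the offset of 1 gives a threshold of 1 and 13 selections, while the offset of 0 gives a threshold of 0.2 and 14 selections, which shows the more conservative behaviour of the larger offset.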
The parameter knockoffs controls how knockoff variables are created. By default, the model-Y scenario is assumed and a multivariate normal distribution is fitted to the original variables \(Y\).
The default importance statistic is stat_modelY_coef.
It is possible to provide custom functions for the knockoff constructions or the importance statistics. Some examples can be found in the vignette.
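As a minimal sketch of the calling convention for a custom knockoff constructor, the function below takes an n-by-r matrix and returns an n-by-r matrix, which is the contract stated for the knockoffs argument. The function name and the demo data are hypothetical. Row permutation is a deliberately naive placeholder: it does not in general satisfy the exchangeability property that valid knockoffs require, so it illustrates the interface only and is not a recommended construction.

```r
# Hypothetical example function (not part of YKnock): returns an n-by-r matrix
# obtained by permuting the rows of Y. Interface illustration only.
permuted_rows_knockoffs <- function(Y) {
  Y <- as.matrix(Y)
  Y[sample(nrow(Y)), , drop = FALSE]
}

set.seed(1)
Y_demo <- matrix(rnorm(40 * 5), nrow = 40, ncol = 5)
Yk_demo <- permuted_rows_knockoffs(Y_demo)
dim(Yk_demo)  # same shape as Y_demo

# With the package loaded, the function would be passed as:
# result <- YKnock_filter(X, Y, knockoffs = permuted_rows_knockoffs)
```

A custom importance statistic would be supplied through the statistic argument in the same way; its expected signature is not given on this page, so it is not sketched here.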
Tingting Zhao, Guangyu Zhu, Patrick Flaherty. Identification of Significant Gene Expression Changes in Multiple Perturbation Experiments using Knockoffs. bioRxiv 2021.10.18.464822.
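library(YKnock)

# Problem dimensions: r responses, p predictors, n observations
r <- 100; p <- 10; n <- 40
m <- 10  # number of important responses
rho <- 0.2
betaValue <- 1
# Equicorrelated predictor covariance: 1 on the diagonal, rho elsewhere
SigmaX <- matrix(rho, p, p)
diag(SigmaX) <- 1
# Only the first m responses depend on X, with coefficients of +/- betaValue
betaA <- matrix(sample(c(-1, 1) * betaValue, size = m * p, replace = TRUE), nrow = m, ncol = p)
beta <- matrix(0, r, p)
beta[1:m, ] <- betaA
sigma <- 1  # noise variance
X <- matrix(rnorm(n * p), n) %*% chol(SigmaX)
Y <- X %*% t(beta) + sqrt(sigma) * matrix(rnorm(n * r), n, r)
# The seed is set after X and Y are simulated, so it fixes only the randomness
# inside YKnock_filter(), not the data above
set.seed(1)
result <- YKnock_filter(X, Y)
print(result$selected)
#> integer(0)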