CARS: Covariate Assisted Ranking and Screening for Large-Scale Two-Sample Inference

 

Tony Cai, Wenguang Sun and Weinan Wang

 

Summary. Two-sample multiple testing has a wide range of applications. The conventional practice is to first reduce the original observations to a vector of p-values and then choose a cutoff to adjust for multiplicity. However, the data reduction step could cause significant loss of information and thus lead to suboptimal testing procedures. In this paper, we introduce a new framework for two-sample multiple testing by constructing primary and auxiliary variables from the original data and incorporating both in the inference procedure to improve the power. A data-driven multiple testing procedure is developed by employing a covariate-assisted ranking and screening (CARS) approach that optimally combines the information from both the primary and auxiliary variables. 

 

The proposed CARS procedure is shown to be asymptotically valid, with proper control of the false discovery rate (FDR). The procedure is implemented in the R package CARS. Numerical results confirm the effectiveness of CARS in FDR control and show that it achieves substantial power gain over existing methods. The CARS procedure is also illustrated through an application to the analysis of a time-course satellite imaging data set for supernova detection.
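
For readers unfamiliar with the conventional practice mentioned in the summary, the following is a minimal Python sketch of that baseline only: each feature is reduced to a two-sample p-value, and a Benjamini-Hochberg cutoff is then chosen to control the FDR. It is not the CARS procedure and does not use the CARS R package. The function names, the normal approximation used for the p-values, and the simulated data are all assumptions made for illustration.

```python
import math
import random


def two_sample_pvalue(x, y):
    """Two-sided p-value for equal means, using a normal approximation
    to the unpooled two-sample statistic (an illustrative assumption)."""
    n1, n2 = len(x), len(y)
    mean_x, mean_y = sum(x) / n1, sum(y) / n2
    var_x = sum((v - mean_x) ** 2 for v in x) / (n1 - 1)
    var_y = sum((v - mean_y) ** 2 for v in y) / (n2 - 1)
    z = (mean_x - mean_y) / math.sqrt(var_x / n1 + var_y / n2)
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(pvalues, alpha=0.05):
    """Return the sorted indices rejected by the Benjamini-Hochberg
    step-up procedure at nominal FDR level alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0  # largest rank whose ordered p-value falls under its threshold
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])


if __name__ == "__main__":
    # Hypothetical simulated data: 200 features, the first 20 with a mean shift.
    random.seed(0)
    pvals = []
    for j in range(200):
        shift = 1.5 if j < 20 else 0.0
        x = [random.gauss(shift, 1.0) for _ in range(30)]
        y = [random.gauss(0.0, 1.0) for _ in range(30)]
        pvals.append(two_sample_pvalue(x, y))
    rejected = benjamini_hochberg(pvals, alpha=0.05)
    print(len(rejected), "features rejected")
```

In this baseline, everything about a feature other than its p-value is discarded before the cutoff is chosen. That reduction step is what the summary identifies as a potential source of information loss.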

 

The paper and web appendix can be downloaded here. 

 

The R package for implementing the proposed CARS procedure is available from CRAN:

https://cran.r-project.org/web/packages/CARS/index.html