Speaker: Fan Yingying, University of Southern California, USA
Invited by: Prof. Zeng Shaoqun
Time: 10:00-12:00, June 13, 2018
Venue: A101
Abstract:
Many contemporary large-scale applications involve building interpretable models linking a large set of potential covariates to a response in a nonlinear fashion, such as when the response is binary. Although this modeling problem has been extensively studied, it remains unclear how to effectively control the fraction of false discoveries even in high-dimensional logistic regression, not to mention general high-dimensional nonlinear models. To address this practical problem, we propose a new framework of model-X knockoffs, which views from a different perspective the knockoff procedure (Barber and Candès, 2015) originally designed for controlling the false discovery rate in linear models. Whereas the original knockoff procedure is constrained to homoscedastic linear models with n ≥ p, the key innovation here is that model-X knockoffs provide valid inference from finite samples in settings in which the conditional distribution of the response is arbitrary and completely unknown. Furthermore, this holds regardless of the number of covariates. Correct inference in such a broad setting is achieved by constructing knockoff variables probabilistically instead of geometrically. This is joint work with Emmanuel Candès, Yingying Fan and Lucas Janson.