inria-00566721, version 2
Tight conditions for consistent variable selection in high dimensional nonparametric regression
Laëtitia Comminges (a, 1, 2), Arnak Dalalyan (b, 1, 2)
COLT - 24th Conference on Learning Theory - 2011 (2011) 19 p.
Abstract: We address the issue of variable selection in the regression model with very high ambient dimension, i.e., when the number of covariates is very large. The main focus is on the situation where the number of relevant covariates, called the intrinsic dimension, is much smaller than the ambient dimension. Without assuming any parametric form of the underlying regression function, we obtain tight conditions under which the set of relevant variables can be estimated consistently. These conditions relate the intrinsic dimension to the ambient dimension and to the sample size. The procedure that is provably consistent under these tight conditions is simple: it compares the empirical Fourier coefficients with an appropriately chosen threshold value.
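As a rough illustration of the thresholding idea described in the abstract, the following minimal Python sketch computes empirical trigonometric Fourier coefficients and keeps the covariates for which at least one coefficient exceeds a threshold. It is not the procedure from the paper: the function names, the uniform design on [0, 1]^d, the restriction to basis functions that depend on a single coordinate, and the threshold constant `c` are all assumptions made for this example.

```python
import numpy as np


def empirical_fourier_coeffs(X, y, max_freq):
    """Empirical coefficients of single-coordinate trigonometric basis functions.

    Assumes the rows of X are drawn uniformly from [0, 1]^d (hypothetical setup).
    Returns an array of shape (d, 2 * max_freq): for coordinate j and
    frequency k = 1..max_freq, the sample means of
    y * sqrt(2) * cos(2*pi*k*X_j) and y * sqrt(2) * sin(2*pi*k*X_j).
    """
    n, _ = X.shape
    freqs = np.arange(1, max_freq + 1)
    angles = 2.0 * np.pi * X[:, :, None] * freqs[None, None, :]  # (n, d, K)
    basis = np.sqrt(2.0) * np.concatenate(
        [np.cos(angles), np.sin(angles)], axis=2
    )  # (n, d, 2K)
    return np.einsum("i,ijk->jk", y, basis) / n


def select_variables(X, y, max_freq=2, threshold=None, c=2.0):
    """Return indices of covariates whose largest |coefficient| exceeds the threshold.

    The default threshold, c * std(y) * sqrt(2 * log(#coefficients) / n), is an
    illustrative choice only, not the threshold derived in the paper.
    """
    n, _ = X.shape
    coeffs = empirical_fourier_coeffs(X, y, max_freq)
    if threshold is None:
        threshold = c * np.std(y) * np.sqrt(2.0 * np.log(coeffs.size) / n)
    return np.flatnonzero(np.max(np.abs(coeffs), axis=1) > threshold)


# Usage on synthetic data: only coordinates 3 and 10 enter the regression function.
rng = np.random.default_rng(0)
X = rng.uniform(size=(2000, 50))
y = (
    2.0 * np.sin(2.0 * np.pi * X[:, 3])
    + 2.0 * np.cos(2.0 * np.pi * X[:, 10])
    + 0.5 * rng.standard_normal(2000)
)
selected = select_variables(X, y)
```

Because this sketch only uses basis functions of one coordinate at a time, it can miss a relevant covariate that acts purely through interactions; multi-index frequencies would be needed to cover that case.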
- a – Université Paris EST
- b – Ecole des Ponts ParisTech
- 1 : Laboratoire d'Informatique Gaspard-Monge (LIGM)
- Université Paris-Est Marne-la-Vallée (UPEMLV) – ESIEE – Ecole des Ponts ParisTech – Fédération de Recherche Bézout – CNRS : UMR8049
- 2 : IMAGINE
- CSTB – Ecole des Ponts ParisTech – Université Paris-Est Créteil Val-de-Marne (UPEC)
- Domain: Mathematics/Statistics; Statistics/Theory
- Available versions: v1 (17-02-2011), v2 (17-02-2011)
- http://hal.inria.fr/inria-00566721
- oai:hal.inria.fr:inria-00566721
- Contributor: Arnak Dalalyan
- Submitted on: Thursday 17 February 2011, 09:33:50
- Last modified on: Tuesday 14 February 2012, 13:58:41