Robust statistical procedures for high dimensional data based on distances
Brief description
High-dimensional statistical inference has become a very active area of research in recent years, driven by the exponential growth in the availability of data with millions of features (variables such as genes, networks, etc.). Datasets in which the number of observations (n) is smaller than the number of features or variables (p) cannot be analyzed with classical statistical methods, so advanced procedures and efficient computational algorithms need to be developed for them. This encompasses new procedures for (a) supervised regression and classification models where the number of covariates is of much larger order than n; (b) unsupervised settings such as clustering or graphical modeling with more variables than observations; and (c) multiple testing where the number of null hypotheses to be tested is larger than the sample size. High-dimensional data are prevalent in many domains of modern science, such as genomics (Wu et al., 2009), neuroimaging (Jenatton et al., 2012; Vu et al., 2011) and economics (Fan et al., 2011). Fan and Li (2006) presented a few problems from various frontiers of research to illustrate the challenges of high dimensionality; these included areas such as computational biology, health studies, financial engineering and risk management, machine learning and data mining.
Researchers
- Leandro Pardo Llorente, (Professional, Faculty of Mathematics, UCM)
- Pedro Miranda Menéndez, (Professional, Faculty of Mathematics, UCM)
- Nirian Martín Apaolaza, (Professional, Faculty of Mathematics, UCM)
- Elena María Castilla González, (PhD student, Assistant researcher, Faculty of Mathematics, UCM)
External Collaborators
- Narayanaswamy Balakrishnan (McMaster University, Hamilton, Ontario, Canada)
- Ayan Basu (Indian Statistical Institute, India)
- Kostas Zografos (University of Ioannina, Probability-Statistics & Operational Research Unit, Greece)
- Abhik Ghosh (Indian Statistical Institute, India)
Publications
- P. García-Segador, P. Miranda. On the Polytope of 3-Tolerant Fuzzy Measures. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems. 2023, 31,1. https://doi.org/10.1142/S0218488523500058
- A. Felipe, M. Jaenada, P. Miranda, L. Pardo. Restricted Distance-Type Gaussian Estimators Based on Density Power Divergence and Their Applications in Hypothesis Testing. Mathematics. 2023, 11, 1480. https://doi.org/10.3390/math11061480
- N. Balakrishnan, E. Castilla, N. Martin, L. Pardo. Power divergence approach for one-shot device testing under competing risks. Journal of Computational and Applied Mathematics, 2023, 419. https://doi.org/10.1016/j.cam.2022.114676
- E. Castilla, P. J. Chocano. A New Robust Approach for Multinomial Logistic Regression With Complex Design Model. IEEE Transactions on Information Theory. 2022, 68 (11), pp. 7379-7395. https://doi.org/10.1109/TIT.2022.3187063
- E. Castilla, M. Jaenada, N. Martín, L. Pardo. Robust approach for comparing two dependent normal populations through Wald-type tests based on Rényi’s pseudodistance estimators. Statistics and Computing, 2022, 32(6), 100. https://doi.org/10.3390/e24050616
- A. Ghosh, M. Jaenada, L. Pardo. Classification of COVID19 Patients Using Robust Logistic Regression. Journal of Statistical Theory and Practice, 2022, 16(4), 67. https://doi.org/10.1007/s42519-022-00295-3
- N. Martín Apaolaza. El quincunx, un juego de azar de museo. Boletín del IMI, No. 45 (April 21, 2022), section "1+400. Divulgación con 1 imagen y 400 palabras".
- E. Castilla, M. Jaenada, L. Pardo. Estimation and testing on independent not identically distributed observations based on Rényi’s pseudodistances. IEEE Transactions on Information Theory. https://doi.org/10.1109/TIT.2022.3158308
- N. Balakrishnan, E. Castilla, M. H. Ling. Optimal designs of constant-stress accelerated life-tests for one-shot devices with model misspecification analysis. Quality and Reliability Engineering International. 2022, 38, 989-1012. https://doi.org/10.1002/qre.3031
- N. Balakrishnan, E. Castilla. EM-based likelihood inference for one-shot device test data under log-normal lifetimes and the optimal design of a CSALT plan. Quality and Reliability Engineering International. 2022, 38, 780-799. https://doi.org/10.1002/qre.3014
- M. Jaenada, L. Pardo. Robust Statistical Inference in Generalized Linear Models Based on Minimum Renyi’s Pseudodistance Estimators. Entropy. 2022, 24(1), 123. https://doi.org/10.3390/e24010123
- E. Castilla, N. Martín, L. Pardo. Testing linear hypotheses in logistic regression analysis with complex sample survey data based on phi-divergence measures. Communications in Statistics - Theory and Methods. 2021, 50, 22, 5228-5247. https://doi.org/10.1080/03610926.2020.1746342
- P. Miranda, P. García-Segador. Pointed order polytopes: Studying geometrical aspects of the polytope of bi-capacities. Fuzzy Sets and Systems. 2021. https://doi.org/10.1016/j.fss.2021.11.001
- E. Castilla, K. Zografos. On distance-type Gaussian estimation. Journal of Multivariate Analysis. 2021, Article number 104831. https://doi.org/10.1016/j.jmva.2021.104831
- E. Castilla, A. Ghosh, N. Martín, L. Pardo. Robust semiparametric inference for polytomous logistic regression with complex survey design. Advances in Data Analysis and Classification. 2021, 15, 701-734. https://doi.org/10.1007/s11634-020-00430-7
- L. Pardo, N. Martín. Robust Procedures for Estimating and Testing in the Framework of Divergence Measures. Entropy. 2021, 23(4):430. https://doi.org/10.3390/e23040430
- A. Basu, A. Ghosh, N. Martín, L. Pardo. A Robust Generalization of the Rao Test. Journal of Business and Economic Statistics. 2021. https://doi.org/10.1080/07350015.2021.1876711
- A. Calviño, N. Martín, L. Pardo. Robustness of Minimum Density Power Divergence Estimators and Wald-type test statistics in loglinear models with multinomial sampling. Journal of Computational and Applied Mathematics. 2021, 386, Article number 113214. https://doi.org/10.1016/j.cam.2020.113214
- N. Balakrishnan, E. Castilla, N. Martin, L. Pardo. Divergence-Based Robust Inference Under Proportional Hazards Model for One-Shot Device Life-Test. IEEE Transactions on Reliability. 2021, 70, 4, 1355-1367. https://doi.org/10.1109/TR.2021.3062289
- P. García-Segador, P. Miranda. Order cones: a tool for deriving k-dimensional faces of cones of subfamilies of monotone games. Annals of Operations Research. 2020, 295 (1), 117-137. https://link.springer.com/article/10.1007/s10479-020-03712-7
- J. M. Alonso-Revenga, N. Martín, L. Pardo. New statistics to test log-linear modeling hypothesis with no distributional specifications and clusters with homogeneous correlation. Journal of Computational and Applied Mathematics. 374, 112757. https://doi.org/10.1016/j.cam.2020.112757
- N. Balakrishnan, E. Castilla, N. Martín, L. Pardo. Robust inference for one-shot device testing data under exponential lifetime model with multiple stresses. Quality and Reliability Engineering International. 2020, 1-15. https://doi.org/10.1002/qre.2665
- N. Balakrishnan, E. Castilla, N. Martín, L. Pardo. Robust Inference for One-Shot Device Testing Data Under Weibull Lifetime Model. IEEE Transactions on Reliability. 2020, 69, 3, 937-953. https://doi.org/10.1109/TR.2019.2954385
- A. Basu, A. Ghosh, A. Mandal, N. Martin, L. Pardo. Robust Wald-type tests in GLM with random design based on minimum density power divergence estimators. Statistical Methods & Applications. 2020. https://doi.org/10.1007/s10260-020-00544-4
- E. Castilla, N. Martín, S. Muñoz, L. Pardo. Robust Wald-type tests based on minimum Rényi pseudodistance estimators for the multiple linear regression model. Journal of Statistical Computation and Simulation. 2020, 90, 14. https://doi.org/10.1080/00949655.2020.1787410
- E. Castilla, N. Martín, L. Pardo. Testing linear hypotheses in logistic regression analysis with complex sample survey data based on phi-divergence measures. Communications in Statistics - Theory and Methods. 2020. https://doi.org/10.1080/03610926.2020.1746342
- E. Castilla, N. Martín, L. Pardo, K. Zografos. Model Selection in a Composite Likelihood Framework Based on Density Power Divergence. Entropy. 2020, 22(3), 270. https://doi.org/10.3390/e22030270
News
- June 7, 2023. Maria Jaenada, member of IMI and PhD student of Professors N. Balakrishnan and L. Pardo, has received the "Best Young Statistician Award" from the Greek Statistical Society for the paper "Step-stress test experiments under interval censored data with lognormal lifetime distribution". The award was presented during the society's thirty-fifth congress, held at the University of West Attica, Athens, from May 25 to 28.
Congratulations, Maria!
- November 21, 2022. Five IMI members and one member of its scientific committee appear in the "Rankings of the 95 most important researchers in MATHEMATICS resident in Spain", published by Grupo DIH in November 2022. In the "MATHEMATICS" category, four of our researchers are among the top 14 (of a list of 24): Juan Luis Vázquez Suárez is number 1, Jesús Ildefonso Díaz Díaz number 8, Julián López Gómez number 9 and Juan Benigno Seoane Sepúlveda number 14. In the "STATISTICS & PROBABILITY" category, Laureano Escudero Bueno (member of the IMI Scientific Committee) is in position 10 and our colleague Leandro Pardo Llorente is in position 18 (of a list of 21). Congratulations to all six. The ranking is updated periodically and can be viewed at this link.
- June 30, 2021. Fundación BBVA. The power of statistics. Video interviews with the winners of the SEIO – Fundación BBVA 2020 Awards and of the SEIO Medals. Among them are Leandro Pardo (IMI member) and Laureano Escudero (member of the IMI External Advisory Committee).