Abstract:
Purpose
This paper introduces applied statisticians and econometricians to current research methodology for non-Euclidean data sets. Specifically, it provides the basis and rationale for statistics in Wasserstein space, where the metric on probability measures is a Wasserstein metric arising from optimal transport theory.
Design/methodology/approach
The authors spell out the basis and rationale for using Wasserstein metrics on the data space of (random) probability measures.
Findings
In developing the statistical analysis of non-Euclidean data sets, the paper illustrates how traditional aspects of statistical inference generalize under Fréchet's program.
Originality/value
Besides elaborating a research methodology for this new kind of data analysis, the paper discusses applications of Wasserstein metrics to the robustness of financial risk measures.