Workshop on random matrix theory and high-dimensional statistics for complex systems

University of Luxembourg, Department of Mathematics.

August 29 - September 1, 2023.

Chiara Amorino: Minimax rate for multivariate data under componentwise local differential privacy constraints
Abstract: In this talk, we analyse the balance between maintaining privacy and preserving statistical accuracy when dealing with multivariate data that is subject to componentwise local differential privacy (CLDP). With CLDP, each component of the private data is made public through a separate privacy channel. This allows for varying levels of privacy protection for different components or for the privatization of each component by different entities, each with their own distinct privacy policies. We develop general techniques for establishing minimax bounds that shed light on the statistical cost of privacy in this context, as a function of the privacy levels $\alpha_1, \ldots, \alpha_d$ of the $d$ components. We demonstrate the versatility and efficiency of these techniques by presenting various statistical applications. Specifically, we examine nonparametric density and covariance estimation under CLDP, providing upper and lower bounds that match up to constant factors, as well as an associated data-driven adaptive procedure. Furthermore, we quantify the probability of extracting sensitive information from one component by exploiting the fact that, on another component which may be correlated with the first, a smaller degree of privacy protection is guaranteed.
This is based on a joint work with A. Gloter.
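To make the privatization mechanism concrete, here is a minimal sketch (not the estimators from the talk) in which each coordinate is released through its own Laplace channel with its own privacy level $\alpha_j$; the clipping bound, the choice of the Laplace mechanism, and all function names are illustrative assumptions.

```python
import numpy as np

def componentwise_laplace(x, alphas, bound=1.0, rng=None):
    """Privatize each coordinate of x through its own alpha_j-LDP Laplace channel.

    Coordinates are clipped to [-bound, bound], so each channel has
    sensitivity 2*bound and noise scale 2*bound / alpha_j.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.clip(np.asarray(x, dtype=float), -bound, bound)
    scales = 2.0 * bound / np.asarray(alphas, dtype=float)
    return x + rng.laplace(scale=scales)

# Example: component 0 is strongly protected, component 1 only weakly.
rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 2))
private = np.array([componentwise_laplace(row, alphas=[0.5, 5.0], rng=rng)
                    for row in data])
print(private.mean(axis=0))  # noisy but unbiased estimates of the two means
```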
Zhigang Bao: Phase Transition of Eigenvectors for Spiked Random Matrices
Abstract: In this talk, we will first provide an overview of recent findings concerning eigenvectors of random matrices under fixed-rank deformation. We will then shift our focus towards analyzing the limit distribution of the leading eigenvectors of deformed models in the critical regime of the Baik-Ben Arous-Péché (BBP) phase transition. The distribution is determined by a determinantal point process with an extended Airy kernel. This result can be seen as an eigenvector counterpart to the BBP eigenvalue phase transition.
The talk will be based on a joint work with Dong Wang.
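The eigenvalue side of the BBP transition is easy to observe numerically. The following sketch, under illustrative choices (a GOE matrix with a rank-one additive spike), shows the top eigenvalue detaching from the bulk edge once the spike strength exceeds the critical value; the eigenvector statistics studied in the talk are not reproduced here.

```python
import numpy as np

def top_eig_spiked_wigner(n, theta, rng):
    """Largest eigenvalue of a rank-one additive deformation of a Wigner matrix."""
    g = rng.normal(size=(n, n))
    w = (g + g.T) / np.sqrt(2 * n)          # GOE normalization, bulk edge at 2
    v = np.ones(n) / np.sqrt(n)             # unit spike direction
    return np.linalg.eigvalsh(w + theta * np.outer(v, v))[-1]

rng = np.random.default_rng(1)
for theta in [0.5, 1.0, 1.5, 2.0]:
    lam = top_eig_spiked_wigner(1000, theta, rng)
    pred = 2.0 if theta <= 1 else theta + 1 / theta   # BBP prediction
    print(f"theta={theta:.1f}: top eigenvalue {lam:.3f}, prediction {pred:.3f}")
```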
Yannick Baraud: A new look at Bayesian Statistics
Abstract: We address the problem of estimating the distribution of presumed i.i.d. observations within the framework of Bayesian statistics. To do this, we consider a statistical model for the distribution of the data as well as a prior on it and we propose a new posterior distribution that shares some similarities with the classical Bayesian one. ln particular, when the statistical model is exact, we show that this new posterior distribution concentrates its mass around the target distribution, just as the classical Bayesian posterior would do under appropriate assumptions. Nevertheless, we establish that this concentration property holds under weaker assumptions than those generally required for the classical Bayesian posterior. Specifically, we do not require that the prior distribution allocates sufficient mass on Kullback-Leibler neighbourhoods but only on the larger Hellinger ones. More importantly, unlike the classical Bayesian distribution, ours proves to be robust against a potential misspecification of the prior and the assumptions we started from. We prove that the concentration properties we establish remain stable when the equidistribution assumption is violated or when the data are i.i.d. with a distribution that does not belong to the model but only lies close enough to it. The results we obtain are nonasymptotic and involve explicit numerical constants.
Denis Belomestny: Provable Benefits of Policy Learning from Human Preferences
Abstract: A crucial task in reinforcement learning (RL) is a reward construction. It is common in practice that no obvious choice of reward function exists. Thus, a popular approach is to introduce human feedback during training and leverage such feedback to learn a reward function. Among all policy learning methods that use human feedback, preference-based methods have demonstrated substantial success in recent empirical applications such as InstructGPT. In this work, we develop a theory that provably shows the benefits of preference-based methods in tabular and linear MDPs. The main idea of our method is to use KL-regularization with respect to the learned policy to ensure more stable learning.
Benoît Collins: On the norm of random matrices with a tensor structure
Abstract: Random matrices with tensor structures are important in many areas, including operator algebras, artificial intelligence, graph theory, etc. An important problem is to establish limit theorems for the operator norm of models obtained from algebraic operations involving multiple copies of such random tensors. This talk will describe more precisely relevant questions and recent progress and applications.
It is primarily based on collaborations with Charles Bordenave.
Arnak Dalalyan: Langevin Monte Carlo: Randomized mid-point method revisited
Abstract: Langevin Monte Carlo is an efficient and widely used method for generating random samples from a given target distribution in a high-dimensional Euclidean space. Various variants of the Langevin Monte Carlo method have been proposed and discussed in the literature; depending on the properties of the target distribution, some variants may be preferred to others. Among these variants, it has been shown that the Randomized Midpoint Langevin Monte Carlo (RMP-LMC) method has the best known non-asymptotic theoretical guarantees on the sampling error, when the log-density of the target distribution has a continuous Lipschitz gradient. The objective of this talk is to review these results, as well as to present some extensions and improvements.
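A minimal sketch of the randomized midpoint discretization of the overdamped Langevin diffusion $dX_t = -\nabla f(X_t)\,dt + \sqrt{2}\,dB_t$, with the two Brownian increments coupled as the scheme requires; the step size and the Gaussian test target are illustrative assumptions, and none of the talk's refinements are reproduced.

```python
import numpy as np

def rmp_lmc(grad_f, x0, step, n_iter, rng):
    """Randomized-midpoint discretization of dX_t = -grad f(X_t) dt + sqrt(2) dB_t
    (a sketch of the scheme's structure, not a tuned implementation)."""
    x = np.array(x0, dtype=float)
    for _ in range(n_iter):
        u = rng.uniform()                                   # random midpoint in (0,1)
        b_mid = rng.normal(size=x.shape) * np.sqrt(u * step)
        b_end = b_mid + rng.normal(size=x.shape) * np.sqrt((1 - u) * step)
        x_mid = x - u * step * grad_f(x) + np.sqrt(2) * b_mid
        x = x - step * grad_f(x_mid) + np.sqrt(2) * b_end   # gradient at the midpoint
    return x

# Sample from N(0, I_2): f(x) = |x|^2 / 2, so grad f(x) = x.
rng = np.random.default_rng(2)
samples = np.array([rmp_lmc(lambda x: x, np.zeros(2), 0.1, 200, rng)
                    for _ in range(500)])
print(samples.mean(axis=0), samples.var(axis=0))  # approx. 0 and approx. 1
```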
Charlotte Dion-Blanc: Multiclass classification for Hawkes processes
Abstract: We investigate the multiclass classification problem where the features are event sequences. More precisely, the data are assumed to be generated by a mixture of simple linear Hawkes processes. In this new setting, the classes are discriminated by various triggering kernels. A challenge is then to build an efficient classification procedure. We derive the optimal Bayes rule and provide a two-step estimation procedure of the Bayes classifier. In the first step, the weights of the mixture are estimated; in the second step, an empirical risk minimization procedure is performed to estimate the parameters of the Hawkes processes. We establish the consistency of the resulting procedure and derive rates of convergence. Then, we tackle the case of multivariate Hawkes processes. The challenge here is the high dimension of the classification problem, which can be addressed using a LASSO-type step in the procedure.
Joint work with Christophe Denis and Laure Sansonnet.
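To make the data-generating model concrete, the sketch below simulates a univariate linear Hawkes process with an exponential triggering kernel via Ogata's thinning; the parameter values and the exponential kernel are illustrative assumptions, not the classes used in the talk.

```python
import numpy as np

def simulate_hawkes(mu, alpha, beta, horizon, rng):
    """Univariate linear Hawkes process with intensity
    lambda(t) = mu + sum_{t_i < t} alpha * exp(-beta (t - t_i)),
    simulated by Ogata's thinning (valid since the intensity decreases
    between events for this kernel)."""
    events, t = [], 0.0
    while t < horizon:
        lam_bar = mu + sum(alpha * np.exp(-beta * (t - s)) for s in events)
        t += rng.exponential(1.0 / lam_bar)            # candidate point
        lam_t = mu + sum(alpha * np.exp(-beta * (t - s)) for s in events)
        if rng.uniform() <= lam_t / lam_bar and t < horizon:
            events.append(t)                           # accept with prob lam/lam_bar
    return np.array(events)

rng = np.random.default_rng(3)
# Two classes discriminated by their triggering kernels (alpha, beta).
seq_a = simulate_hawkes(mu=1.0, alpha=0.5, beta=1.0, horizon=50.0, rng=rng)
seq_b = simulate_hawkes(mu=1.0, alpha=0.2, beta=2.0, horizon=50.0, rng=rng)
print(len(seq_a), len(seq_b))
```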
Gonçalo Dos Reis: New results on the simulation of mean field equations: super-measure growth and the non-Markovian Euler schemes
Abstract: In this talk, we review the state of the art and cover recent developments in the simulation of mean-field equations of McKean-Vlasov type. In the first part, we discuss mean-field diffusions under super-linear growth assumptions on the equation's coefficients. This class of equations appears ubiquitously in interacting-particle system modelling. We discuss the phenomenon of particle corruption in the simulations and illustrate our findings with a range of examples. In the second part of the talk, we discuss a recent method dubbed the 'non-Markovian Euler scheme' that, although an Euler-type scheme with standard weak error rate 1, is able to attain a weak convergence rate of 2 for the invariant distribution.
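For orientation, here is the standard (Markovian) interacting-particle Euler scheme for a McKean-Vlasov equation, sketched on a mean-field Ornstein-Uhlenbeck example with globally Lipschitz drift; the super-linear-growth regime and the non-Markovian scheme from the talk require the modifications discussed there, and all names and parameters are illustrative.

```python
import numpy as np

def particle_euler(b, sigma, x0, n_particles, step, n_steps, rng):
    """Standard Euler scheme for the particle system approximating
    dX_t = b(X_t, Law(X_t)) dt + sigma dB_t, with the law replaced by
    the empirical measure of the particle cloud."""
    x = np.full(n_particles, float(x0))
    for _ in range(n_steps):
        drift = b(x, x)          # each particle sees the whole empirical cloud
        x = x + step * drift + sigma * np.sqrt(step) * rng.normal(size=n_particles)
    return x

# Example: mean-field Ornstein-Uhlenbeck, b(x, cloud) = -(x - mean(cloud)).
rng = np.random.default_rng(4)
cloud = particle_euler(lambda x, c: -(x - c.mean()), sigma=1.0, x0=5.0,
                       n_particles=2000, step=0.01, n_steps=1000, rng=rng)
print(cloud.mean(), cloud.var())   # mean stays near 5, variance relaxes to 1/2
```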
Clement Hardy: Prediction and testing of mixtures of features issued from a continuous dictionary
Abstract: In this talk, we will consider observations that are random elements of a Hilbert space resulting from the sum of a deterministic signal and a noise. The signals considered will be linear combinations (or mixtures) of a finite number of features issued from continuous parametric dictionaries.
In order to estimate the linear coefficients as well as the non-linear parameters of a mixture in the presence of noise, we propose estimators that are solutions to an optimization problem. We shall quantify the performance of these estimators with respect to the quality of the observations by establishing prediction and estimation bounds that hold with high probability. In practice, it is common to have a set of observations (possibly a continuum) sharing common features. The question arises whether the estimation of signals can be improved by taking advantage of their common structure. We give a framework in which this improvement occurs.
Next, we shall test whether a noisy observation is derived from a given signal and give nonasymptotic upper bounds for the associated testing risk. In particular, our test encompasses the signal detection framework. We will derive an upper bound for the strength that a signal must have in order to be detected in the presence of noise.
This presentation is based on joint work with C. Butucea, J.-F. Delmas and A. Dutfoy.
Johannes Moritz Jirak: Weak dependence and optimal quantitative self-normalized central limit theorems
Abstract: Motivated by high-dimensional problems, we revisit estimation of the long-run variance, subject to dependence structures. More precisely, consider a stationary, weakly dependent sequence of random variables. Given that a CLT holds, how should we estimate the long-run variance? This problem has been studied for decades, and prominent solutions were proposed, for instance, by Andrews (1991) or Newey and West (1994). Using the proximity of the corresponding normal distribution as quality measure, we discuss optimal solutions and why previous proposals are not optimal in this context.
The setup contains many prominent dynamical systems and time series models, including random walks on the general linear group, products of positive random matrices, functionals of GARCH models of any order, functionals of dynamical systems arising from SDEs, iterated random functions and many more.
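For reference, the Bartlett-kernel (Newey-West) estimator mentioned above fits in a few lines; the bandwidth is the delicate tuning parameter whose optimality the talk addresses, and the AR(1) test case below is purely illustrative.

```python
import numpy as np

def newey_west_lrv(x, bandwidth):
    """Bartlett-kernel (Newey-West) estimator of the long-run variance
    sigma^2 = sum_k Cov(X_0, X_k) of a stationary scalar sequence."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    lrv = np.mean(x * x)                          # lag-0 autocovariance
    for k in range(1, bandwidth + 1):
        gamma_k = np.mean(x[k:] * x[:-k]) if k < n else 0.0
        lrv += 2.0 * (1.0 - k / (bandwidth + 1)) * gamma_k   # Bartlett weight
    return lrv

# AR(1) example: X_t = phi X_{t-1} + eps_t has long-run variance 1/(1-phi)^2.
rng = np.random.default_rng(5)
phi, n = 0.5, 100_000
eps = rng.normal(size=n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + eps[t]
print(newey_west_lrv(x, bandwidth=50), 1.0 / (1.0 - phi) ** 2)
```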
Christophe Ley: Advances in statistics via tools from Stein's Method
Abstract: Stein's Method is becoming increasingly popular in statistics and machine learning. In this talk, I will describe how various components of Stein's Method, a well-known approach in probability theory for approximation problems, have recently been put to successful use in theoretical and computational statistics.
Yingying Li: Estimating Efficient Frontier with All Risky Assets
Abstract: We propose a method to estimate the efficient frontier with all risky assets under a high-dimensional setting. The method utilizes linear constrained LASSO based on an equivalent constrained regression representation of the mean-variance optimization. Under a mild sparsity assumption, we show that our estimator asymptotically achieves mean-variance efficiency. Extensive simulation and empirical studies are conducted to examine the performance of our proposed estimator.
Based on joint work with Leheng Chen and Xinghua Zheng.
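As a rough illustration of the kind of program involved (not the paper's constrained-regression estimator), the sketch below computes one point of a sparse mean-variance frontier with an $\ell_1$ penalty, using the cvxpy modelling library (an added dependency, not mentioned in the abstract); the penalty level, target return and ridge stabilisation are illustrative assumptions.

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(6)
n, p = 250, 100                       # fewer observations than one would like
returns = rng.normal(0.001, 0.02, size=(n, p))
mu_hat = returns.mean(axis=0)
sigma_hat = np.cov(returns, rowvar=False) + 1e-4 * np.eye(p)  # ridge for stability

w = cp.Variable(p)
target = np.quantile(mu_hat, 0.9)     # one target return on the frontier
lam = 1e-3                            # sparsity level
problem = cp.Problem(
    cp.Minimize(cp.quad_form(w, sigma_hat) + lam * cp.norm1(w)),
    [cp.sum(w) == 1, w @ mu_hat == target],   # fully invested, target return
)
problem.solve()
print("active positions:", int(np.sum(np.abs(w.value) > 1e-6)))
```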
Zeng Li: Robust estimation of number of factors in high dimensional factor modeling via Spearman's rank correlation matrix
Abstract: Determining the number of factors in high-dimensional factor modeling is essential but challenging, especially when the data are heavy-tailed. In this paper, we introduce a new estimator based on the spectral properties of Spearman's rank correlation matrix under the high-dimensional setting, where both dimension and sample size tend to infinity proportionally. Our estimator is applicable for scenarios where either the common factors or idiosyncratic errors follow heavy-tailed distributions. We prove that the proposed estimator is consistent under mild conditions. Numerical experiments also demonstrate the superiority of our estimator compared to existing methods, especially in the heavy-tailed case.
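To make the objects concrete, the following sketch builds a heavy-tailed factor model, forms the Spearman rank correlation matrix with scipy, and applies a generic eigenvalue-ratio recipe; this recipe stands in for, and is not, the estimator proposed in the talk, and all sizes are illustrative.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(7)
n, p, k = 500, 200, 3
factors = rng.standard_t(df=3, size=(n, k))          # heavy-tailed common factors
loadings = rng.normal(size=(k, p))
noise = rng.standard_t(df=3, size=(n, p))            # heavy-tailed idiosyncratic errors
x = factors @ loadings + noise

rho, _ = spearmanr(x)                 # p x p Spearman rank correlation matrix
eigvals = np.sort(np.linalg.eigvalsh(rho))[::-1]
ratios = eigvals[:-1] / eigvals[1:]   # spiked eigenvalues stand out in the ratios
print("estimated number of factors:", int(np.argmax(ratios[:20]) + 1))
```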
Dmytro Marushkevych: Parametric statistical inference for high-dimensional diffusions
Abstract: This talk is dedicated to the problem of parametric estimation in the diffusion setting and mostly concentrates on properties of the Lasso estimator of the drift component. More specifically, we consider a multivariate parametric diffusion model X observed continuously over the interval [0,T] and investigate drift estimation under sparsity constraints. We allow the dimensions of the model and the parameter space to be large. We obtain an oracle inequality for the Lasso estimator and derive an error bound for the $L_2$-distance using concentration inequalities for linear functionals of diffusion processes. The probabilistic part is based upon elements of empirical process theory and, in particular, on the chaining method. Some alternative estimation procedures, such as the adaptive and relaxed Lasso, will also be discussed to give a perspective on improving the obtained results.
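A toy version of the regression structure behind such drift estimators, with continuous observation replaced by a high-frequency Euler discretization and the Lasso from scikit-learn (an added dependency); the linear drift model, penalty level and discretization are illustrative assumptions, not the talk's continuous-time estimator.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(8)
d, n, dt = 10, 20_000, 0.01
a_true = np.diag(-np.ones(d)) + np.diag(0.5 * np.ones(d - 1), k=1)  # sparse drift matrix

x = np.zeros((n, d))
for t in range(1, n):                 # Euler scheme for dX = A X dt + dW
    x[t] = x[t - 1] + dt * a_true @ x[t - 1] + np.sqrt(dt) * rng.normal(size=d)

# Regress the rescaled increments on the state, one coordinate at a time.
design, response = x[:-1], np.diff(x, axis=0) / dt
a_hat = np.vstack([
    Lasso(alpha=0.05, fit_intercept=False).fit(design, response[:, j]).coef_
    for j in range(d)
])
print("nonzeros: true", np.sum(a_true != 0), "estimated", np.sum(np.abs(a_hat) > 1e-3))
```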
Felix Parraud: Free stochastic calculus and Random Matrix Theory
Abstract: Recently we developed a method to compute asymptotic expansions of certain quantities coming from Random Matrix Theory. One can then use those results to study the spectral properties of polynomials in random matrices. This method relies notably on free stochastic calculus. In this talk I shall introduce basic notions of this theory and show how it naturally appears when studying random matrix stochastic processes in large dimension. I will then explain how to apply that theory to Random Matrix Theory.
Giovanni Peccati: Quantitative CLTs in deep neural networks and coupling of Gaussian fields
Abstract: Fully connected random neural networks are fascinating examples of random fields, obtained by hierarchically juxtaposing layers of computational units - sometimes referred to as neurons. Since the pioneering work of Neal (1996), it is known that neural networks exhibit Gaussian behavior in the so-called "large-width limit", that is, when the sizes of the layers simultaneously diverge to infinity. One crucial question - which has been relatively little explored in the literature - is how to measure the distance between the distribution of a fixed neural network and its Gaussian counterpart. In this talk, I will explain how one can obtain probabilistic bounds on such a discrepancy - featuring an algebraic dependence on the network's width - by exploiting (a) Stein's method (in the case of finite-dimensional approximations), and (b) some estimates on the optimal coupling of Gaussian fields (in the case of functional approximations).
Based on joint work with S. Favaro, B. Hanin, D. Marinucci, and I. Nourdin.
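The Gaussian large-width behaviour is easy to see numerically. The sketch below measures the distance to normality of the scalar output of a one-hidden-layer random ReLU network as the width grows; the architecture and the Kolmogorov-Smirnov metric are illustrative choices, not the bounds from the talk.

```python
import numpy as np
from scipy import stats

def random_net_output(width, x, rng):
    """One-hidden-layer random ReLU network f(x) = v . relu(W x) / sqrt(width)."""
    w = rng.normal(size=(width, x.size))
    v = rng.normal(size=width)
    return v @ np.maximum(w @ x, 0.0) / np.sqrt(width)

rng = np.random.default_rng(9)
x = np.array([1.0, -0.5])
for width in [4, 64, 1024]:
    outs = np.array([random_net_output(width, x, rng) for _ in range(2000)])
    # Normality improves with width (KS distance to the standardized Gaussian).
    ks = stats.kstest((outs - outs.mean()) / outs.std(), "norm").statistic
    print(f"width {width:5d}: KS distance {ks:.3f}")
```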
Vincent Rivoirard: Bayesian nonparametric inference for nonlinear Hawkes processes
Abstract: Hawkes processes are a specific class of point processes modeling the probability of occurrences of an event depending on past occurrences. Hawkes processes are therefore naturally used when one is interested in graphs for which the temporal dimension is essential. In the linear framework, the statistical inference of Hawkes processes is now well understood. We will therefore focus more specifically on the class of nonlinear multivariate Hawkes processes, which make it possible to model both excitation and inhibition phenomena between nodes of a graph. We will present the Bayesian nonparametric estimation of the parameters of the Hawkes model and the posterior contraction rates obtained on Hölder classes. From the practical point of view, since simulating posterior distributions is often out of reach in reasonable time, especially in the multivariate framework, we will more specifically use the variational Bayesian approach, which provides a direct and fast computation of an approximation of the posterior distributions, allowing the analysis in reasonable time of graphs containing several tens of neurons.
Joint work with Déborah Sulem and Judith Rousseau.
Judith Rousseau: Semi-parametric inference: A Bayesian curse?
Abstract: In this talk I will discuss some issues around Bayesian approaches in semiparametric inference. I will first recall some positive and negative results on Bernstein von Mises theorems in non- and semi-parametric models. I will then propose two possible tricks to derive posterior-type distributions in semiparametric models which allow both for efficient procedures and Bernstein von Mises theorems, as well as flexible priors on the nonparametric part. The first approach, based on the cut posterior, will be illustrated in semi-parametric mixture and Hidden Markov models; the second, a targeted posterior, will be applied to the well-known causal inference problem of average treatment effect estimation.
This talk is built on joint works with Edwin Fong, Chris Holmes, Dan Moss and Andrew Yiu.
Claudia Strauch: Change point estimation for a stochastic heat equation
Abstract: We study a change point model based on a stochastic partial differential equation (SPDE) corresponding to the heat equation governed by the weighted Laplacian $\Delta_\vartheta = \nabla\vartheta\nabla$, where $\vartheta=\vartheta(x)$ is a space-dependent diffusivity, on the domain (0,1) with Dirichlet boundary conditions. Based on local measurements of the solution in space with resolution $\delta$ over a finite time horizon, we develop a simultaneous M-estimator for the diffusivity parameters $\theta_\pm$ and the change point $\tau$ characterizing the piecewise constant diffusivity $\vartheta$. We work in the general setting where the parameters $\theta_\pm$ are allowed to vary with the resolution $\delta$. The change point estimator converges at rate $\delta$, while the diffusivity constants can be recovered with convergence rate $\delta^{3/2}$. Moreover, when the diffusivity parameters are known and the jump height vanishes as the spatial resolution tends to zero, we derive a limit theorem for the change point estimator and identify the limiting distribution as one familiar from the change point literature. For the mathematical analysis, a precise understanding of the SPDE with discontinuous $\vartheta$, tight concentration bounds for quadratic functionals of the solution, and a generalisation of classical M-estimators are developed.
Based on joint work with Markus Reiss and Lukas Trottner.
Martin Wahl: A kernel-based analysis of Laplacian eigenmaps
Abstract: Laplacian eigenmaps and diffusion maps are nonlinear dimensionality reduction methods that use the eigenvalues and eigenvectors of normalized graph Laplacians. From a mathematical perspective, the main problem is to understand these empirical Laplacians as spectral approximations of the underlying Laplace-Beltrami operator. In this talk, we study Laplacian eigenmaps through the lens of kernel PCA. This leads to novel points of view and allows us to leverage results for empirical covariance operators in infinite dimensions.
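For concreteness, a minimal implementation of Laplacian eigenmaps with a Gaussian kernel and the symmetrically normalized graph Laplacian; the kernel bandwidth and the circle example are illustrative choices, and the kernel-PCA analysis of the talk is not reproduced.

```python
import numpy as np

def laplacian_eigenmaps(x, n_components, eps):
    """Embed points via the low eigenvectors of the normalized graph Laplacian
    L = I - D^{-1/2} K D^{-1/2} built from a Gaussian kernel of width eps."""
    sq_dists = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
    k = np.exp(-sq_dists / eps)                     # kernel / adjacency matrix
    d_inv_sqrt = 1.0 / np.sqrt(k.sum(axis=1))
    lap = np.eye(len(x)) - d_inv_sqrt[:, None] * k * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(lap)
    return vecs[:, 1:n_components + 1]              # skip the trivial eigenvector

# Noisy points on a circle: the embedding recovers the circular structure.
rng = np.random.default_rng(10)
angles = np.sort(rng.uniform(0, 2 * np.pi, size=400))
points = np.c_[np.cos(angles), np.sin(angles)] + 0.01 * rng.normal(size=(400, 2))
emb = laplacian_eigenmaps(points, n_components=2, eps=0.1)
print(emb.shape)
```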
Qinwen Wang: Asymptotics of robust estimators of scatter in high dimension
Abstract: In this talk, we will investigate the limiting spectral properties of two robust estimators of scatter, the sample spatial-sign covariance matrix and Tyler's M estimator, in high-dimensional scenarios. The populations under study are general enough to include the independent components model and the family of elliptical distributions. These may come with known or unknown location vectors. Both the empirical spectral distributions and the central limit theorems for a class of linear spectral statistics of the two matrix ensembles are studied.
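The first of the two estimators is straightforward to write down; here is a sketch of the sample spatial-sign covariance matrix on a heavy-tailed elliptical sample (Tyler's M estimator would instead solve a fixed-point equation, not shown). All parameter choices are illustrative.

```python
import numpy as np

def spatial_sign_covariance(x, location=None):
    """Sample spatial-sign covariance matrix: the average of u_i u_i' where
    u_i = (x_i - m) / |x_i - m| are the observations projected onto the unit sphere."""
    m = np.zeros(x.shape[1]) if location is None else location
    centered = x - m
    u = centered / np.linalg.norm(centered, axis=1, keepdims=True)
    return u.T @ u / len(x)

# Heavy-tailed elliptical sample (a t-distributed scale mixture of Gaussians):
# the sign covariance remains well behaved despite the infinite variance.
rng = np.random.default_rng(11)
p, n = 100, 500
x = rng.standard_t(df=2, size=(n, 1)) * rng.normal(size=(n, p))
s = spatial_sign_covariance(x)
print(np.sort(np.linalg.eigvalsh(s))[-3:])   # top eigenvalues, all of order 1/p
```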
Jeff Jianfeng Yao: Limiting distributions for eigenvalues of sample correlation matrices from heavy-tailed populations
Abstract: Consider a $p$-dimensional population $x\in \mathbb{R}^p$ with iid coordinates that are regularly varying with index $\alpha\in (0,2)$. Since the variance of $x$ is infinite, the diagonal elements of the sample covariance matrix $S_n=n^{-1}\sum_{i=1}^n {x_i}x'_i$ based on a sample $x_1,\ldots, x_n$ from the population tend to infinity as $n$ increases, and it is of interest to use instead the sample correlation matrix $R_n= \{\mathrm{diag}(S_n)\}^{-1/2}\, S_n\{\mathrm{diag}(S_n)\}^{-1/2}$. This paper finds the limiting distributions of the eigenvalues of $R_n$ when both the dimension $p$ and the sample size $n$ grow to infinity such that $p/n\to \gamma \in (0,\infty)$. The family of limiting distributions $\{H_{\alpha,\gamma}\}$ is new and depends on the two parameters $\alpha$ and $\gamma$. The moments of $H_{\alpha,\gamma}$ are fully identified as the sum of two contributions: the first from the classical Mar\v{c}enko-Pastur law and the second due to heavy tails. Moreover, the family $\{H_{\alpha,\gamma}\}$ has continuous extensions at the boundaries $\alpha=2$ and $\alpha=0$ leading to the Mar\v{c}enko-Pastur law and a modified Poisson distribution, respectively. Our proofs use the method of moments, a path-shortening algorithm and some novel graph counting combinatorics.
This is a joint work with Johannes Heiny (Stockholm University).
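A quick numerical companion to this setting: eigenvalues of the sample correlation matrix $R_n$ built from symmetrized Pareto entries with tail index $\alpha \in (0,2)$. The limiting family $H_{\alpha,\gamma}$ itself is described in the paper; the snippet only produces the empirical spectrum, and the sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(12)
n, p, alpha = 2000, 1000, 1.5                      # p/n -> gamma = 0.5
# iid symmetrized Pareto entries, regularly varying with index alpha in (0, 2)
x = rng.pareto(alpha, size=(n, p)) * rng.choice([-1.0, 1.0], size=(n, p))

s = x.T @ x / n                                     # sample covariance S_n
d_inv_sqrt = 1.0 / np.sqrt(np.diag(s))
r = d_inv_sqrt[:, None] * s * d_inv_sqrt[None, :]   # sample correlation R_n

eigs = np.linalg.eigvalsh(r)
print("range of the spectrum:", eigs.min(), eigs.max())
print("mean eigenvalue (always 1, since trace R_n = p):", eigs.mean())
```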
Wangjun Yuan: On spectrum of sample covariance matrices from large tensor vectors
Abstract: In this talk, we study the limiting spectral distribution of sums of independent rank-one large $k$-fold tensor products of large $n$-dimensional vectors. In the literature, the limiting moment sequence has been obtained for the cases $k = o(n)$ and $k = O(n)$. Under appropriate moment conditions on the base vectors, it has been shown that the empirical eigenvalue distribution converges to the celebrated Marčenko-Pastur law if $k = O(n)$ and the components of the base vectors have unit modulus, or if $k = o(n)$. Here we study the limiting spectral distribution by allowing $k$ to grow much faster, whenever the components of the base vectors are complex random variables on the unit circle. It turns out that the limiting spectral distribution is again the Marčenko-Pastur law. Compared with the existing results, our setting only requires $k \to \infty$. Our approach is based on the moment method.
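To make the model concrete, the sketch below forms the sample covariance matrix of $k$-fold tensor powers of vectors with unit-modulus entries and compares the empirical spectrum with the Marčenko-Pastur edges; the sizes $n$, $k$ and the number of samples are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(13)
n, k, m = 2, 9, 1024                 # tensor dimension n^k = 512, aspect ratio 0.5
dim = n ** k

def tensor_vector():
    """k-fold tensor power of a random vector with unit-modulus entries."""
    u = np.exp(2j * np.pi * rng.uniform(size=n)) / np.sqrt(n)   # |u| = 1
    y = u
    for _ in range(k - 1):
        y = np.kron(y, u)
    return np.sqrt(dim) * y          # rescale so entries have modulus (and variance) 1

z = np.column_stack([tensor_vector() for _ in range(m)])
s = z @ z.conj().T / m               # sample covariance of the tensor vectors
eigs = np.linalg.eigvalsh(s)

ratio = dim / m                      # Marchenko-Pastur edges for this ratio
print("empirical range:", eigs.min(), eigs.max())
print("MP edges:", (1 - np.sqrt(ratio)) ** 2, (1 + np.sqrt(ratio)) ** 2)
```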
Nikita Zhivotovskiy: Mean and Covariance Matrix Estimation for Anisotropic Distributions in the Presence of Outliers
Abstract: Suppose we are observing a sample of independent random vectors, knowing that the original distribution was contaminated, so that a fraction of observations came from a different distribution. How can we estimate the mean and the covariance matrix of the original distribution in this case? In this talk, we discuss some recent estimators that achieve the optimal nonasymptotic, dimension-free rate of convergence under the model where the adversary can corrupt a fraction of the samples arbitrarily. The discussion will cover a range of distributions including Gaussian, sub-Gaussian, and heavy-tailed distributions.