This is an announcement for the paper "How close is the sample covariance matrix to the actual covariance matrix?" by Roman Vershynin.
Abstract: Given a distribution in R^n, a classical estimator of its covariance matrix is the sample covariance matrix obtained from a sample of N independent points. What is the optimal sample size N = N(n) that guarantees estimation with a fixed accuracy in the operator norm? Suppose the distribution is supported in a centered Euclidean ball of radius \sqrt{n}. We conjecture that the optimal sample size is N = O(n) for all distributions with finite fourth moment, and we prove this up to an iterated logarithmic factor. This problem is motivated by the optimal theorem of Rudelson which states that N = O(n \log n) for distributions with finite second moment, and a recent result of Adamczak, Litvak, Pajor and Tomczak-Jaegermann which guarantees that N = O(n) for sub-exponential distributions.
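Illustration (not from the paper): the question in the abstract can be explored numerically with a minimal sketch. It assumes one particular distribution, the uniform distribution on the sphere of radius \sqrt{n}, which is supported in the centered ball of radius \sqrt{n} and has identity covariance. The function name sample_covariance_error and the parameter choices are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def sample_covariance_error(n, N, rng):
    """Operator-norm error of the sample covariance for one assumed test case.

    Assumption: points are uniform on the sphere of radius sqrt(n) in R^n,
    so the distribution is centered and its true covariance is the identity.
    """
    # Normalizing Gaussian vectors gives uniformly distributed directions.
    g = rng.standard_normal((N, n))
    x = np.sqrt(n) * g / np.linalg.norm(g, axis=1, keepdims=True)

    # Sample covariance of a centered distribution: (1/N) * sum_i x_i x_i^T.
    sample_cov = x.T @ x / N

    # ord=2 on a matrix is the spectral (operator) norm.
    return np.linalg.norm(sample_cov - np.eye(n), ord=2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 20
    for N in (2 * n, 10 * n, 100 * n):
        print(N, sample_covariance_error(n, N, rng))
```

Running this for several ratios N/n shows how the error behaves as the sample size grows proportionally to the dimension. A single well-behaved distribution says nothing about the heavy-tailed cases the conjecture concerns.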
Archive classification: math.PR math.FA math.ST stat.TH
Mathematics Subject Classification: 60H12, 60B20, 46B09
Remarks: 34 pages
Submitted from: romanv@umich.edu
The paper may be downloaded from the archive by web browser from URL
http://front.math.ucdavis.edu/1004.3484
or