Statistical analysis of multinomial data in complex datasets often requires estimation of the multivariate normal (MVN) distribution for models in which the dimensionality can easily reach 10–1000 and higher. Few algorithms for estimating the MVN distribution can offer robust and efficient performance over such a range of dimensions. We report a simulation-based comparison of two algorithms for the MVN that are widely used in statistical genetic applications. The venerable Mendell-Elston approximation is fast, but its execution time increases rapidly with the number of dimensions, its estimates are generally biased, and an error bound is lacking. The correlation between variables significantly affects absolute error but not overall execution time. The Monte Carlo-based approach described by Genz returns unbiased and error-bounded estimates, but its execution time is more sensitive to the correlation between variables. For ultra-high-dimensional problems, however, the Genz algorithm exhibits better scale characteristics and greater time-weighted efficiency of estimation.
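To make the Monte Carlo idea concrete, the following is a minimal sketch of a separation-of-variables estimator in the spirit of the approach described by Genz. It is not the implementation evaluated in the paper: the function name `genz_mvn_prob` and its parameters are hypothetical, it assumes a zero-mean MVN with a positive-definite covariance matrix, and it uses plain pseudo-random sampling without variable reordering. It returns both an estimate and a standard error, which illustrates why this family of methods can report an error bound.

```python
import numpy as np
from scipy.stats import norm


def genz_mvn_prob(lower, upper, cov, n_samples=20000, seed=0):
    """Estimate P(lower < X < upper) for X ~ N(0, cov).

    Hypothetical helper: sequential conditioning on a Cholesky factor,
    with uniform draws mapped into each conditional interval.
    Returns (estimate, standard_error).
    """
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    chol = np.linalg.cholesky(np.asarray(cov, dtype=float))  # lower triangular
    dim = lower.size
    rng = np.random.default_rng(seed)
    w = rng.random((n_samples, max(dim - 1, 0)))  # uniform draws
    y = np.zeros((n_samples, dim))                # transformed normal variates

    # First coordinate: its limits do not depend on any sampled value.
    d = np.full(n_samples, norm.cdf(lower[0] / chol[0, 0]))
    e = np.full(n_samples, norm.cdf(upper[0] / chol[0, 0]))
    f = e - d

    for i in range(1, dim):
        # Sample coordinate i-1 inside its conditional interval [d, e].
        u = np.clip(d + w[:, i - 1] * (e - d), 1e-12, 1.0 - 1e-12)
        y[:, i - 1] = norm.ppf(u)
        # Shift the limits of coordinate i by the already sampled coordinates.
        shift = y[:, :i] @ chol[i, :i]
        d = norm.cdf((lower[i] - shift) / chol[i, i])
        e = norm.cdf((upper[i] - shift) / chol[i, i])
        f = f * (e - d)

    return f.mean(), f.std(ddof=1) / np.sqrt(n_samples)


if __name__ == "__main__":
    # Bivariate orthant probability with correlation 0.5.
    cov = np.array([[1.0, 0.5], [0.5, 1.0]])
    p, se = genz_mvn_prob([-np.inf, -np.inf], [0.0, 0.0], cov)
    print(f"estimate = {p:.4f}, standard error = {se:.4f}")
```

For the bivariate case in the usage example, the exact orthant probability is 1/4 + arcsin(0.5)/(2π) = 1/3, so the estimate can be checked against a known value.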
Blondell, L.; Koz, M.Z.; Blangero, J.; Göring, H.H.H. Genz and Mendell-Elston Estimation of the High-Dimensional Multivariate Normal Distribution. Algorithms 2021, 14, 296. https://doi.org/10.3390/a14100296
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
Office of Human Genetics