Leichter, Carl Stuart
Principal Component Analysis (PCA) is one of the most popular and widely used techniques in data processing and analysis. In Magnetoencephalographic (MEG) and Electroencephalographic (EEG) analysis, PCA is routinely used as a lossy data reduction technique. Usually one or more principal components containing significant quantities of data variance/information are removed; the remaining components are then used to reconstruct an approximation of the original data. It is assumed that the variance/information discarded with the removed components mainly represents exogenous nuisance sources which have obscured the signals of interest, such as when emissions from mains power lines obscure the electromagnetic emissions from a test subject’s brain. This convention also assumes that the primary risk of such reductions is the loss of some variance/information from the signals of interest, and that this loss is an acceptable cost for significantly improving the overall signal-to-noise ratio (SNR). Within this convention, we postulated that removing the first few principal components, then reconstructing the data, should significantly reduce the impact of distant nuisance sources, while incurring only very minor variance/information loss with respect to the signals of interest. Initial empirical tests of this lossy reconstruction strategy indicated significant improvements in the SNR. However, a careful review of the results also revealed unusual artifacts in the reconstructed data, and we coined the term “eigenspecter” to refer to these artifacts. To understand the nature of the eigenspecter artifacts, we derive a mathematical model of spatio-temporal source signal misallocation when data are reconstructed/approximated from lossy PCA reductions. This new model satisfactorily accounts for the eigenspecters we have discovered.
The model also predicts that the presence of eigenspecters will result in the Misallocation of Signal Power Spectral Density (MSPSD), which can be used to detect the presence of eigenspecters in an empirical context. Using MSPSD in this manner, we were able to empirically validate our model. We successfully used MSPSD to automatically identify eigenspecter candidates in lossy PCA reconstructions/approximations of phantom MEG data. We then used MSPSD to identify eigenspecter candidates in the lossy PCA reconstructions/approximations of EEG data from a Visual Event Related Potential experiment. In summary: we present an analytical model which predicts that eigenspecters are an inherent risk whenever any data are reconstructed/approximated from lossy PCA reductions; we then validate the model in two separate empirical tests.
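As an illustration only, the lossy reconstruction strategy described in this abstract (removing the first few principal components, then reconstructing the data) might be sketched as follows. The sketch assumes a channels-by-samples data matrix and computes the principal components with a singular value decomposition in NumPy. The names `lossy_pca_reconstruct`, `n_remove`, and `make_synthetic_data` are hypothetical and are not taken from the thesis, and the synthetic data are invented purely for demonstration. The sketch does not implement the eigenspecter model or MSPSD.

```python
import numpy as np


def lossy_pca_reconstruct(data, n_remove):
    """Drop the first `n_remove` principal components and rebuild the data.

    Assumption: `data` has shape (channels, samples). Hypothetical helper,
    not the thesis author's implementation.
    """
    # Center each channel over time so the SVD yields principal components.
    mean = data.mean(axis=1, keepdims=True)
    centered = data - mean

    # Thin SVD: centered = U @ diag(s) @ Vt, singular values in descending order.
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)

    # Zero the leading (largest-variance) components, keep the rest.
    s_kept = s.copy()
    s_kept[:n_remove] = 0.0

    # Reconstruct an approximation of the data and restore the channel means.
    return (U * s_kept) @ Vt + mean


def make_synthetic_data(n_channels=8, n_samples=1000, seed=0):
    """Invented test data: a strong 50 Hz nuisance on every channel,
    a weak 10 Hz signal on channel 0, and small Gaussian noise."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n_samples, endpoint=False)
    gains = np.linspace(0.5, 1.5, n_channels)[:, None]
    data = 10.0 * gains * np.sin(2 * np.pi * 50 * t)[None, :]
    data[0] += np.sin(2 * np.pi * 10 * t)
    data += 0.1 * rng.standard_normal((n_channels, n_samples))
    return data


data = make_synthetic_data()
reconstructed = lossy_pca_reconstruct(data, n_remove=1)
```

Here `reconstructed` has the same shape as `data` but lacks the variance carried by the first principal component. Whether the removed component contains only nuisance energy is exactly the assumption that the thesis calls into question.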
Advisor: Purvis, Martin; Deng, Jeremiah; Frantz, Liz; Paulin, Mike
Degree Name: Doctor of Philosophy
Degree Discipline: Information Science
Publisher: University of Otago
Keywords: Principal; Component; Analysis; PCA; Eigenspecter; Eigenspecters; Singular; Value; Decomposition; SVD; Magnetoencephalographic; MEG; Electroencephalographic; EEG
Research Type: Thesis