Abstract
Principal component regression (PCR) is a popular technique in data analysis and machine learning. However, the technique has two limitations. First, the principal components (PCs) with the largest variances may not be relevant to the outcome variables. Second, the lack of standard error estimates for the unstandardized regression coefficients makes the results hard to interpret. To address these two limitations, we propose a model-based approach comprising two mean and covariance structure models defined for multivariate PCR. Estimating these models yields inferential information that allows us to test the explanatory power of individual PCs and to compute standard error estimates for the unstandardized regression coefficients. A real-data example illustrates our approach, and simulation studies under normality and nonnormality conditions validate the standard error estimates for the unstandardized regression coefficients. Finally, future research topics are discussed.
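The first limitation can be illustrated with a minimal sketch of conventional PCR on synthetic data. This is not the model-based approach proposed here, and it produces no standard errors. The function name `pcr_fit`, the simulated data, and the restriction to a single outcome variable are assumptions made only for illustration. The outcome is constructed to depend solely on the lowest-variance predictor, so the PC with the largest variance should explain almost none of it.

```python
import numpy as np


def pcr_fit(X, y, n_components):
    """Regress a single outcome y on the first n_components PC scores of X.

    Hypothetical helper for illustration; plain PCR via SVD, no inference.
    """
    Xc = X - X.mean(axis=0)  # center the predictors
    yc = y - y.mean()        # center the outcome
    # Rows of Vt are the PC directions, ordered by decreasing variance.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T
    scores = Xc @ V          # PC scores (mutually orthogonal columns)
    gamma, *_ = np.linalg.lstsq(scores, yc, rcond=None)
    # Map PC coefficients back to the original, unstandardized predictors.
    beta = V @ gamma
    pc_var = s[:n_components] ** 2 / (len(y) - 1)
    # Orthogonal scores let the explained sum of squares split by PC.
    r2_per_pc = (gamma * s[:n_components]) ** 2 / (yc @ yc)
    return gamma, beta, pc_var, r2_per_pc


# Assumed synthetic data: three independent predictors with very different
# variances; the outcome depends only on the lowest-variance one.
rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 3)) * np.array([5.0, 2.0, 0.5])
y = 3.0 * X[:, 2] + rng.normal(scale=0.1, size=n)

gamma, beta, pc_var, r2_per_pc = pcr_fit(X, y, n_components=3)
print("PC variances:", np.round(pc_var, 2))
print("R^2 per PC:  ", np.round(r2_per_pc, 3))
print("Coefficients on original predictors:", np.round(beta, 3))
```

In this setup, retaining only the leading PCs by variance would discard the one component that carries nearly all of the explanatory power, which is why a test of each PC's contribution is useful.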