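The previous lesson covered principal component analysis,
a method that reduces observations on p variables (p dimensions) to m principal components (m dimensions).
In that sense factor analysis is a similar method.
The difference is that principal component analysis
characterizes the data by capturing how they are scattered,
whereas factor analysis, the topic of this lesson, imposes a (latent) structure on the variables and explores the relationships among them
(this may be a little hard to grasp at first).
Factor analysis is widely used in psychology.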
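As a sketch of what "latent structure" means here, the standard orthogonal factor model can be written as below. The symbols a_{jk}, f_k, and e_j are notation assumed for this illustration; they do not appear in the original notes.

```latex
x_j = a_{j1} f_1 + a_{j2} f_2 + \cdots + a_{jm} f_m + e_j,
\qquad j = 1, \dots, p, \quad m < p
```

Here x_j is the j-th standardized observed variable, f_1, ..., f_m are the unobserved common factors, a_{jk} is the loading of variable j on factor k, and e_j is the part of x_j that the common factors do not explain. The "Factor Pattern" tables in the SAS output are estimates of the loadings a_{jk}.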
/* Lesson 15-1 */
/* File Name = fact01.sas   10/16/97 */

data food;
  infile 'food.dat';              /* read the data file */
  input X01-X10;                  /* variable list, given as a range */
  label X01='M(-15)'              /* give each variable a readable label */
        X02='M(16-20)'            /*   M  : male */
        X03='M(21-30)'            /*   F  : female */
        X04='M(31-40)'            /*   () : age range */
        X05='M(41-)'
        X06='F(-15)'
        X07='F(16-20)'
        X08='F(21-30)'
        X09='F(31-40)'
        X10='F(41-)';

proc print data=food(obs=10);     /* display the first 10 observations */
run;

proc factor data=food;            /* factor analysis */
  var X01-X10;                    /* variables used in the analysis */
run;
The SAS System                        12:10 Thursday, October 16, 1997

OBS   X01   X02   X03   X04   X05   X06   X07   X08   X09   X10
  1  7.69  7.31  7.47  7.76  7.87  7.51  7.24  7.70  7.91  7.95
  2  6.59  5.56  6.21  6.04  5.81  6.64  6.11  6.53  6.44  6.64
  3  4.55  4.18  4.36  4.25  4.53  4.60  3.66  4.04  3.68  4.43
  4  6.78  6.11  6.30  5.98  5.56  6.37  6.29  5.43  5.32  5.28
  5  6.47  6.24  6.02  5.42  5.88  6.00  5.60  4.60  5.40  5.95
  6  6.96  6.81  6.91  6.48  6.23  7.09  7.27  7.13  6.86  7.36
  7  6.57  5.70  5.89  5.16  5.30  6.07  5.56  4.50  4.92  5.33
  8  7.32  6.95  6.02  4.98  4.88  6.82  6.40  5.53  5.61  5.33
  9  6.51  6.15  5.51  4.68  4.16  5.17  4.81  4.70  4.86  3.82
 10  6.86  6.05  5.85  6.14  6.75  6.71  5.39  5.42  6.03  6.59

Initial Factor Method: Principal Components

Prior Communality Estimates: ONE

Eigenvalues of the Correlation Matrix:  Total = 10  Average = 1

                    1         2         3         4         5
Eigenvalue     6.8280    1.7619    0.7545    0.2624    0.1216
Difference     5.0661    1.0074    0.4921    0.1408    0.0236
Proportion     0.6828    0.1762    0.0754    0.0262    0.0122
Cumulative     0.6828    0.8590    0.9344    0.9607    0.9728

                    6         7         8         9        10
Eigenvalue     0.0980    0.0721    0.0441    0.0358    0.0219
Difference     0.0259    0.0280    0.0083    0.0139
Proportion     0.0098    0.0072    0.0044    0.0036    0.0022
Cumulative     0.9826    0.9898    0.9942    0.9978    1.0000

2 factors will be retained by the MINEIGEN criterion.

Factor Pattern

        FACTOR1    FACTOR2
X01     0.74741   -0.59244    M(-15)
X02     0.86579   -0.31836    M(16-20)
X03     0.84491    0.22079    M(21-30)
X04     0.78216    0.47602    M(31-40)
X05     0.68129    0.67325    M(41-)
X06     0.80647   -0.54140    F(-15)
X07     0.89959   -0.33542    F(16-20)
X08     0.90901   -0.04289    F(21-30)
X09     0.90316    0.21817    F(31-40)
X10     0.79262    0.35477    F(41-)

Variance explained by each factor

 FACTOR1     FACTOR2
6.827955    1.761873

Final Communality Estimates: Total = 8.589828

     X01        X02        X03        X04        X05
0.909618   0.850950   0.762624   0.838371   0.917413

     X06        X07        X08        X09        X10
0.943520   0.921775   0.828147   0.863298   0.754112
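The communalities and the "variance explained" figures in this output can be recomputed from the Factor Pattern table. The following is a minimal Python sketch, not part of the SAS session: the loadings are copied from the listing above, and the function names are chosen here for illustration. A communality is the sum of squared loadings across one row, and the variance explained by a factor is the sum of squared loadings down one column.

```python
# Unrotated two-factor loadings copied from the Factor Pattern table (fact01.sas).
loadings = {
    "X01": (0.74741, -0.59244),
    "X02": (0.86579, -0.31836),
    "X03": (0.84491, 0.22079),
    "X04": (0.78216, 0.47602),
    "X05": (0.68129, 0.67325),
    "X06": (0.80647, -0.54140),
    "X07": (0.89959, -0.33542),
    "X08": (0.90901, -0.04289),
    "X09": (0.90316, 0.21817),
    "X10": (0.79262, 0.35477),
}


def communality(row):
    """Sum of squared loadings of one variable over the retained factors."""
    return sum(a * a for a in row)


def variance_explained(table, k):
    """Sum of squared loadings of factor k (0-based) over all variables."""
    return sum(row[k] ** 2 for row in table.values())


communalities = {name: communality(row) for name, row in loadings.items()}
total_communality = sum(communalities.values())

for name, h2 in communalities.items():
    print(f"{name}  communality = {h2:.4f}")
print(f"FACTOR1 variance explained = {variance_explained(loadings, 0):.4f}")
print(f"FACTOR2 variance explained = {variance_explained(loadings, 1):.4f}")
print(f"total communality          = {total_communality:.4f}")
```

Because the printed loadings are rounded to five decimals, the recomputed values agree with the SAS figures only to about four decimal places.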
/* Lesson 15-2 */
/* File Name = fact02.sas   10/16/97 */

data food;
  infile 'food.dat';
  input X01-X10;
  label X01='M(-15)'
        X02='M(16-20)'
        X03='M(21-30)'
        X04='M(31-40)'
        X05='M(41-)'
        X06='F(-15)'
        X07='F(16-20)'
        X08='F(21-30)'
        X09='F(31-40)'
        X10='F(41-)';

proc print data=food(obs=10);
run;

proc factor data=food nfactor=3 out=fscore;   /* 3 factors; save the output in fscore */
  var X01-X10;
run;

proc plot data=fscore;
  plot factor1*factor2/vref=0.0 href=0.0;     /* factor 1 vs factor 2, with reference axes at 0 */
  plot factor2*factor3/vref=0.0 href=0.0;     /* factor 2 vs factor 3, with reference axes at 0 */
run;
Initial Factor Method: Principal Components

3 factors will be retained by the NFACTOR criterion.

Factor Pattern

        FACTOR1    FACTOR2    FACTOR3
X01     0.74741   -0.59244    0.16808    M(-15)
X02     0.86579   -0.31836    0.29190    M(16-20)
X03     0.84491    0.22079    0.38417    M(21-30)
X04     0.78216    0.47602    0.32604    M(31-40)
X05     0.68129    0.67325    0.11067    M(41-)
X06     0.80647   -0.54140   -0.07270    F(-15)
X07     0.89959   -0.33542   -0.14888    F(16-20)
X08     0.90901   -0.04289   -0.25110    F(21-30)
X09     0.90316    0.21817   -0.27989    F(31-40)
X10     0.79262    0.35477   -0.45389    F(41-)

Variance explained by each factor

 FACTOR1     FACTOR2     FACTOR3
6.827955    1.761873    0.754451

Final Communality Estimates: Total = 9.344279

     X01        X02        X03        X04        X05
0.937870   0.936157   0.910210   0.944673   0.929662

     X06        X07        X08        X09        X10
0.948805   0.943939   0.891197   0.941637   0.960129

Scoring Coefficients Estimated by Regression

Squared Multiple Correlations of the Variables with each Factor

 FACTOR1     FACTOR2     FACTOR3
1.000000    1.000000    1.000000

Standardized Scoring Coefficients

        FACTOR1    FACTOR2    FACTOR3
X01     0.10946   -0.33626    0.22279    M(-15)
X02     0.12680   -0.18069    0.38691    M(16-20)
X03     0.12374    0.12531    0.50920    M(21-30)
X04     0.11455    0.27018    0.43215    M(31-40)
X05     0.09978    0.38212    0.14670    M(41-)
X06     0.11811   -0.30729   -0.09636    F(-15)
X07     0.13175   -0.19038   -0.19733    F(16-20)
X08     0.13313   -0.02434   -0.33282    F(21-30)
X09     0.13227    0.12383   -0.37099    F(31-40)
X10     0.11609    0.20136   -0.60162    F(41-)

[Plot of FACTOR1*FACTOR2. Legend: A = 1 obs, B = 2 obs, etc. Vertical axis FACTOR1 from -5 to 5, horizontal axis FACTOR2 from -3 to 2, reference lines at 0.]

[Plot of FACTOR2*FACTOR3. Legend: A = 1 obs, B = 2 obs, etc. Vertical axis FACTOR2 from -5.0 to 2.5, horizontal axis FACTOR3 from -2 to 4, reference lines at 0.]
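The two retention messages ("MINEIGEN" in the first run, "NFACTOR" in this one) can be related to the eigenvalue table of the first run. The sketch below is an illustration in Python, not SAS code; it assumes the MINEIGEN threshold equals the average eigenvalue of 1 reported in the output, and the function names are hypothetical.

```python
# Eigenvalues of the correlation matrix, copied from the fact01.sas output.
eigenvalues = [6.8280, 1.7619, 0.7545, 0.2624, 0.1216,
               0.0980, 0.0721, 0.0441, 0.0358, 0.0219]


def retained_by_mineigen(eigs, threshold=1.0):
    """Count the eigenvalues strictly above the threshold (assumed to be 1)."""
    return sum(1 for e in eigs if e > threshold)


def cumulative_proportions(eigs):
    """Running share of the total variance explained by the leading factors."""
    total = sum(eigs)
    running = 0.0
    shares = []
    for e in eigs:
        running += e
        shares.append(running / total)
    return shares


n_default = retained_by_mineigen(eigenvalues)
shares = cumulative_proportions(eigenvalues)

print("factors retained when eigenvalue > 1:", n_default)
print(f"cumulative proportion, 2 factors: {shares[1]:.4f}")
print(f"cumulative proportion, 3 factors: {shares[2]:.4f}")
print(f"sum of first 3 eigenvalues: {sum(eigenvalues[:3]):.4f}")
```

With nfactor=3 the third eigenvalue is added to the retained set, which is why the total of the final communality estimates rises from 8.589828 to 9.344279, the sum of the first three eigenvalues.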
/* Lesson 15-3 */
/* File Name = fact03.sas   10/16/97 */

data food;
  infile 'food.dat';
  input X01-X10;
  label X01='M(-15)'
        X02='M(16-20)'
        X03='M(21-30)'
        X04='M(31-40)'
        X05='M(41-)'
        X06='F(-15)'
        X07='F(16-20)'
        X08='F(21-30)'
        X09='F(31-40)'
        X10='F(41-)';

proc print data=food(obs=10);
run;

proc factor data=food nfactor=3 rotate=varimax out=fscore2;   /* rotate= specifies the rotation */
  var X01-X10;
run;

proc plot data=fscore2;
  plot factor1*factor2/vref=0.0 href=0.0;
  plot factor2*factor3/vref=0.0 href=0.0;
  plot factor3*factor1/vref=0.0 href=0.0;
run;
Rotation Method: Varimax

Orthogonal Transformation Matrix

            1          2          3
1     0.65751    0.53576    0.52976
2    -0.73452    0.61238    0.29234
3     0.16779    0.58134   -0.79617

Rotated Factor Pattern

        FACTOR1    FACTOR2    FACTOR3
X01     0.95480    0.13534    0.08893    M(-15)
X02     0.85209    0.43859    0.13319    M(16-20)
X03     0.45782    0.81121    0.20628    M(21-30)
X04     0.21933    0.90009    0.29393    M(31-40)
X05    -0.02799    0.84163    0.46962    M(41-)
X06     0.91574    0.05827    0.32684    F(-15)
X07     0.81289    0.19001    0.49704    F(16-20)
X08     0.58706    0.31477    0.66894    F(21-30)
X09     0.38662    0.45477    0.76508    F(31-40)
X10     0.18442    0.37804    0.88499    F(41-)

Variance explained by each factor

 FACTOR1     FACTOR2     FACTOR3
3.923686    2.875550    2.545044

Final Communality Estimates: Total = 9.344279

     X01        X02        X03        X04        X05
0.937870   0.936157   0.910210   0.944673   0.929662

     X06        X07        X08        X09        X10
0.948805   0.943939   0.891197   0.941637   0.960129

Scoring Coefficients Estimated by Regression

Squared Multiple Correlations of the Variables with each Factor

 FACTOR1     FACTOR2     FACTOR3
1.000000    1.000000    1.000000

Standardized Scoring Coefficients

        FACTOR1    FACTOR2    FACTOR3
X01     0.35634   -0.01776   -0.21769    M(-15)
X02     0.28101    0.18221   -0.29369    M(16-20)
X03     0.07475    0.43906   -0.30323    M(21-30)
X04    -0.05062    0.47805   -0.20440    M(31-40)
X05    -0.19046    0.37274    0.04777    M(41-)
X06     0.28720   -0.18091    0.04945    F(-15)
X07     0.19335   -0.16071    0.17125    F(16-20)
X08     0.04957   -0.13707    0.32839    F(21-30)
X09    -0.06623   -0.06897    0.40164    F(31-40)
X10    -0.17252   -0.16424    0.59935    F(41-)

[Plot of FACTOR1*FACTOR2. Legend: A = 1 obs, B = 2 obs, etc. Vertical axis FACTOR1 from -2 to 2, horizontal axis FACTOR2 from -3 to 3, reference lines at 0.]

[Plot of FACTOR2*FACTOR3. Legend: A = 1 obs, B = 2 obs, etc. Vertical axis FACTOR2 from -2 to 4, horizontal axis FACTOR3 from -4 to 2, reference lines at 0.]

[Plot of FACTOR3*FACTOR1. Legend: A = 1 obs, B = 2 obs, etc. Vertical axis FACTOR3 from -5.0 to 2.5, horizontal axis FACTOR1 from -3 to 2, reference lines at 0.]
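The relation between the unrotated and the rotated loadings can be checked by hand: multiplying the unrotated Factor Pattern by the Orthogonal Transformation Matrix reproduces the Rotated Factor Pattern. The Python sketch below is an illustration only; the numbers are copied from the fact02.sas and fact03.sas listings, and the helper names are chosen here. It also checks that the transformation matrix is orthogonal, which is why each communality is unchanged by the rotation.

```python
# Unrotated three-factor loadings (rows X01..X10) from the fact02.sas output.
unrotated = [
    [0.74741, -0.59244, 0.16808],
    [0.86579, -0.31836, 0.29190],
    [0.84491, 0.22079, 0.38417],
    [0.78216, 0.47602, 0.32604],
    [0.68129, 0.67325, 0.11067],
    [0.80647, -0.54140, -0.07270],
    [0.89959, -0.33542, -0.14888],
    [0.90901, -0.04289, -0.25110],
    [0.90316, 0.21817, -0.27989],
    [0.79262, 0.35477, -0.45389],
]

# Orthogonal Transformation Matrix from the varimax run (fact03.sas).
T = [
    [0.65751, 0.53576, 0.52976],
    [-0.73452, 0.61238, 0.29234],
    [0.16779, 0.58134, -0.79617],
]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


rotated = matmul(unrotated, T)   # should match the Rotated Factor Pattern
gram = matmul(transpose(T), T)   # should be close to the identity matrix

for i, row in enumerate(rotated, start=1):
    print(f"X{i:02d}", " ".join(f"{v:8.5f}" for v in row))

h2_before = sum(v * v for v in unrotated[0])
h2_after = sum(v * v for v in rotated[0])
print(f"X01 communality before/after rotation: {h2_before:.4f} / {h2_after:.4f}")
```

Small differences in the last printed digit come from the rounding of the values in the listing.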
/* Lesson 15-4 */
/* File Name = fact04.sas   10/16/97 */

data hobby;
  infile 'syumi.dat';
  input code $ X1-X6;
  label X1='M(-29)'
        X2='M(30-49)'
        X3='M(50-)'
        X4='F(-29)'
        X5='F(30-49)'
        X6='F(50-)';

proc print data=hobby(obs=10);
run;

proc factor data=hobby nfactor=2 out=fscore;
  var X1-X6;
run;

proc plot data=fscore;                           /* before rotation */
  plot factor1*factor2=code/vref=0.0 href=0.0;   /* use the code value as the plotting symbol */
run;

proc factor data=hobby nfactor=2 rotate=varimax out=fscore2;
  var X1-X6;
run;

proc plot data=fscore2;                          /* after rotation */
  plot factor1*factor2=code/vref=0.0 href=0.0;   /* use the code value as the plotting symbol */
run;
OBS   CODE     X1     X2     X3     X4     X5     X6
  1    A     4.00   4.25   3.83   4.50   4.67   4.00
  2    B     4.17   3.89   4.00   4.50   4.17   3.75
  3    C     3.83   3.44   2.83   3.57   3.17   1.50
  4    D     2.83   4.22   3.83   3.71   3.00   2.25
  5    E     4.17   4.11   3.83   3.57   4.00   3.75
  6    F     2.33   3.56   3.33   2.93   2.83   2.75
  7    G     1.83   2.44   2.33   3.71   3.83   3.75
  8    H     2.50   1.89   2.00   4.21   3.17   3.75
  9    I     2.00   1.44   2.00   4.07   3.33   3.50
 10    J     4.00   3.33   3.33   3.00   3.17   2.25

Initial Factor Method: Principal Components

Prior Communality Estimates: ONE

Eigenvalues of the Correlation Matrix:  Total = 6  Average = 1

                    1         2         3
Eigenvalue     2.7435    1.7477    0.7451
Difference     0.9958    1.0027    0.3571
Proportion     0.4573    0.2913    0.1242
Cumulative     0.4573    0.7485    0.8727

                    4         5         6
Eigenvalue     0.3879    0.2263    0.1495
Difference     0.1616    0.0768
Proportion     0.0647    0.0377    0.0249
Cumulative     0.9374    0.9751    1.0000

2 factors will be retained by the NFACTOR criterion.

Factor Pattern

       FACTOR1    FACTOR2
X1     0.52708    0.63297    M(-29)
X2     0.59628    0.64623    M(30-49)
X3     0.64192    0.47370    M(50-)
X4     0.82757   -0.35514    F(-29)
X5     0.79607   -0.43033    F(30-49)
X6     0.61604   -0.62750    F(50-)

Variance explained by each factor

 FACTOR1     FACTOR2
2.743514    1.747721

Final Communality Estimates: Total = 4.491236

      X1         X2         X3         X4         X5         X6
0.678467   0.773166   0.636447   0.810993   0.818906   0.773257

Scoring Coefficients Estimated by Regression

Squared Multiple Correlations of the Variables with each Factor

 FACTOR1     FACTOR2
1.000000    1.000000

Standardized Scoring Coefficients

       FACTOR1    FACTOR2
X1     0.19212    0.36217    M(-29)
X2     0.21734    0.36976    M(30-49)
X3     0.23398    0.27104    M(50-)
X4     0.30164   -0.20320    F(-29)
X5     0.29016   -0.24622    F(30-49)
X6     0.22454   -0.35904    F(50-)

[Plot of FACTOR1*FACTOR2 before rotation. Plotting symbol: value of CODE. NOTE: 1 observation is not displayed. Vertical axis FACTOR1 from -2 to 2, horizontal axis FACTOR2 from -3 to 2, reference lines at 0.]

The second PROC FACTOR step first reprints the same initial solution (eigenvalues, unrotated Factor Pattern, variance explained, and final communality estimates) and then gives the rotation output:

Rotation Method: Varimax

Orthogonal Transformation Matrix

            1          2
1     0.77751    0.62886
2    -0.62886    0.77751

Rotated Factor Pattern

       FACTOR1    FACTOR2
X1     0.01176    0.82361    M(-29)
X2     0.05723    0.87743    M(30-49)
X3     0.20121    0.77199    M(50-)
X4     0.86678    0.24430    F(-29)
X5     0.88957    0.16603    F(30-49)
X6     0.87359   -0.10049    F(50-)

Variance explained by each factor

 FACTOR1     FACTOR2
2.349707    2.141529

Final Communality Estimates: Total = 4.491236

      X1         X2         X3         X4         X5         X6
0.678467   0.773166   0.636447   0.810993   0.818906   0.773257

Scoring Coefficients Estimated by Regression

Squared Multiple Correlations of the Variables with each Factor

 FACTOR1     FACTOR2
1.000000    1.000000

Standardized Scoring Coefficients

       FACTOR1    FACTOR2
X1    -0.07838    0.40241    M(-29)
X2    -0.06354    0.42417    M(30-49)
X3     0.01147    0.35788    M(50-)
X4     0.36232    0.03170    F(-29)
X5     0.38045   -0.00897    F(30-49)
X6     0.40037   -0.13795    F(50-)

[Plot of FACTOR1*FACTOR2 after varimax rotation. Plotting symbol: value of CODE. Vertical axis FACTOR1 from -2 to 2, horizontal axis FACTOR2 from -2 to 2, reference lines at 0.]
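With only two factors, the Orthogonal Transformation Matrix is a plane rotation, so the varimax result can be summarized by a single angle. The Python sketch below is an illustration, not SAS output: it recovers that angle from the printed matrix and then assigns each variable to the factor on which its rotated loading is largest in absolute value. The assignment rule and the function name are assumptions made here as one simple way to read the Rotated Factor Pattern; the original notes do not state an interpretation.

```python
import math

# Orthogonal Transformation Matrix from the varimax run on the hobby data.
T = [[0.77751, 0.62886],
     [-0.62886, 0.77751]]

# For a plane rotation [[cos t, sin t], [-sin t, cos t]] the angle is atan2(sin t, cos t).
angle_deg = math.degrees(math.atan2(T[0][1], T[0][0]))
determinant = T[0][0] * T[1][1] - T[0][1] * T[1][0]

# Rotated Factor Pattern copied from the listing (FACTOR1, FACTOR2).
rotated = {
    "X1 M(-29)": (0.01176, 0.82361),
    "X2 M(30-49)": (0.05723, 0.87743),
    "X3 M(50-)": (0.20121, 0.77199),
    "X4 F(-29)": (0.86678, 0.24430),
    "X5 F(30-49)": (0.88957, 0.16603),
    "X6 F(50-)": (0.87359, -0.10049),
}


def dominant_factor(row):
    """1-based index of the factor with the largest absolute loading."""
    return max(range(len(row)), key=lambda k: abs(row[k])) + 1


groups = {name: dominant_factor(row) for name, row in rotated.items()}

print(f"rotation angle: {angle_deg:.2f} degrees, determinant: {determinant:.4f}")
for name, factor in groups.items():
    print(f"{name:12s} -> FACTOR{factor}")
```

Under this reading the three female groups load mainly on FACTOR1 and the three male groups on FACTOR2, a separation that is much less clear in the unrotated pattern, where every variable has a sizable loading on both factors.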