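This is a method for representing the values of several variables (p of them), with as little loss of information as possible,
by a small number (m, with m < p) of composite indices (principal components).
It is used when one wants, for example, an overall score that combines the results of several tests,
an overall severity that combines various symptoms,
or an evaluation of a company based on various financial indicators.
Because observations on p variables (p dimensions) are condensed into m principal components (m dimensions),
it can also be described as a dimension-reduction method,
and it is one of the most effective ways of summarizing multivariate data.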
/* Lesson 14-1 */
/* File Name = prin01.sas   10/09/97 */

data tyuuni;
  infile 'seiseki.dat';
  input koku sya suu rika eigo;

proc print data=tyuuni;
run;

proc plot data=tyuuni;                        /* scatter plot */
  plot suu*eigo;                              /* plot of the original variables */
run;

proc princomp cov data=tyuuni out=out_prin;   /* PCA on the variance-covariance matrix */
  var suu eigo;                               /* two variables */
run;

proc print data=out_prin;                     /* print the results */
run;

proc plot data=out_prin;                      /* scatter plot */
  plot prin2*prin1/vref=0 href=0;             /* plot of the principal component scores */
run;

proc sort data=out_prin;                      /* sort to make the explanation easier */
  by prin1;
proc print data=out_prin;                     /* confirm that the mathematics score dominates */
run;
The SAS System    260    16:28 Wednesday, October 8, 1997

OBS  KOKU  SYA  SUU  RIKA  EIGO
  1    29   33   55    79    84
  2    71   68   72    64    97
  3    74   91   79    76   100
  4    52   56   58    60    85
  5    77   92   96    88    98
  6    60   85   66    66    88
  7    81   91   73    63    95
  8    61   84   72    78    92
  9    70   75   81    67    96
 10    53   70   73    51    92
 11    69   64   96    57    97
 12    87   89   90    85   100
 13    83   75   96    81    98
 14    76   61   67    57    86
 15    87   82   78    82    97
 16    77   80   78    70    94
 17    38   43   45    12    96
 18    67   73   78    67    95
 19    83   77   80    67   100
 20    47   61   56    21    95
 21    70   62   88    51    96
 22    81   51   63    66    92
 23    51   16   36    48    84

[Character plot of SUU*EIGO. Legend: A = 1 obs, B = 2 obs, ... Vertical axis: SUU (0 to 100); horizontal axis: EIGO (84 to 100).]

Principal Component Analysis
23 Observations
 2 Variables

Simple Statistics
              SUU          EIGO
Mean  72.86956522   93.78260870
StD   15.81513759    5.10753918

Covariance Matrix
              SUU          EIGO
SUU   250.1185771    55.4249012
EIGO   55.4249012    26.0869565

Total Variance = 276.2055336

Eigenvalues of the Covariance Matrix
        Eigenvalue  Difference  Proportion  Cumulative
PRIN1      263.081     249.956    0.952481     0.95248
PRIN2       13.125        .       0.047519     1.00000

Eigenvectors
          PRIN1     PRIN2
SUU    0.973726  -.227722
EIGO   0.227722  0.973726

OBS  KOKU  SYA  SUU  RIKA  EIGO     PRIN1     PRIN2
  1    29   33   55    79    84  -19.6278  -5.45629
  2    71   68   72    64    97   -0.1140   3.33088
  3    74   91   79    76   100    7.3852   4.65800
  4    52   56   58    60    85  -16.4789  -5.16573
  5    77   92   96    88    98   23.4831  -1.16073
  6    60   85   66    66    88   -8.0059  -4.06633
  7    81   91   73    63    95    0.4042   1.15570
  8    61   84   72    78    92   -1.2527  -1.53775
  9    70   75   81    67    96    8.4218   0.30765
 10    53   70   73    51    92   -0.2789  -1.76548
 11    69   64   96    57    97   23.2554  -2.13445
 12    87   89   90    85   100   18.0962   2.15306
 13    83   75   96    81    98   23.4831  -1.16073
 14    76   61   67    57    86   -7.4876  -6.24150
 15    87   82   78    82    97    5.7283   1.96455
 16    77   80   78    70    94    5.0451  -0.95663
 17    38   43   45    12    96  -26.6324   8.50565
 18    67   73   78    67    95    5.2729   0.01709
 19    83   77   80    67   100    8.3589   4.43028
 20    47   61   56    21    95  -16.1491   5.02698
 21    70   62   88    51    96   15.2378  -1.28640
 22    81   51   63    66    92  -10.0162   0.51174
 23    51   16   36    48    84  -38.1286  -1.12957

[Character plot of PRIN2*PRIN1. Legend: A = 1 obs, B = 2 obs, ... Vertical axis: PRIN2 (-10 to 10); horizontal axis: PRIN1 (-40 to 40); reference lines at PRIN1 = 0 and PRIN2 = 0.]

Listing after sorting by PRIN1:

OBS  KOKU  SYA  SUU  RIKA  EIGO     PRIN1     PRIN2
  1    51   16   36    48    84  -38.1286  -1.12957
  2    38   43   45    12    96  -26.6324   8.50565
  3    29   33   55    79    84  -19.6278  -5.45629
  4    52   56   58    60    85  -16.4789  -5.16573
  5    47   61   56    21    95  -16.1491   5.02698
  6    81   51   63    66    92  -10.0162   0.51174
  7    60   85   66    66    88   -8.0059  -4.06633
  8    76   61   67    57    86   -7.4876  -6.24150
  9    61   84   72    78    92   -1.2527  -1.53775
 10    53   70   73    51    92   -0.2789  -1.76548
 11    71   68   72    64    97   -0.1140   3.33088
 12    81   91   73    63    95    0.4042   1.15570
 13    77   80   78    70    94    5.0451  -0.95663
 14    67   73   78    67    95    5.2729   0.01709
 15    87   82   78    82    97    5.7283   1.96455
 16    74   91   79    76   100    7.3852   4.65800
 17    83   77   80    67   100    8.3589   4.43028
 18    70   75   81    67    96    8.4218   0.30765
 19    70   62   88    51    96   15.2378  -1.28640
 20    87   89   90    85   100   18.0962   2.15306
 21    69   64   96    57    97   23.2554  -2.13445
 22    77   92   96    88    98   23.4831  -1.16073
 23    83   75   96    81    98   23.4831  -1.16073
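The two-variable listing can be checked by hand. The following is a minimal sketch, not part of the original lesson, that recomputes the eigenvalues, the first eigenvector, and the scores of observation 1 from the printed covariance matrix and means. The helper names `eigen_sym2` and `score` are hypothetical, and the closed-form solution applies only to a 2x2 symmetric matrix with a nonzero off-diagonal entry.

```python
import math

# Values copied from the PROC PRINCOMP output for SUU and EIGO.
s11, s12, s22 = 250.1185771, 55.4249012, 26.0869565  # covariance matrix entries
mean_suu, mean_eigo = 72.86956522, 93.78260870


def eigen_sym2(a, b, d):
    """Eigenvalues (largest first) and unit eigenvectors of [[a, b], [b, d]].

    Hypothetical helper for this illustration; it assumes b != 0.
    """
    half_trace = (a + d) / 2.0
    root = math.sqrt(((a - d) / 2.0) ** 2 + b * b)
    values = [half_trace + root, half_trace - root]
    vectors = []
    for lam in values:
        x, y = b, lam - a  # solves (A - lam*I) v = 0 when b != 0
        norm = math.hypot(x, y)
        vectors.append((x / norm, y / norm))
    return values, vectors


def score(vec, suu, eigo):
    """Principal component score: eigenvector times the mean-centered observation."""
    return vec[0] * (suu - mean_suu) + vec[1] * (eigo - mean_eigo)


values, vectors = eigen_sym2(s11, s12, s22)
proportion = values[0] / (values[0] + values[1])  # share of total variance in PRIN1

# Observation 1 has SUU = 55 and EIGO = 84.
prin1 = score(vectors[0], 55, 84)
prin2 = score(vectors[1], 55, 84)

print("eigenvalues:", values)
print("first eigenvector:", vectors[0])
print("proportion for PRIN1:", proportion)
print("scores of observation 1:", prin1, prin2)
```

The results should agree with the listing to about three decimals. The sign of an eigenvector is arbitrary, so the second score from this sketch may come out with the opposite sign to the PRIN2 value that SAS prints.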
/* Lesson 14-2 */
/* File Name = prin02.sas   10/09/97 */

data tyuuni;
  infile 'seiseki.dat';
  input koku sya suu rika eigo;

proc print data=tyuuni;
run;

proc princomp cov data=tyuuni out=out_prin;
  var koku sya suu rika eigo;
run;

proc print data=out_prin;
run;

proc plot data=out_prin;
  plot prin2*prin1/vref=0 href=0;
  plot prin3*prin2/vref=0 href=0;
  plot prin3*prin1/vref=0 href=0;
run;
The SAS System    82    16:28 Wednesday, October 8, 1997

Principal Component Analysis
23 Observations
 5 Variables

Simple Statistics
             KOKU          SYA          SUU         RIKA         EIGO
Mean  67.13043478  68.65217391  72.86956522  63.30434783  93.78260870
StD   15.83811383  19.41791276  15.81513759  18.48447698   5.10753918

Covariance Matrix
             KOKU          SYA          SUU         RIKA         EIGO
KOKU  250.8458498  202.5019763  175.7905138  163.2312253   47.5296443
SYA   202.5019763  377.0553360  224.1798419  186.1561265   62.6482213
SUU   175.7905138  224.1798419  250.1185771  170.7687747   55.4249012
RIKA  163.2312253  186.1561265  170.7687747  341.6758893   14.9328063
EIGO   47.5296443   62.6482213   55.4249012   14.9328063   26.0869565

Total Variance = 1245.7826087

Eigenvalues of the Covariance Matrix
        Eigenvalue  Difference  Proportion  Cumulative
PRIN1      883.441     699.222    0.709146     0.70915
PRIN2      184.219      84.728    0.147874     0.85702
PRIN3       99.492      29.421    0.079863     0.93688
PRIN4       70.070      61.511    0.056246     0.99313
PRIN5        8.560        .       0.006871     1.00000

Eigenvectors
          PRIN1     PRIN2     PRIN3     PRIN4     PRIN5
KOKU   0.448104  0.101261  0.753392  -.463211  -.082373
SYA    0.577795  0.471232  -.594584  -.292558  -.070520
SUU    0.468725  0.149432  0.189245  0.827949  -.191451
RIKA   0.484210  -.842180  -.189418  -.047703  0.134607
EIGO   0.105797  0.189973  0.084724  0.109864  0.966162

OBS  KOKU  SYA  SUU  RIKA  EIGO     PRIN1     PRIN2     PRIN3     PRIN4     PRIN5
  1    29   33   55    79    84  -39.4970  -38.4088  -14.7125   11.4742   1.73740
  2    71   68   72    64    97    1.6268   -0.0201    3.2793   -2.0013   3.09588
  3    74   91   79    76   100   25.6694    2.6318   -8.8300   -4.5670   4.40043
  4    52   56   58    60    85  -23.5893   -8.6018   -6.8086   -2.4084  -3.94486
  5    77   92   96    88    98   41.1587   -4.5389   -6.3898    7.0338   0.51109
  6    60   85   66    66    88    3.7241    2.5863  -17.3927   -7.9313  -4.47438
  7    81   91   73    63    95   19.1700   12.4425   -2.6530  -12.7063  -1.60818
  8    61   84   72    78    92   12.6404   -6.2334  -16.8434   -3.2673  -0.15501
  9    70   75   81    67    96   10.7886    1.8057   -0.5859    3.6126   0.39922
 10    53   70   73    51    92  -11.6385    9.2476   -9.2428    6.6502  -2.33460
 11    69   64   96    57    97    6.2793    7.3741   10.0187   20.3001  -1.99437
 12    87   89   90    85   100   39.8530   -2.9301    2.5302   -1.3255   2.57613
 13    83   75   96    81    98   30.6354   -6.0471    9.5644    9.5620   0.27343
 14    76   61   67    57    86   -7.0741    0.2460   10.6561   -7.2838  -7.43512
 15    87   82   78    82    97   28.4137   -6.0653    4.7354   -9.3994   2.06487
 16    77   80   78    70    94   16.6492    1.5159    0.4095   -3.9394  -1.48414
 17    38   43   45    12    96  -65.5458   24.4263   -2.0626    0.6147   4.78062
 18    67   73   78    67    95    6.7767   -0.0789   -2.3094    2.9936   0.39557
 19    83   77   80    67   100   17.7240    4.6750    8.1687   -3.3828   3.24343
 20    47   61   56    21    95  -41.7045   27.6939   -5.6924   -0.2520   0.90926
 21    70   62   88    51    96   -1.1890   10.2005   11.4991   13.9747  -2.17790
 22    81   51   63    66    92   -7.4938  -10.9975   18.4155   -9.7562   0.63245
 23    51   16   36    48    84  -63.3775  -20.9237   14.2463   -7.9952   0.58878

[Character plot of PRIN2*PRIN1. Legend: A = 1 obs, B = 2 obs, ... Vertical axis: PRIN2 (-50 to 50); horizontal axis: PRIN1 (-80 to 60); reference lines at 0.]

[Character plot of PRIN3*PRIN2. Legend: A = 1 obs, B = 2 obs, ... Vertical axis: PRIN3 (-20 to 20); horizontal axis: PRIN2 (-40 to 40); reference lines at 0.]

[Character plot of PRIN3*PRIN1. Legend: A = 1 obs, B = 2 obs, ... Vertical axis: PRIN3 (-20 to 20); horizontal axis: PRIN1 (-80 to 60); reference lines at 0.]
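The Proportion and Cumulative columns in the five-variable listing follow directly from the eigenvalues: each eigenvalue is divided by the total variance, and the shares are summed in order. A minimal sketch, using only the eigenvalues printed in the listing (the variable names are chosen for this illustration):

```python
# Eigenvalues of the covariance matrix, copied from the listing (PRIN1 to PRIN5).
eigenvalues = [883.441, 184.219, 99.492, 70.070, 8.560]

# The sum of the eigenvalues equals the total variance (up to rounding in the listing).
total = sum(eigenvalues)

# Proportion of the total variance carried by each principal component.
proportions = [ev / total for ev in eigenvalues]

# Running sum of the proportions.
cumulative = []
running = 0.0
for p in proportions:
    running += p
    cumulative.append(running)

for i, (ev, p, c) in enumerate(zip(eigenvalues, proportions, cumulative), start=1):
    print(f"PRIN{i}: eigenvalue={ev:9.3f}  proportion={p:.6f}  cumulative={c:.5f}")
```

Read this way, the first two components together account for roughly 86% of the total variance of the five subjects.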
/* Lesson 14-3 */
/* File Name = prin03.sas   10/09/97 */

data tyuuni;
  infile 'seiseki.dat';
  input koku sya suu rika eigo;

proc print data=tyuuni;
run;

proc princomp data=tyuuni out=out_prin;       /* PCA using the correlation matrix */
  var koku sya suu rika eigo;
run;

proc print data=out_prin;
run;

proc plot data=out_prin;
  plot prin2*prin1/vref=0 href=0;
  plot prin3*prin2/vref=0 href=0;
  plot prin3*prin1/vref=0 href=0;
run;
The SAS System    95    16:28 Wednesday, October 8, 1997

Principal Component Analysis
23 Observations
 5 Variables

Simple Statistics
             KOKU          SYA          SUU         RIKA         EIGO
Mean  67.13043478  68.65217391  72.86956522  63.30434783  93.78260870
StD   15.83811383  19.41791276  15.81513759  18.48447698   5.10753918

Correlation Matrix
        KOKU     SYA     SUU    RIKA    EIGO
KOKU  1.0000  0.6585  0.7018  0.5576  0.5876
SYA   0.6585  1.0000  0.7300  0.5186  0.6317
SUU   0.7018  0.7300  1.0000  0.5842  0.6862
RIKA  0.5576  0.5186  0.5842  1.0000  0.1582
EIGO  0.5876  0.6317  0.6862  0.1582  1.0000

Eigenvalues of the Correlation Matrix
        Eigenvalue  Difference  Proportion  Cumulative
PRIN1      3.36123     2.51380    0.672246     0.67225
PRIN2      0.84743     0.50592    0.169486     0.84173
PRIN3      0.34151     0.05968    0.068302     0.91003
PRIN4      0.28183     0.11383    0.056366     0.96640
PRIN5      0.16800        .       0.033600     1.00000

Eigenvectors
          PRIN1     PRIN2     PRIN3     PRIN4     PRIN5
KOKU   0.469962  0.057099  -.801071  -.340151  -.135862
SYA    0.475925  -.059026  0.569510  -.663253  -.075929
SUU    0.497565  -.028416  0.156983  0.567233  -.636572
RIKA   0.366295  0.763815  0.094762  0.222034  0.473430
EIGO   0.413386  -.639559  -.017882  0.270815  0.588571

OBS  KOKU  SYA  SUU  RIKA  EIGO     PRIN1     PRIN2     PRIN3     PRIN4     PRIN5
  1    29   33   55    79    84  -3.04820   1.87656   0.82028   1.06560   0.46046
  2    71   68   72    64    97   0.34567  -0.35664  -0.23117   0.08693   0.39293
  3    74   91   79    76   100   1.69924  -0.30811   0.41216  -0.20883   0.64856
  4    52   56   58    60    85  -2.00319   0.97383   0.26041  -0.28158  -0.31893
  5    77   92   96    88    98   2.42354   0.41543   0.52702   0.34041   0.01153
  6    60   85   66    66    88  -0.44163   0.77242   0.80599  -0.92587  -0.32357
  7    81   91   73    63    95   1.05589  -0.18318  -0.05059  -0.99563  -0.07912
  8    61   84   72    78    92   0.31384   0.76328   0.83316  -0.34175   0.19854
  9    70   75   81    67    96   0.74923  -0.14851   0.13292   0.17512  -0.02652
 10    53   70   73    51    92  -0.77026  -0.34050   0.69869   0.01980  -0.40987
 11    69   64   96    57    97   0.80464  -0.68406  -0.04499   1.04322  -0.71958
 12    87   89   90    85   100   2.56039   0.09697  -0.14870   0.08293   0.33262
 13    83   75   96    81    98   2.04620   0.19948  -0.31094   0.70814  -0.15275
 14    76   61   67    57    86  -0.86386   0.77980  -0.73638  -0.62802  -0.86821
 15    87   82   78    82    97   1.70903   0.39150  -0.47799  -0.30348   0.42046
 16    77   80   78    70    94   0.88268   0.24132  -0.08188  -0.32361  -0.13900
 17    38   43   45    12    96  -3.20712  -2.37462   0.17361   0.00355   0.41347
 18    67   73   78    67    95   0.43587  -0.02263   0.19972   0.14725   0.01255
 19    83   77   80    67   100   1.47628  -0.60680  -0.48987   0.00384   0.35534
 20    47   61   56    21    95  -2.05540  -1.91954   0.40515  -0.35495  -0.06161
 21    70   62   88    51    96   0.33377  -0.78272  -0.26090   0.67804  -0.66724
 22    81   51   63    66    92  -0.42247   0.45600  -1.29713  -0.11106   0.21093
 23    51   16   36    48    84  -4.02413   0.76070  -1.13857   0.11994   0.30900

[Character plot of PRIN2*PRIN1. Legend: A = 1 obs, B = 2 obs, ... Vertical axis: PRIN2 (-4 to 2); horizontal axis: PRIN1 (-4 to 4); reference lines at 0.]

[Character plot of PRIN3*PRIN2. Legend: A = 1 obs, B = 2 obs, ... Vertical axis: PRIN3 (-2 to 1); horizontal axis: PRIN2 (-3 to 2); reference lines at 0.]

[Character plot of PRIN3*PRIN1. Legend: A = 1 obs, B = 2 obs, ... Vertical axis: PRIN3 (-2 to 1); horizontal axis: PRIN1 (-4 to 4); reference lines at 0.]
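The correlation matrix used in Lesson 14-3 is the covariance matrix of Lesson 14-2 rescaled by the standard deviations, so every variable enters with unit variance. The sketch below, an illustration with variable names chosen here, rebuilds the correlations from the printed covariance matrix and checks that the eigenvalues of the correlation matrix sum to the number of variables.

```python
import math

names = ["KOKU", "SYA", "SUU", "RIKA", "EIGO"]

# Covariance matrix copied from the Lesson 14-2 listing.
cov = [
    [250.8458498, 202.5019763, 175.7905138, 163.2312253, 47.5296443],
    [202.5019763, 377.0553360, 224.1798419, 186.1561265, 62.6482213],
    [175.7905138, 224.1798419, 250.1185771, 170.7687747, 55.4249012],
    [163.2312253, 186.1561265, 170.7687747, 341.6758893, 14.9328063],
    [47.5296443, 62.6482213, 55.4249012, 14.9328063, 26.0869565],
]

# Standard deviations are the square roots of the diagonal entries.
std = [math.sqrt(cov[i][i]) for i in range(len(names))]

# Correlation r_ij = s_ij / (s_i * s_j).
corr = [
    [cov[i][j] / (std[i] * std[j]) for j in range(len(names))]
    for i in range(len(names))
]

for name, row in zip(names, corr):
    print(name.ljust(5), " ".join(f"{r:7.4f}" for r in row))

# Eigenvalues of the correlation matrix, copied from the Lesson 14-3 listing.
# Their sum equals the trace of the correlation matrix, i.e. the number of variables.
corr_eigenvalues = [3.36123, 0.84743, 0.34151, 0.28183, 0.16800]
print("sum of eigenvalues:", sum(corr_eigenvalues))
```

Because EIGO has a much smaller standard deviation than the other subjects, it carries little weight in the covariance-based analysis but a weight comparable to the others once the correlation matrix is used.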
/* Lesson 14-4 */
/* File Name = prin04.sas   10/09/97 */

data gakusei;
  infile 'all.dat';
  input seibetsu $ shintyou taijyuu kyoui shiokuri $ kodukai;

proc print data=gakusei(obs=10);
run;

proc princomp data=gakusei out=out_prin;
  var shintyou taijyuu kyoui kodukai;
run;

proc print data=out_prin(obs=10);
run;

proc plot data=out_prin;
  plot prin2*prin1/vref=0 href=0;
  plot prin3*prin2/vref=0 href=0;
  plot prin3*prin1/vref=0 href=0;
run;
The SAS System    228    16:28 Wednesday, October 8, 1997

OBS  SEIBETSU  SHINTYOU  TAIJYUU  KYOUI  SHIOKURI  KODUKAI
  1     F         148.9       .      .      J        60000
  2     F         156.0       .      .      G            .
  3     M         156.0      61     90      J            0
  4     F         156.5       .      .      J        20000
  5     F         157.0      43      .      J        20000
  6     F         160.0       .      .      J        43500
  7     F         161.0       .      .      J        25000
  8     F         161.0       .      .                   .
  9     F         162.0       .      .      J            0
 10     F         162.0       .      .      J        30000

Principal Component Analysis
42 Observations
 4 Variables

Simple Statistics
         SHINTYOU      TAIJYUU        KYOUI      KODUKAI
Mean  163.9214286  56.38095238  85.95238095  53833.33333
StD     8.7977696   9.76292734   6.99111764  49036.03288

Correlation Matrix
          SHINTYOU  TAIJYUU   KYOUI  KODUKAI
SHINTYOU    1.0000   0.7962  0.5647   0.2157
TAIJYUU     0.7962   1.0000  0.7754   0.1919
KYOUI       0.5647   0.7754  1.0000   -.0643
KODUKAI     0.2157   0.1919  -.0643   1.0000

Eigenvalues of the Correlation Matrix
        Eigenvalue  Difference  Proportion  Cumulative
PRIN1      2.45777     1.41479    0.614443     0.61444
PRIN2      1.04298     0.67066    0.260744     0.87519
PRIN3      0.37232     0.24539    0.093081     0.96827
PRIN4      0.12693        .       0.031732     1.00000

Eigenvectors
             PRIN1     PRIN2     PRIN3     PRIN4
SHINTYOU  0.563145  0.098142  -.713891  0.404469
TAIJYUU   0.611881  -.012170  0.033325  -.790154
KYOUI     0.537423  -.333502  0.631891  0.447958
KODUKAI   0.140164  0.937548  0.299937  0.106750

OBS  SEIBETSU  SHINTYOU  TAIJYUU  KYOUI  SHIOKURI  KODUKAI      PRIN1     PRIN2    PRIN3     PRIN4
  1     F         148.9       .      .      J        60000          .         .        .         .
  2     F         156.0       .      .      G            .          .         .        .         .
  3     M         156.0      61     90      J            0  -0.060283  -1.31648  0.69511  -0.59586
  4     F         156.5       .      .      J        20000          .         .        .         .
  5     F         157.0      43      .      J        20000          .         .        .         .
  6     F         160.0       .      .      J        43500          .         .        .         .
  7     F         161.0       .      .      J        25000          .         .        .         .
  8     F         161.0       .      .                   .          .         .        .         .
  9     F         162.0       .      .      J            0          .         .        .         .
 10     F         162.0       .      .      J        30000          .         .        .         .

[Character plot of PRIN2*PRIN1. Legend: A = 1 obs, B = 2 obs, ... (NOTE: 232 observations had missing values.) Vertical axis: PRIN2 (-2 to 2); horizontal axis: PRIN1 (-4 to 6); reference lines at 0.]

[Character plot of PRIN3*PRIN2. Legend: A = 1 obs, B = 2 obs, ... (NOTE: 232 observations had missing values.) Vertical axis: PRIN3 (-2 to 1); horizontal axis: PRIN2 (-2 to 3); reference lines at 0.]

[Character plot of PRIN3*PRIN1. Legend: A = 1 obs, B = 2 obs, ... (NOTE: 232 observations had missing values.) Vertical axis: PRIN3 (-2 to 1); horizontal axis: PRIN1 (-4 to 6); reference lines at 0.]
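In the Lesson 14-4 listing only 42 observations enter the analysis, and scores are missing for every row that lacks one of the four analysis variables. The sketch below illustrates this with the ten rows shown in the listing. It keeps only complete rows and then reproduces the PRIN1 and PRIN2 scores of the one complete row (OBS 3) as the eigenvector times the standardized values. `None` stands in for the SAS missing value `.`, and the variable names are chosen for this illustration.

```python
# First ten observations from the listing: (obs, shintyou, taijyuu, kyoui, kodukai).
rows = [
    (1, 148.9, None, None, 60000),
    (2, 156.0, None, None, None),
    (3, 156.0, 61, 90, 0),
    (4, 156.5, None, None, 20000),
    (5, 157.0, 43, None, 20000),
    (6, 160.0, None, None, 43500),
    (7, 161.0, None, None, 25000),
    (8, 161.0, None, None, None),
    (9, 162.0, None, None, 0),
    (10, 162.0, None, None, 30000),
]

# Keep a row only if all four analysis variables are present.
complete = [r for r in rows if all(v is not None for v in r[1:])]
complete_obs = [r[0] for r in complete]
print("complete observations:", complete_obs)

# Means, standard deviations, and eigenvectors copied from the listing
# (variable order: SHINTYOU, TAIJYUU, KYOUI, KODUKAI).
mean = [163.9214286, 56.38095238, 85.95238095, 53833.33333]
std = [8.7977696, 9.76292734, 6.99111764, 49036.03288]
vec1 = [0.563145, 0.611881, 0.537423, 0.140164]
vec2 = [0.098142, -0.012170, -0.333502, 0.937548]

# With the correlation matrix, scores are computed from standardized values.
obs3 = complete[0][1:]
z = [(x - m) / s for x, m, s in zip(obs3, mean, std)]
prin1 = sum(w * zi for w, zi in zip(vec1, z))
prin2 = sum(w * zi for w, zi in zip(vec2, z))
print("PRIN1, PRIN2 for OBS 3:", prin1, prin2)
```

The two scores should match the values printed for OBS 3 to about four decimals, since the eigenvectors in the listing are rounded to six digits.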
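There is no clear-cut rule for this, but criteria such as the following are commonly used. The number may also be adjusted slightly up or down to make the results easier to interpret.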