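Principal component analysis is a method for representing the values of several (p) variables
by a small number (m, with m < p) of composite indices (principal components),
with as little loss of information as possible.
It is used when one wants, for example,
an overall score that combines the results of several tests,
an overall severity that combines various symptoms,
or an evaluation of a company based on various financial indicators.
Because it condenses p-variate (p-dimensional) observations into m (m-dimensional) principal components,
it can also be described as a dimension-reduction method,
and it is one of the most effective ways of summarizing multivariate data.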
/* Lesson 14-1 */
/* File Name = prin01.sas 10/09/97 */
data tyuuni;
infile 'seiseki.dat';
input koku sya suu rika eigo;
proc print data=tyuuni;
run;
proc plot data=tyuuni;                        /* scatter plot */
plot suu*eigo;                                /* plot of the original variables */
run;
proc princomp cov data=tyuuni out=out_prin;   /* PCA on the variance-covariance matrix */
var suu eigo;                                 /* two variables */
run;
proc print data=out_prin;                     /* print the results */
run;
proc plot data=out_prin;                      /* scatter plot */
plot prin2*prin1/vref=0 href=0;               /* plot of the principal component scores */
run;

proc sort data=out_prin;                      /* sort to aid the explanation */
by prin1;
proc print data=out_prin;                     /* confirm that mathematics (SUU) dominates PRIN1 */
run;
SAS System 260
16:28 Wednesday, October 8, 1997
OBS KOKU SYA SUU RIKA EIGO
1 29 33 55 79 84
2 71 68 72 64 97
3 74 91 79 76 100
4 52 56 58 60 85
5 77 92 96 88 98
6 60 85 66 66 88
7 81 91 73 63 95
8 61 84 72 78 92
9 70 75 81 67 96
10 53 70 73 51 92
11 69 64 96 57 97
12 87 89 90 85 100
13 83 75 96 81 98
14 76 61 67 57 86
15 87 82 78 82 97
16 77 80 78 70 94
17 38 43 45 12 96
18 67 73 78 67 95
19 83 77 80 67 100
20 47 61 56 21 95
21 70 62 88 51 96
22 81 51 63 66 92
23 51 16 36 48 84
Plot of SUU*EIGO. Legend: A = 1 obs, B = 2 obs, etc.
100 + A B
| A A
SUU | A A A A B
| A A B A A
| A A A A
50 + A
| A
|
|
|
0 +
--+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+-
84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100
EIGO
Principal Component Analysis
23 Observations
2 Variables
Simple Statistics
SUU EIGO
Mean 72.86956522 93.78260870
StD 15.81513759 5.10753918
Covariance Matrix
SUU EIGO
SUU 250.1185771 55.4249012
EIGO 55.4249012 26.0869565
Total Variance = 276.2055336
Eigenvalues of the Covariance Matrix
Eigenvalue Difference Proportion Cumulative
PRIN1 263.081 249.956 0.952481 0.95248
PRIN2 13.125 . 0.047519 1.00000
Eigenvectors
PRIN1 PRIN2
SUU 0.973726 -.227722
EIGO 0.227722 0.973726
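
The eigenvalues and eigenvectors listed above can be checked by hand, because a symmetric 2x2 matrix has a closed-form eigendecomposition. The following is a minimal Python sketch and is not part of the original SAS lesson; the function name eigen_sym_2x2 is made up for illustration, and the matrix entries are simply copied from the Covariance Matrix listing.

```python
import math

def eigen_sym_2x2(a, b, d):
    """Eigenvalues and unit eigenvectors of [[a, b], [b, d]] (assumes b != 0)."""
    trace = a + d
    det = a * d - b * b
    root = math.sqrt(trace * trace - 4.0 * det)
    lam1 = (trace + root) / 2.0   # larger eigenvalue (variance of PRIN1)
    lam2 = (trace - root) / 2.0   # smaller eigenvalue (variance of PRIN2)
    # (b, lam1 - a) solves (A - lam1*I) v = 0; scale it to unit length
    norm = math.hypot(b, lam1 - a)
    v1 = (b / norm, (lam1 - a) / norm)
    # the second eigenvector is orthogonal to the first
    v2 = (-v1[1], v1[0])
    return (lam1, lam2), (v1, v2)

# Covariance matrix of SUU and EIGO, copied from the SAS output
(lam1, lam2), (v1, v2) = eigen_sym_2x2(250.1185771, 55.4249012, 26.0869565)
total = lam1 + lam2               # equals the total variance (trace)
print(round(lam1, 3), round(lam2, 3))
print(round(lam1 / total, 4))     # proportion explained by PRIN1
print([round(c, 4) for c in v1], [round(c, 4) for c in v2])
```

The printed values should agree with the Eigenvalue, Proportion and Eigenvectors columns to the precision SAS prints. The sign of an eigenvector is arbitrary, so other software may report the same axes with signs flipped.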
OBS KOKU SYA SUU RIKA EIGO PRIN1 PRIN2
1 29 33 55 79 84 -19.6278 -5.45629
2 71 68 72 64 97 -0.1140 3.33088
3 74 91 79 76 100 7.3852 4.65800
4 52 56 58 60 85 -16.4789 -5.16573
5 77 92 96 88 98 23.4831 -1.16073
6 60 85 66 66 88 -8.0059 -4.06633
7 81 91 73 63 95 0.4042 1.15570
8 61 84 72 78 92 -1.2527 -1.53775
9 70 75 81 67 96 8.4218 0.30765
10 53 70 73 51 92 -0.2789 -1.76548
11 69 64 96 57 97 23.2554 -2.13445
12 87 89 90 85 100 18.0962 2.15306
13 83 75 96 81 98 23.4831 -1.16073
14 76 61 67 57 86 -7.4876 -6.24150
15 87 82 78 82 97 5.7283 1.96455
16 77 80 78 70 94 5.0451 -0.95663
17 38 43 45 12 96 -26.6324 8.50565
18 67 73 78 67 95 5.2729 0.01709
19 83 77 80 67 100 8.3589 4.43028
20 47 61 56 21 95 -16.1491 5.02698
21 70 62 88 51 96 15.2378 -1.28640
22 81 51 63 66 92 -10.0162 0.51174
23 51 16 36 48 84 -38.1286 -1.12957
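
The PRIN1 and PRIN2 columns in this listing are the eigenvector weights applied to the mean-centered values of SUU and EIGO. As a hedged illustration (a Python sketch, not part of the SAS program; the helper name score is hypothetical), the scores of OBS 1 can be recomputed from the means and eigenvectors printed earlier:

```python
# Means and eigenvectors copied from the SAS output (order: SUU, EIGO)
MEAN = (72.86956522, 93.78260870)
PRIN1_VEC = (0.973726, 0.227722)
PRIN2_VEC = (-0.227722, 0.973726)

def score(x, vec, mean=MEAN):
    """Principal component score: weights applied to mean-centered values."""
    return sum(w * (xi - m) for w, xi, m in zip(vec, x, mean))

obs1 = (55, 84)   # SUU and EIGO of OBS 1
print(round(score(obs1, PRIN1_VEC), 4), round(score(obs1, PRIN2_VEC), 4))
```

Because the analysis uses the covariance matrix, the values are centered but not divided by their standard deviations.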
Plot of PRIN2*PRIN1. Legend: A = 1 obs, B = 2 obs, etc.
10 + |
| A |
PRIN2 | A |
| A B
| A A A
0 +---------------------A------+--B-A-----------------------
| A AA A C
| A |
| A A A |
| |
-10 + |
---+------------+------------+------------+------------+--
-40 -20 0 20 40
PRIN1
OBS KOKU SYA SUU RIKA EIGO PRIN1 PRIN2
1 51 16 36 48 84 -38.1286 -1.12957
2 38 43 45 12 96 -26.6324 8.50565
3 29 33 55 79 84 -19.6278 -5.45629
4 52 56 58 60 85 -16.4789 -5.16573
5 47 61 56 21 95 -16.1491 5.02698
6 81 51 63 66 92 -10.0162 0.51174
7 60 85 66 66 88 -8.0059 -4.06633
8 76 61 67 57 86 -7.4876 -6.24150
9 61 84 72 78 92 -1.2527 -1.53775
10 53 70 73 51 92 -0.2789 -1.76548
11 71 68 72 64 97 -0.1140 3.33088
12 81 91 73 63 95 0.4042 1.15570
13 77 80 78 70 94 5.0451 -0.95663
14 67 73 78 67 95 5.2729 0.01709
15 87 82 78 82 97 5.7283 1.96455
16 74 91 79 76 100 7.3852 4.65800
17 83 77 80 67 100 8.3589 4.43028
18 70 75 81 67 96 8.4218 0.30765
19 70 62 88 51 96 15.2378 -1.28640
20 87 89 90 85 100 18.0962 2.15306
21 69 64 96 57 97 23.2554 -2.13445
22 77 92 96 88 98 23.4831 -1.16073
23 83 75 96 81 98 23.4831 -1.16073
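
In the sorted listing, the order of PRIN1 follows SUU almost exactly. One way to quantify this is the correlation between each original variable and PRIN1, which for a covariance-based analysis equals weight * sqrt(eigenvalue) / standard deviation. The sketch below is an illustrative Python check that is not in the original lesson; the function name loading is an assumption, and the inputs are copied from the Lesson 14-1 output.

```python
import math

def loading(weight, eigenvalue, std):
    """Correlation between an original variable and a component (covariance-based PCA)."""
    return weight * math.sqrt(eigenvalue) / std

# Weights, eigenvalue of PRIN1 and standard deviations copied from the SAS output
r_suu = loading(0.973726, 263.081, 15.81513759)
r_eigo = loading(0.227722, 263.081, 5.10753918)
print(round(r_suu, 3), round(r_eigo, 3))
```

The correlation for SUU comes out very close to 1, which is consistent with SUU having a much larger variance than EIGO and therefore dominating the first component when the covariance matrix is used.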
/* Lesson 14-2 */
/* File Name = prin02.sas 10/09/97 */
data tyuuni;
infile 'seiseki.dat';
input koku sya suu rika eigo;
proc print data=tyuuni;
run;
proc princomp cov data=tyuuni out=out_prin;
var koku sya suu rika eigo;
run;
proc print data=out_prin;
run;
proc plot data=out_prin;
plot prin2*prin1/vref=0 href=0;
plot prin3*prin2/vref=0 href=0;
plot prin3*prin1/vref=0 href=0;
run;
Principal Component Analysis
23 Observations
5 Variables
Simple Statistics
KOKU SYA SUU
Mean 67.13043478 68.65217391 72.86956522
StD 15.83811383 19.41791276 15.81513759
RIKA EIGO
Mean 63.30434783 93.78260870
StD 18.48447698 5.10753918
Covariance Matrix
KOKU SYA SUU
KOKU 250.8458498 202.5019763 175.7905138
SYA 202.5019763 377.0553360 224.1798419
SUU 175.7905138 224.1798419 250.1185771
RIKA 163.2312253 186.1561265 170.7687747
EIGO 47.5296443 62.6482213 55.4249012
RIKA EIGO
KOKU 163.2312253 47.5296443
SYA 186.1561265 62.6482213
SUU 170.7687747 55.4249012
RIKA 341.6758893 14.9328063
EIGO 14.9328063 26.0869565
Total Variance = 1245.7826087
Eigenvalues of the Covariance Matrix
Eigenvalue Difference Proportion Cumulative
PRIN1 883.441 699.222 0.709146 0.70915
PRIN2 184.219 84.728 0.147874 0.85702
PRIN3 99.492 29.421 0.079863 0.93688
PRIN4 70.070 61.511 0.056246 0.99313
PRIN5 8.560 . 0.006871 1.00000
Eigenvectors
PRIN1 PRIN2 PRIN3 PRIN4 PRIN5
KOKU 0.448104 0.101261 0.753392 -.463211 -.082373
SYA 0.577795 0.471232 -.594584 -.292558 -.070520
SUU 0.468725 0.149432 0.189245 0.827949 -.191451
RIKA 0.484210 -.842180 -.189418 -.047703 0.134607
EIGO 0.105797 0.189973 0.084724 0.109864 0.966162
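
The Proportion and Cumulative columns above follow directly from the eigenvalues: each proportion is the eigenvalue divided by the total variance, and the cumulative column is its running sum. A short Python sketch (added for illustration, with the eigenvalues copied from the listing) reproduces them:

```python
from itertools import accumulate

# Eigenvalues copied from the SAS output of Lesson 14-2
eigenvalues = [883.441, 184.219, 99.492, 70.070, 8.560]

total = sum(eigenvalues)                          # matches Total Variance
proportion = [ev / total for ev in eigenvalues]   # share of variance per component
cumulative = list(accumulate(proportion))         # running total of the shares

for k, (p, c) in enumerate(zip(proportion, cumulative), start=1):
    print(f"PRIN{k}: proportion={p:.4f} cumulative={c:.4f}")
```

Small differences in the last digit relative to the SAS listing are possible because the eigenvalues are entered here with only three decimal places.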
OBS KOKU SYA SUU RIKA EIGO PRIN1 PRIN2 PRIN3 PRIN4 PRIN5
1 29 33 55 79 84 -39.4970 -38.4088 -14.7125 11.4742 1.73740
2 71 68 72 64 97 1.6268 -0.0201 3.2793 -2.0013 3.09588
3 74 91 79 76 100 25.6694 2.6318 -8.8300 -4.5670 4.40043
4 52 56 58 60 85 -23.5893 -8.6018 -6.8086 -2.4084 -3.94486
5 77 92 96 88 98 41.1587 -4.5389 -6.3898 7.0338 0.51109
6 60 85 66 66 88 3.7241 2.5863 -17.3927 -7.9313 -4.47438
7 81 91 73 63 95 19.1700 12.4425 -2.6530 -12.7063 -1.60818
8 61 84 72 78 92 12.6404 -6.2334 -16.8434 -3.2673 -0.15501
9 70 75 81 67 96 10.7886 1.8057 -0.5859 3.6126 0.39922
10 53 70 73 51 92 -11.6385 9.2476 -9.2428 6.6502 -2.33460
11 69 64 96 57 97 6.2793 7.3741 10.0187 20.3001 -1.99437
12 87 89 90 85 100 39.8530 -2.9301 2.5302 -1.3255 2.57613
13 83 75 96 81 98 30.6354 -6.0471 9.5644 9.5620 0.27343
14 76 61 67 57 86 -7.0741 0.2460 10.6561 -7.2838 -7.43512
15 87 82 78 82 97 28.4137 -6.0653 4.7354 -9.3994 2.06487
16 77 80 78 70 94 16.6492 1.5159 0.4095 -3.9394 -1.48414
17 38 43 45 12 96 -65.5458 24.4263 -2.0626 0.6147 4.78062
18 67 73 78 67 95 6.7767 -0.0789 -2.3094 2.9936 0.39557
19 83 77 80 67 100 17.7240 4.6750 8.1687 -3.3828 3.24343
20 47 61 56 21 95 -41.7045 27.6939 -5.6924 -0.2520 0.90926
21 70 62 88 51 96 -1.1890 10.2005 11.4991 13.9747 -2.17790
22 81 51 63 66 92 -7.4938 -10.9975 18.4155 -9.7562 0.63245
23 51 16 36 48 84 -63.3775 -20.9237 14.2463 -7.9952 0.58878
Plot of PRIN2*PRIN1. Legend: A = 1 obs, B = 2 obs, etc.
50 + |
| |
PRIN2 | A |
| A |
| A A| A A
0 +---------------------------------A--+AAA-A-AA---A-----AA---------
| A A | A AA
| A |
| |
| A |
-50 + |
-+--------+--------+--------+--------+--------+--------+--------+-
-80 -60 -40 -20 0 20 40 60
PRIN1
Plot of PRIN3*PRIN2. Legend: A = 1 obs, B = 2 obs, etc.
20 + A |
| A |
PRIN3 | A A A
| A | A
| A A A
0 +----------------------------+B---------------------------
| A A A A
| A A | A A
| |
| A A | A
-20 + |
---+------------+------------+------------+------------+--
-40 -20 0 20 40
PRIN2
Plot of PRIN3*PRIN1. Legend: A = 1 obs, B = 2 obs, etc.
20 + A |
| A |
PRIN3 | A A| A
| | A A
| |A A A
0 +------------------------------------+----A-A---------------------
| A A | A A
| A A | A A
| |
| A | A A
-20 + |
-+--------+--------+--------+--------+--------+--------+--------+-
-80 -60 -40 -20 0 20 40 60
PRIN1
/* Lesson 14-3 */
/* File Name = prin03.sas 10/09/97 */
data tyuuni;
infile 'seiseki.dat';
input koku sya suu rika eigo;
proc print data=tyuuni;
run;
proc princomp data=tyuuni out=out_prin;       /* no COV option: uses the correlation matrix */
var koku sya suu rika eigo;
run;
proc print data=out_prin;
run;
proc plot data=out_prin;
plot prin2*prin1/vref=0 href=0;
plot prin3*prin2/vref=0 href=0;
plot prin3*prin1/vref=0 href=0;
run;
Principal Component Analysis
23 Observations
5 Variables
Simple Statistics
KOKU SYA SUU
Mean 67.13043478 68.65217391 72.86956522
StD 15.83811383 19.41791276 15.81513759
RIKA EIGO
Mean 63.30434783 93.78260870
StD 18.48447698 5.10753918
Correlation Matrix
KOKU SYA SUU RIKA EIGO
KOKU 1.0000 0.6585 0.7018 0.5576 0.5876
SYA 0.6585 1.0000 0.7300 0.5186 0.6317
SUU 0.7018 0.7300 1.0000 0.5842 0.6862
RIKA 0.5576 0.5186 0.5842 1.0000 0.1582
EIGO 0.5876 0.6317 0.6862 0.1582 1.0000
Eigenvalues of the Correlation Matrix
Eigenvalue Difference Proportion Cumulative
PRIN1 3.36123 2.51380 0.672246 0.67225
PRIN2 0.84743 0.50592 0.169486 0.84173
PRIN3 0.34151 0.05968 0.068302 0.91003
PRIN4 0.28183 0.11383 0.056366 0.96640
PRIN5 0.16800 . 0.033600 1.00000
Eigenvectors
PRIN1 PRIN2 PRIN3 PRIN4 PRIN5
KOKU 0.469962 0.057099 -.801071 -.340151 -.135862
SYA 0.475925 -.059026 0.569510 -.663253 -.075929
SUU 0.497565 -.028416 0.156983 0.567233 -.636572
RIKA 0.366295 0.763815 0.094762 0.222034 0.473430
EIGO 0.413386 -.639559 -.017882 0.270815 0.588571
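
The correlation matrix analyzed in this lesson is the covariance matrix of Lesson 14-2 with each entry divided by the product of the two standard deviations. The Python sketch below is an illustrative check, not part of the SAS program; the function name cov_to_corr is hypothetical and the covariances are copied from the Lesson 14-2 output.

```python
import math

NAMES = ["KOKU", "SYA", "SUU", "RIKA", "EIGO"]
# Covariance matrix copied from the SAS output of Lesson 14-2
COV = [
    [250.8458498, 202.5019763, 175.7905138, 163.2312253, 47.5296443],
    [202.5019763, 377.0553360, 224.1798419, 186.1561265, 62.6482213],
    [175.7905138, 224.1798419, 250.1185771, 170.7687747, 55.4249012],
    [163.2312253, 186.1561265, 170.7687747, 341.6758893, 14.9328063],
    [47.5296443, 62.6482213, 55.4249012, 14.9328063, 26.0869565],
]

def cov_to_corr(cov):
    """Divide each covariance by the product of the two standard deviations."""
    n = len(cov)
    sd = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(n)] for i in range(n)]

corr = cov_to_corr(COV)
for name, row in zip(NAMES, corr):
    print(f"{name:5s}", " ".join(f"{r:7.4f}" for r in row))
```

Every diagonal entry becomes 1, so the total variance is 5 (the number of variables). This is why the eigenvalues in this lesson sum to 5 and each Proportion is the eigenvalue divided by 5.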
OBS KOKU SYA SUU RIKA EIGO PRIN1 PRIN2 PRIN3 PRIN4 PRIN5
1 29 33 55 79 84 -3.04820 1.87656 0.82028 1.06560 0.46046
2 71 68 72 64 97 0.34567 -0.35664 -0.23117 0.08693 0.39293
3 74 91 79 76 100 1.69924 -0.30811 0.41216 -0.20883 0.64856
4 52 56 58 60 85 -2.00319 0.97383 0.26041 -0.28158 -0.31893
5 77 92 96 88 98 2.42354 0.41543 0.52702 0.34041 0.01153
6 60 85 66 66 88 -0.44163 0.77242 0.80599 -0.92587 -0.32357
7 81 91 73 63 95 1.05589 -0.18318 -0.05059 -0.99563 -0.07912
8 61 84 72 78 92 0.31384 0.76328 0.83316 -0.34175 0.19854
9 70 75 81 67 96 0.74923 -0.14851 0.13292 0.17512 -0.02652
10 53 70 73 51 92 -0.77026 -0.34050 0.69869 0.01980 -0.40987
11 69 64 96 57 97 0.80464 -0.68406 -0.04499 1.04322 -0.71958
12 87 89 90 85 100 2.56039 0.09697 -0.14870 0.08293 0.33262
13 83 75 96 81 98 2.04620 0.19948 -0.31094 0.70814 -0.15275
14 76 61 67 57 86 -0.86386 0.77980 -0.73638 -0.62802 -0.86821
15 87 82 78 82 97 1.70903 0.39150 -0.47799 -0.30348 0.42046
16 77 80 78 70 94 0.88268 0.24132 -0.08188 -0.32361 -0.13900
17 38 43 45 12 96 -3.20712 -2.37462 0.17361 0.00355 0.41347
18 67 73 78 67 95 0.43587 -0.02263 0.19972 0.14725 0.01255
19 83 77 80 67 100 1.47628 -0.60680 -0.48987 0.00384 0.35534
20 47 61 56 21 95 -2.05540 -1.91954 0.40515 -0.35495 -0.06161
21 70 62 88 51 96 0.33377 -0.78272 -0.26090 0.67804 -0.66724
22 81 51 63 66 92 -0.42247 0.45600 -1.29713 -0.11106 0.21093
23 51 16 36 48 84 -4.02413 0.76070 -1.13857 0.11994 0.30900
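
With the correlation matrix, the scores are computed from standardized values: each variable is centered and divided by its standard deviation before the weights are applied. As a sketch under that assumption (Python, added for illustration; the helper name standardized_score is hypothetical), PRIN1 of OBS 1 can be recomputed from the statistics printed above:

```python
# Means, standard deviations and PRIN1 weights copied from the Lesson 14-3 output
# Order: KOKU, SYA, SUU, RIKA, EIGO
MEAN = (67.13043478, 68.65217391, 72.86956522, 63.30434783, 93.78260870)
STD = (15.83811383, 19.41791276, 15.81513759, 18.48447698, 5.10753918)
PRIN1_VEC = (0.469962, 0.475925, 0.497565, 0.366295, 0.413386)

def standardized_score(x, vec):
    """Score for correlation-based PCA: weights applied to z-scores."""
    return sum(w * (xi - m) / s for w, xi, m, s in zip(vec, x, MEAN, STD))

obs1 = (29, 33, 55, 79, 84)   # test scores of OBS 1
print(round(standardized_score(obs1, PRIN1_VEC), 4))
```

This explains why the scores here are on a much smaller scale than in Lesson 14-2, where the raw (centered) test scores were used.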
Plot of PRIN2*PRIN1. Legend: A = 1 obs, B = 2 obs, etc.
PRIN2 | |
2 + A |
| |
| A A A B | A A A
0 +----------------------------+--A-AAA---A-A---A-----------
| A | B A A
| |
-2 + A |
| A |
| |
-4 + |
---+------------+------------+------------+------------+--
-4 -2 0 2 4
PRIN1
Plot of PRIN3*PRIN2. Legend: A = 1 obs, B = 2 obs, etc.
PRIN3 | |
1 + |
| A | A B A
| A A A A A
0 +------------------------------A-----B-+A-A-----------------------
| A A A | A A
| | A
-1 + | A
| | A
| |
-2 + |
---+-----------+-----------+-----------+-----------+-----------+--
-3 -2 -1 0 1 2
PRIN2
Plot of PRIN3*PRIN1. Legend: A = 1 obs, B = 2 obs, etc.
PRIN3 | |
1 + |
| A A A | A A
| A B | A A
0 +----------------------------+----BAA---------A-----------
| | B AA A
| A |
-1 + A |
| A |
| |
-2 + |
---+------------+------------+------------+------------+--
-4 -2 0 2 4
PRIN1
/* Lesson 14-4 */
/* File Name = prin04.sas 10/09/97 */
data gakusei;
infile 'all.dat';
input seibetsu $ shintyou taijyuu kyoui shiokuri $ kodukai;
proc print data=gakusei(obs=10);
run;
proc princomp data=gakusei out=out_prin;
var shintyou taijyuu kyoui kodukai;
run;
proc print data=out_prin(obs=10);
run;
proc plot data=out_prin;
plot prin2*prin1/vref=0 href=0;
plot prin3*prin2/vref=0 href=0;
plot prin3*prin1/vref=0 href=0;
run;
OBS SEIBETSU SHINTYOU TAIJYUU KYOUI SHIOKURI KODUKAI
1 F 148.9 . . J 60000
2 F 156.0 . . G .
3 M 156.0 61 90 J 0
4 F 156.5 . . J 20000
5 F 157.0 43 . J 20000
6 F 160.0 . . J 43500
7 F 161.0 . . J 25000
8 F 161.0 . . .
9 F 162.0 . . J 0
10 F 162.0 . . J 30000
Principal Component Analysis
42 Observations
4 Variables
Simple Statistics
SHINTYOU TAIJYUU KYOUI KODUKAI
Mean 163.9214286 56.38095238 85.95238095 53833.33333
StD 8.7977696 9.76292734 6.99111764 49036.03288
Correlation Matrix
SHINTYOU TAIJYUU KYOUI KODUKAI
SHINTYOU 1.0000 0.7962 0.5647 0.2157
TAIJYUU 0.7962 1.0000 0.7754 0.1919
KYOUI 0.5647 0.7754 1.0000 -.0643
KODUKAI 0.2157 0.1919 -.0643 1.0000
Eigenvalues of the Correlation Matrix
Eigenvalue Difference Proportion Cumulative
PRIN1 2.45777 1.41479 0.614443 0.61444
PRIN2 1.04298 0.67066 0.260744 0.87519
PRIN3 0.37232 0.24539 0.093081 0.96827
PRIN4 0.12693 . 0.031732 1.00000
Eigenvectors
PRIN1 PRIN2 PRIN3 PRIN4
SHINTYOU 0.563145 0.098142 -.713891 0.404469
TAIJYUU 0.611881 -.012170 0.033325 -.790154
KYOUI 0.537423 -.333502 0.631891 0.447958
KODUKAI 0.140164 0.937548 0.299937 0.106750
OBS SEIBETSU SHINTYOU TAIJYUU KYOUI SHIOKURI KODUKAI PRIN1 PRIN2 PRIN3 PRIN4
1 F 148.9 . . J 60000 . . . .
2 F 156.0 . . G . . . . .
3 M 156.0 61 90 J 0 -0.060283 -1.31648 0.69511 -0.59586
4 F 156.5 . . J 20000 . . . .
5 F 157.0 43 . J 20000 . . . .
6 F 160.0 . . J 43500 . . . .
7 F 161.0 . . J 25000 . . . .
8 F 161.0 . . . . . . .
9 F 162.0 . . J 0 . . . .
10 F 162.0 . . J 30000 . . . .
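
In this listing, a score is printed only for OBS 3; every other row has at least one missing analysis variable, so its scores are missing as well. The Python sketch below mimics that behavior for illustration. It is not part of the SAS program, the helper name score_or_missing is hypothetical, and the means and standard deviations are those printed by SAS for the 42 observations it used.

```python
# Statistics and PRIN1 weights copied from the Lesson 14-4 output
# Order: SHINTYOU, TAIJYUU, KYOUI, KODUKAI
MEAN = (163.9214286, 56.38095238, 85.95238095, 53833.33333)
STD = (8.7977696, 9.76292734, 6.99111764, 49036.03288)
PRIN1_VEC = (0.563145, 0.611881, 0.537423, 0.140164)

def score_or_missing(x, vec):
    """Return None (shown as '.' in SAS) when any analysis variable is missing."""
    if any(v is None for v in x):
        return None
    return sum(w * (v - m) / s for w, v, m, s in zip(vec, x, MEAN, STD))

obs1 = (148.9, None, None, 60000)   # OBS 1: TAIJYUU and KYOUI are missing
obs3 = (156.0, 61, 90, 0)           # OBS 3: complete record
print(score_or_missing(obs1, PRIN1_VEC))
print(round(score_or_missing(obs3, PRIN1_VEC), 4))
```

The second value should match the PRIN1 entry for OBS 3 to the printed precision, and the first prints None because the record is incomplete.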
Plot of PRIN2*PRIN1. Legend: A = 1 obs, B = 2 obs, etc.
(NOTE: 232 obs had missing values.)
2 + | A
| | A A A
PRIN2 | A A A A | A A
| AA A A | A
| |
0 +---------------A--A-------+A----------------A--------------------
| A A B | A A
| A BA AA A | A B A
| A A A A A
| | A
-2 + |
---+-----------+-----------+-----------+-----------+-----------+--
-4 -2 0 2 4 6
PRIN1
Plot of PRIN3*PRIN2. Legend: A = 1 obs, B = 2 obs, etc.
(NOTE: 232 obs had missing values.)
PRIN3 | |
1 + A A A | A
| A A A| A A
| A |AA A B A A A
0 +-------------B---A--A-----+-A------------------------------------
| A CAABA | B A
| A AA | A A
-1 + | A
| |
| A |
-2 + |
---+-----------+-----------+-----------+-----------+-----------+--
-2 -1 0 1 2 3
PRIN2
Plot of PRIN3*PRIN1. Legend: A = 1 obs, B = 2 obs, etc.
(NOTE: 232 obs had missing values.)
PRIN3 | |
1 + A A | A A
| A A A A A
| A AAA |A A A AA
0 +---------------B--A-------+-----A--A-----------------------------
| A BA CA | AA B
| A A A | A A
-1 + | A
| |
| | A
-2 + |
---+-----------+-----------+-----------+-----------+-----------+--
-4 -2 0 2 4 6
PRIN1
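There is no hard-and-fast rule, but criteria such as the following are in common use. The number may also be increased or decreased slightly to make the results easier to interpret.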