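/* Lesson 11-1 */
/*    File Name = les1101.sas   07/01/04   */

data gakusei;
  infile 'all04a.prn' firstobs=2;
  input sex $ shintyou taijyuu kyoui jitaku $ kodukai carryer $ tsuuwa;
  if sex^='M' & sex^='F' then delete;

proc print data=gakusei(obs=10);
run;

proc reg data=gakusei;                                  /* regression analysis */
  model taijyuu=shintyou;                               /* specify the variables */
  output out=outreg1 predicted=pred1 residual=resid1;   /* save the results */
run;

proc print data=outreg1(obs=15);                        /* take a look at the saved results */
run;

proc plot data=outreg1;                                 /* draw scatter plots */
  plot taijyuu*shintyou/vaxis=20 to 100 by 20;          /* weight vs. height (vertical axis specified) */
  plot pred1*taijyuu;                                   /* predicted vs. observed values */
  plot resid1*pred1   /vref=0;                          /* residual vs. predicted (residual analysis; horizontal reference line at 0) */
  plot resid1*shintyou/vref=0;                          /* residual vs. explanatory variable (residual analysis) */
  plot resid1*taijyuu /vref=0;                          /* residual vs. response variable (residual analysis) */
run;

proc univariate data=outreg1 plot normal;               /* check the residuals with a normal probability plot */
  var resid1;
run;

[Supplement] Strictly speaking, it is more accurate to add the following line directly below proc plot.
It is a statement that excludes observations containing missing values from the analysis.
The only effect is that the "observations had missing values" note disappears; the resulting plots are the same (missing values cannot be plotted anyway).
Try running the program both with and without this line.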
where shintyou^=. and taijyuu^=.;
The SAS System                          14:49 Wednesday, June 30, 2004

Model: MODEL1
Dependent Variable: TAIJYUU

                      Analysis of Variance

                         Sum of          Mean
Source          DF      Squares        Square    F Value    Prob>F
Model            1  10789.17582   10789.17582    252.411    0.0001
Error          251  10728.86228      42.74447
C Total        252  21518.03810

    Root MSE     6.53793     R-square    0.5014
    Dep Mean    58.72530     Adj R-sq    0.4994
    C.V.        11.13307

                      Parameter Estimates

                 Parameter      Standard     T for H0:
Variable  DF      Estimate         Error   Parameter=0   Prob > |T|
INTERCEP   1    -78.584367    8.65241903        -9.082       0.0001
SHINTYOU   1      0.814033    0.05123749        15.887       0.0001

OBS  SEX  SHINTYOU  TAIJYUU  KYOUI  JITAKU  KODUKAI  CARRYER   TSUUWA    PRED1     RESID1
  1   F     145.0     38.0      .     J       10000                 .   39.4504   -1.4504
  2   F     148.0     42.0      .     J       50000                 .   41.8925    0.1075
  3   F     148.0     43.0     80     J       50000  DoCoMo      4000   41.8925    1.1075
  4   F     148.9       .       .     J       60000                 .   42.6251     .
  5   F     149.0     45.0      .     G       60000                 .   42.7065    2.2935
  6   F     150.0     46.0     86             40000                 .   43.5206    2.4794
  7   F     151.0     50.0      .     G       60000  J-PHONE        .   44.3346    5.6654
  8   F     151.7     41.5     80     J       35000                 .   44.9044   -3.4044
  9   F     152.0     35.0     77     J       60000  DoCoMo      2000   45.1486  -10.1486
 10   F     152.0     43.0      .     J       20000  au          3500   45.1486   -2.1486
 11   F     153.0     41.0      .     J      125000  No             .   45.9627   -4.9627
 12   F     153.0     42.0      .     G           0  Vodafone    1000   45.9627   -3.9627
 13   F     153.0     46.5     87     G       10000                 .   45.9627    0.5373
 14   F     153.0     50.0      .     G       70000  DoCoMo     10000   45.9627    4.0373
 15   F     153.0     55.0     78     J       30000                 .   45.9627    9.0373

Each of the following plots carries the legend "A = 1 OBS, B = 2 OBS, ..." and the note "(NOTE: 40 observations had missing values.)"

[Plot: TAIJYUU*SHINTYOU. Vertical axis TAIJYUU, 20 to 100; horizontal axis SHINTYOU, 140 to 190.]
[Plot: PRED1*TAIJYUU. Vertical axis PRED1, 40 to 80; horizontal axis TAIJYUU, 20 to 100.]
[Plot: RESID1*PRED1. Vertical axis Residual, -25 to 50, reference line at 0; horizontal axis Predicted Value of TAIJYUU, 30 to 80.]
[Plot: RESID1*SHINTYOU. Vertical axis Residual, -25 to 50, reference line at 0; horizontal axis SHINTYOU, 140 to 190.]
[Plot: RESID1*TAIJYUU. Vertical axis Residual, -25 to 50, reference line at 0; horizontal axis TAIJYUU, 20 to 100.]

Univariate Procedure
Variable=RESID1        Residual

                    Moments
    N               253    Sum Wgts        253
    Mean              0    Sum               0
    Std Dev    6.524941    Variance   42.57485
    Skewness   1.414355    Kurtosis    4.06384
    USS        10728.86    CSS        10728.86
    CV                .    Std Mean    0.41022
    T:Mean=0          0    Pr>|T|       1.0000
    Num ^= 0        253    Num > 0         110
    M(Sign)       -16.5    Pr>=|M|      0.0440
    Sgn Rank    -1902.5    Pr>=|S|      0.1026
    W:Normal   0.921391    Pr< W        0.0001

    Missing Value         .
    Count                40
    % Count/Nobs      13.65

                      Histogram                           #    Boxplot
   35+*                                                   1       *
     .*                                                   3       0
     .*****                                              13       0
     .*******************************                    93    +--+--+
     .***********************************************   139    *-----*
  -15+**                                                  4       |
      ----+----+----+----+----+----+----+----+----+--
      * may represent up to 3 counts

[Normal probability plot of RESID1. Vertical axis -15 to 35; horizontal axis -2 to +2.]
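As a reading aid for the listing above, the summary statistics can be recomputed by hand from the printed sums of squares and parameter estimates. The numbers below are copied from the Lesson 11-1 output; the symbols (SS, MS, the hatted y, and e) are notation introduced here for illustration and do not appear in the SAS listing.

```latex
\begin{align*}
R^2 &= \frac{SS_{\mathrm{Model}}}{SS_{\mathrm{Total}}}
     = \frac{10789.17582}{21518.03810} \approx 0.5014, \\
F &= \frac{MS_{\mathrm{Model}}}{MS_{\mathrm{Error}}}
   = \frac{10789.17582}{42.74447} \approx 252.41, \\
\text{Root MSE} &= \sqrt{MS_{\mathrm{Error}}} = \sqrt{42.74447} \approx 6.538, \\
\hat{y} &= -78.584367 + 0.814033\,x
  \qquad (x = \text{SHINTYOU},\ \hat{y} = \text{PRED1}), \\
\text{OBS 1:}\quad \hat{y}_1 &= -78.584367 + 0.814033 \times 145.0 \approx 39.4504,
  \qquad e_1 = 38.0 - 39.4504 = -1.4504 .
\end{align*}
```

The last line reproduces the PRED1 and RESID1 values printed for observation 1. Observation 4 has a predicted value but no residual because its TAIJYUU is missing while its SHINTYOU is not.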
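[Note] Be aware that the error is measured perpendicular to the axis of the explanatory variable, that is, in the vertical direction along the axis of the response variable.
This is because the model is constructed on the assumption that the error enters at the time of measurement.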
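In formulas, the note corresponds to the usual least-squares setup sketched below. The notation is an assumption made here for illustration: x stands for the explanatory variable (shintyou), y for the response (taijyuu), and b_0, b_1 for candidate coefficients.

```latex
\begin{align*}
y_i &= \beta_0 + \beta_1 x_i + \varepsilon_i, \qquad i = 1, \dots, n, \\
(\hat\beta_0, \hat\beta_1)
  &= \arg\min_{b_0,\, b_1} \sum_{i=1}^{n} \bigl( y_i - b_0 - b_1 x_i \bigr)^2 .
\end{align*}
```

Each term in the sum is a squared vertical distance between an observed point and the line, so the whole discrepancy is attributed to y and none to x. Swapping the roles of the two variables would therefore generally give a different fitted line.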
/* Lesson 11-2 */
/*    File Name = les1102.sas   07/01/04   */

data gakusei;
  infile 'all04a.prn' firstobs=2;
  input sex $ shintyou taijyuu kyoui jitaku $ kodukai carryer $ tsuuwa;
  if sex^='M' & sex^='F' then delete;

proc print data=gakusei(obs=10);
run;

proc reg data=gakusei;                                  /* regression analysis */
  model taijyuu=shintyou kyoui;                         /* specify several explanatory variables */
  output out=outreg1 predicted=pred1 residual=resid1;   /* save the results */
run;

proc print data=outreg1(obs=15);
run;

proc plot data=outreg1;                                 /* draw scatter plots */
  where shintyou^=. and taijyuu^=. and kyoui^=.;        /* only the data used in the analysis */
  plot taijyuu*shintyou;
  plot taijyuu*kyoui;
  plot taijyuu*pred1;                                   /* observed vs. predicted values */
  plot resid1*pred1   /vref=0;                          /* residual vs. predicted (residual analysis) */
  plot resid1*shintyou/vref=0;                          /* residual vs. explanatory variable (residual analysis) */
  plot resid1*kyoui   /vref=0;                          /* residual vs. explanatory variable (residual analysis) */
  plot resid1*taijyuu /vref=0;                          /* residual vs. response variable (residual analysis) */
run;

proc univariate data=outreg1 plot normal;               /* check the residuals with a normal probability plot */
  var resid1;
run;
The SAS System                          19:47 Wednesday, June 23, 2004

Model: MODEL1
Dependent Variable: TAIJYUU

                      Analysis of Variance

                         Sum of          Mean
Source          DF      Squares        Square    F Value    Prob>F
Model            2   7682.00845    3841.00423    102.149    0.0001
Error           90   3384.18983      37.60211
C Total         92  11066.19828

    Root MSE     6.13206     R-square    0.6942
    Dep Mean    59.19570     Adj R-sq    0.6874
    C.V.        10.35896

                      Parameter Estimates

                 Parameter       Standard     T for H0:
Variable  DF      Estimate          Error   Parameter=0   Prob > |T|
INTERCEP   1   -109.642478    12.60968451        -8.695       0.0001
SHINTYOU   1      0.672459     0.08035699         8.368       0.0001
KYOUI      1      0.646057     0.08814219         7.330       0.0001

OBS  SEX  SHINTYOU  TAIJYUU  KYOUI  JITAKU  KODUKAI  CARRYER   TSUUWA    PRED1     RESID1
  1   F     145.0     38.0      .     J       10000                 .      .         .
  2   F     148.0     42.0      .     J       50000                 .      .         .
  3   F     148.0     43.0     80     J       50000  DoCoMo      4000   41.5660    1.4340
  4   F     148.9       .       .     J       60000                 .      .         .
  5   F     149.0     45.0      .     G       60000                 .      .         .
  6   F     150.0     46.0     86             40000                 .   46.7873   -0.7873
  7   F     151.0     50.0      .     G       60000  J-PHONE        .      .         .
  8   F     151.7     41.5     80     J       35000                 .   44.0541   -2.5541
  9   F     152.0     35.0     77     J       60000  DoCoMo      2000   42.3177   -7.3177
 10   F     152.0     43.0      .     J       20000  au          3500      .         .
 11   F     153.0     41.0      .     J      125000  No             .      .         .
 12   F     153.0     42.0      .     G           0  Vodafone    1000      .         .
 13   F     153.0     46.5     87     G       10000                 .   49.4507   -2.9507
 14   F     153.0     50.0      .     G       70000  DoCoMo     10000      .         .
 15   F     153.0     55.0     78     J       30000                 .   43.6362   11.3638

Each of the following plots carries the legend "A = 1 OBS, B = 2 OBS, ...". No missing-value note is printed.

[Plot: TAIJYUU*SHINTYOU. Vertical axis TAIJYUU, 0 to 100; horizontal axis SHINTYOU, 140 to 190.]
[Plot: TAIJYUU*KYOUI. Vertical axis TAIJYUU, 0 to 100; horizontal axis KYOUI, 50 to 120.]
[Plot: TAIJYUU*PRED1. Vertical axis TAIJYUU, 0 to 100; horizontal axis Predicted Value of TAIJYUU, 30 to 90.]
[Plot: RESID1*PRED1. Vertical axis Residual, -20 to 40, reference line at 0; horizontal axis Predicted Value of TAIJYUU, 30 to 90.]
[Plot: RESID1*SHINTYOU. Vertical axis Residual, -20 to 40, reference line at 0; horizontal axis SHINTYOU, 140 to 190.]
[Plot: RESID1*KYOUI. Vertical axis Residual, -20 to 40, reference line at 0; horizontal axis KYOUI, 50 to 120.]
[Plot: RESID1*TAIJYUU. Vertical axis Residual, -20 to 40, reference line at 0; horizontal axis TAIJYUU, 20 to 100.]

Univariate Procedure
Variable=RESID1        Residual

    Stem Leaf                                   #    Boxplot
       2 4                                      1       *
       1 8                                      1       0
       1 01134                                  5       0
       0 5556777778888                         13       |
       0 0000111111233444                      16    +--+--+
      -0 44433333333333332222222221111111000   35    *-----*
      -0 998777776666666555555                 21       |
      -1 0                                      1       |
         ----+----+----+----+----+----+----+
     Multiply Stem.Leaf by 10**+1

[Normal probability plot of RESID1. Vertical axis -12.5 to 22.5; horizontal axis -2 to +2.]
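The two-variable fit can be checked against the listing in the same way. The values below are copied from the Lesson 11-2 output; the symbols n (number of observations used), p (number of explanatory variables), and the hatted y are notation introduced here for illustration.

```latex
\begin{align*}
n &= DF_{\mathrm{Total}} + 1 = 92 + 1 = 93, \qquad p = 2, \\
R^2 &= \frac{7682.00845}{11066.19828} \approx 0.6942, \\
R^2_{\mathrm{adj}} &= 1 - \frac{SS_{\mathrm{Error}} / (n - p - 1)}{SS_{\mathrm{Total}} / (n - 1)}
  = 1 - \frac{3384.18983 / 90}{11066.19828 / 92} \approx 0.6874, \\
\hat{y} &= -109.642478 + 0.672459\,\text{SHINTYOU} + 0.646057\,\text{KYOUI}, \\
\text{OBS 3:}\quad \hat{y}_3 &= -109.642478 + 0.672459 \times 148.0 + 0.646057 \times 80 \approx 41.5660,
  \qquad e_3 = 43.0 - 41.5660 = 1.4340 .
\end{align*}
```

Only 93 observations enter this fit, compared with 253 in Lesson 11-1, because an observation is dropped from the regression when any model variable is missing. That is also why PRED1 and RESID1 are missing for every listed observation whose KYOUI is missing.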
/* Lesson 11-3 */
/*    File Name = les1103.sas   07/01/04   */

data gakusei;
  infile 'all04a.prn' firstobs=2;
  input sex $ shintyou taijyuu kyoui jitaku $ kodukai carryer $ tsuuwa;
  if sex^='M' & sex^='F' then delete;                   /* exclude observations of unknown sex */
  if shintyou=. | taijyuu=. | kyoui=. then delete;      /* exclude observations with missing values */

proc print data=gakusei(obs=10);
run;

proc corr data=gakusei;                                 /* correlation coefficients */
  where sex='M';                                        /* males only */
run;

proc reg data=gakusei;                                  /* regression analysis */
  model taijyuu=shintyou kyoui;
  where sex='M';                                        /* males only */
  output out=outreg1 predicted=pred1 residual=resid1;
run;

proc print data=outreg1(obs=15);
run;

proc plot data=outreg1;
  where sex='M';                                        /* restrict to the data analyzed */
  plot taijyuu*shintyou;
  plot taijyuu*kyoui;
  plot taijyuu*pred1;
  plot resid1*(pred1 shintyou kyoui taijyuu)/vref=0;    /* the four residual plots written as one statement */
/*
  plot resid1*pred1   /vref=0;
  plot resid1*shintyou/vref=0;
  plot resid1*kyoui   /vref=0;
  plot resid1*taijyuu /vref=0;
*/
run;

proc univariate data=outreg1 plot normal;
  var resid1;
run;
The SAS System                          14:53 Tuesday, June 29, 2004

Correlation Analysis

5 'VAR' Variables:  SHINTYOU TAIJYUU KYOUI KODUKAI TSUUWA

                          Simple Statistics

Variable     N        Mean     Std Dev         Sum     Minimum     Maximum
SHINTYOU    61       172.3      6.2101     10513.1       156.0       185.0
TAIJYUU     61     64.6344      9.2524      3942.7     46.0000       100.0
KYOUI       61     88.7049      8.6146      5411.0     56.0000       112.0
KODUKAI     57     54491.2     57395.6     3106000           0      300000
TSUUWA       5      8200.0      3271.1     41000.0      5000.0     13000.0

Pearson Correlation Coefficients / Prob > |R| under Ho: Rho=0 / Number of Observations

             SHINTYOU     TAIJYUU       KYOUI     KODUKAI      TSUUWA
SHINTYOU      1.00000     0.42019     0.22042     0.11293    -0.19869
               0.0         0.0007      0.0878      0.4029      0.7487
                61          61          61          57           5

TAIJYUU       0.42019     1.00000     0.66894    -0.08201     0.17683
               0.0007      0.0         0.0001      0.5442      0.7760
                61          61          61          57           5

KYOUI         0.22042     0.66894     1.00000    -0.11888     0.14486
               0.0878      0.0001      0.0         0.3785      0.8162
                61          61          61          57           5

KODUKAI       0.11293    -0.08201    -0.11888     1.00000    -0.58004
               0.4029      0.5442      0.3785      0.0         0.3053
                57          57          57          57           5

TSUUWA       -0.19869     0.17683     0.14486    -0.58004     1.00000
               0.7487      0.7760      0.8162      0.3053      0.0
                 5           5           5           5           5

Model: MODEL1
Dependent Variable: TAIJYUU

                      Analysis of Variance

                         Sum of          Mean
Source          DF      Squares        Square    F Value    Prob>F
Model            2   2700.06291    1350.03146     32.138    0.0001
Error           58   2436.39479      42.00681
C Total         60   5136.45770

    Root MSE     6.48127     R-square    0.5257
    Dep Mean    64.63443     Adj R-sq    0.5093
    C.V.        10.02758

                      Parameter Estimates

                 Parameter       Standard     T for H0:
Variable  DF      Estimate          Error   Parameter=0   Prob > |T|
INTERCEP   1    -66.687175    23.51117249        -2.836       0.0063
SHINTYOU   1      0.427106     0.13813439         3.092       0.0031
KYOUI      1      0.650603     0.09957815         6.534       0.0001

Each of the following plots carries the legend "A = 1 OBS, B = 2 OBS, ...".

[Plot: TAIJYUU*SHINTYOU. Vertical axis TAIJYUU, 25 to 100; horizontal axis SHINTYOU, 155 to 185.]
[Plot: TAIJYUU*KYOUI. Vertical axis TAIJYUU, 25 to 100; horizontal axis KYOUI, 50 to 120.]
[Plot: TAIJYUU*PRED1. Vertical axis TAIJYUU, 25 to 100; horizontal axis Predicted Value of TAIJYUU, 40 to 90.]
[Plot: RESID1*PRED1. Vertical axis Residual, -20 to 40, reference line at 0; horizontal axis Predicted Value of TAIJYUU, 40 to 90.]
[Plot: RESID1*SHINTYOU. Vertical axis Residual, -20 to 40, reference line at 0; horizontal axis SHINTYOU, 155 to 185.]
[Plot: RESID1*KYOUI. Vertical axis Residual, -20 to 40, reference line at 0; horizontal axis KYOUI, 50 to 120.]
[Plot: RESID1*TAIJYUU. Vertical axis Residual, -20 to 40, reference line at 0; horizontal axis TAIJYUU, 40 to 100.]

Univariate Procedure
Variable=RESID1        Residual

                    Moments
    N                61    Sum Wgts         61
    Mean              0    Sum               0
    Std Dev    6.372329    Variance   40.60658
    Skewness   1.224565    Kurtosis   1.785444
    USS        2436.395    CSS        2436.395
    CV                .    Std Mean   0.815893
    T:Mean=0          0    Pr>|T|       1.0000
    Num ^= 0         61    Num > 0          24
    M(Sign)        -6.5    Pr>=|M|      0.1237
    Sgn Rank     -115.5    Pr>=|S|      0.4113
    W:Normal   0.909005    Pr< W        0.0001

    Stem Leaf                    #    Boxplot
       2 2                       1       0
       1 8                       1       0
       1 024                     3       |
       0 5555566777             10       |
       0 001123444               9    +--+--+
      -0 44443333322211111100   20    *-----*
      -0 99888766655555555      17    +-----+
         ----+----+----+----+
     Multiply Stem.Leaf by 10**+1

[Normal probability plot of RESID1. Vertical axis -7.5 to 22.5; horizontal axis -2 to +2.]
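For the males-only fit, the "T for H0: Parameter=0" column is each estimate divided by its standard error, and R-square again follows from the sums of squares. The values below are copied from the Lesson 11-3 output; the symbol t is notation introduced here for illustration.

```latex
\begin{align*}
t_{\mathrm{INTERCEP}} &= \frac{-66.687175}{23.51117249} \approx -2.836, \\
t_{\mathrm{SHINTYOU}} &= \frac{0.427106}{0.13813439} \approx 3.092, \\
t_{\mathrm{KYOUI}} &= \frac{0.650603}{0.09957815} \approx 6.534, \\
R^2 &= \frac{2700.06291}{5136.45770} \approx 0.5257 .
\end{align*}
```

These reproduce the printed values. The ordering of the two t values matches the correlation table, where TAIJYUU correlates more strongly with KYOUI (0.66894) than with SHINTYOU (0.42019).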
where sex='M' and taijyuu<85;