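So far we have covered several measures for characterizing a distribution, along with how to use them and what to watch out for.
Building on that, tabulating the data by group, with the nature of the data in mind, can reveal features that were not visible before.
This week we first pick up from Section 3, which was left unfinished last week, and then introduce group-wise tabulation.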
 /* Lesson 7-01 */
 /*    File Name = les0701.sas   11/14/07   */

data gakusei;
  infile 'all07be.prn' firstobs=2;
  input sex $ shintyou taijyuu kyoui jitaku $ kodukai carryer $ tsuuwa;

proc print data=gakusei(obs=5);
run;
proc means data=gakusei;
run;
proc univariate data=gakusei plot;
  var shintyou taijyuu kyoui kodukai;
run;
proc chart data=gakusei;                           /* histogram                                     */
  hbar shintyou taijyuu kyoui kodukai;             /* horizontal bar chart of the listed variables  */
run;

proc sort data=gakusei;                            /* sort                                          */
  by sex;                                          /* by sex                                        */
run;

proc means data=gakusei;                           /* compute means                                 */
  by sex;                                          /* by sex                                        */
run;
proc univariate data=gakusei plot;                 /* compute basic statistics                      */
  var shintyou taijyuu kyoui kodukai;              /* for the listed variables                      */
  by sex;                                          /* by sex                                        */
run;
proc chart data=gakusei;                           /* histogram                                     */
  hbar shintyou taijyuu kyoui kodukai;             /* horizontal bar chart of the listed variables  */
  by sex;                                          /* by sex                                        */
run;

proc chart data=gakusei;                           /* histogram                                     */
  hbar shintyou taijyuu kyoui kodukai / group=sex; /* groups placed side by side, by sex            */
run;
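As a side note, the sort-then-BY pattern in the program above is not the only way to get per-group summaries. The following is a minimal sketch, assuming the data set gakusei from les0701.sas has already been created. It uses the CLASS statement of PROC MEANS, which does not need sorted input. The MISSING option is an assumption about what is wanted here: it keeps observations whose SEX is missing as a group of their own instead of dropping them.

```sas
/* Sketch only: assumes the data set gakusei created by les0701.sas exists. */
proc means data=gakusei missing;               /* MISSING: keep rows whose class value is missing */
  class sex;                                   /* group by sex, no prior PROC SORT needed          */
  var shintyou taijyuu kyoui kodukai tsuuwa;   /* numeric variables to summarize                   */
run;
```

With CLASS the groups are printed in a single table, whereas BY produces one table per group as in the lesson's output.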
The SAS System   2   17:34 Saturday, November 10, 2007

Variable    N          Mean       Std Dev       Minimum       Maximum
---------------------------------------------------------------------
SHINTYOU  366   167.8090164     8.1781500   145.0000000   186.0000000
TAIJYUU   330    58.6963636     9.3175123    35.0000000   100.0000000
KYOUI     115    86.1217391     8.3457604    46.0000000   112.0000000
KODUKAI   353      48257.79      49720.96             0     350000.00
TSUUWA    153       6501.84       4410.93             0      30000.00
---------------------------------------------------------------------

The SAS System   3   17:34 Saturday, November 10, 2007

Univariate Procedure
Variable=SHINTYOU

                 Moments

N                366  Sum Wgts         366
Mean         167.809  Sum          61418.1
Std Dev      8.17815  Variance    66.88214
Skewness    -0.33766  Kurtosis    -0.42295
USS         10330923  CSS         24411.98
CV          4.873487  Std Mean    0.427479
T:Mean=0    392.5552  Pr>|T|        0.0001
Num ^= 0         366  Num > 0          366
M(Sign)          183  Pr>=|M|       0.0001
Sgn Rank     33580.5  Pr>=|S|       0.0001

The SAS System   4   17:34 Saturday, November 10, 2007

Univariate Procedure
Variable=SHINTYOU

           Quantiles(Def=5)

100% Max       186       99%       184
 75% Q3      173.5       95%       180
 50% Med       169       90%       178
 25% Q1        162       10%       156
  0% Min       145        5%       153
                          1%       148

Range           41
Q3-Q1         11.5
Mode           170

The SAS System   7   17:34 Saturday, November 10, 2007

Univariate Procedure
Variable=SHINTYOU

                Histogram                     #  Boxplot
 187.5+*                                      3     |
      .*******                               21     |
      .******************                    54     |
      .**********************************   100  +-----+
 167.5+************************              71  *--+--*
      .*******************                   55  +-----+
      .*************                         38     |
      .******                                18     |
 147.5+**                                     6     |
       ----+----+----+----+----+----+----
       * may represent up to 3 counts

The SAS System   21   17:34 Saturday, November 10, 2007

Univariate Procedure
Variable=KODUKAI

                 Moments

N                353  Sum Wgts         353
Mean        48257.79  Sum         17035000
Std Dev     49720.96  Variance    2.4722E9
Skewness    2.089807  Kurtosis    6.869491
USS         1.692E12  CSS         8.702E11
CV           103.032  Std Mean    2646.379
T:Mean=0     18.2354  Pr>|T|        0.0001
Num ^= 0         298  Num > 0          298
M(Sign)          149  Pr>=|M|       0.0001
Sgn Rank     22275.5  Pr>=|S|       0.0001

The SAS System   22   17:34 Saturday, November 10, 2007

Univariate Procedure
Variable=KODUKAI

           Quantiles(Def=5)

100% Max    350000       99%    200000
 75% Q3      60000       95%    150000
 50% Med     30000       90%    120000
 25% Q1      20000       10%         0
  0% Min         0        5%         0
                          1%         0

Range       350000
Q3-Q1        40000
Mode             0

The SAS System   25   17:34 Saturday, November 10, 2007

Univariate Procedure
Variable=KODUKAI

                     Histogram                         #  Boxplot
375000+*                                               1     *
      .*                                               2     *
      .
      .*                                               2     *
      .****                                           18     0
      .********                                       38     0
      .****************                               78  +-----+
 25000+*******************************************   214  *--+--*
       ----+----+----+----+----+----+----+----+---
       * may represent up to 5 counts

The SAS System   27   17:34 Saturday, November 10, 2007

SHINTYOU                              Cum.              Cum.
Midpoint                      Freq    Freq  Percent  Percent
         |
     146 |                       2       2     0.55     0.55
     150 |**                     9      11     2.46     3.01
     154 |***                   17      28     4.64     7.65
     158 |*******               34      62     9.29    16.94
     162 |**********            48     110    13.11    30.05
     166 |**********            48     158    13.11    43.17
     170 |***************       77     235    21.04    64.21
     174 |**************        71     306    19.40    83.61
     178 |*******               36     342     9.84    93.44
     182 |****                  19     361     5.19    98.63
     186 |*                      5     366     1.37   100.00
         |
          ----+---+---+---
             20  40  60

The SAS System   31   17:34 Saturday, November 10, 2007

KODUKAI                                          Cum.              Cum.
Midpoint                                 Freq    Freq  Percent  Percent
         |
   15000 |***************************     135     135    38.24    38.24
   45000 |**********************          111     246    31.44    69.69
   75000 |*********                        45     291    12.75    82.44
  105000 |*****                            25     316     7.08    89.52
  135000 |***                              14     330     3.97    93.48
  165000 |***                              17     347     4.82    98.30
  195000 |*                                 3     350     0.85    99.15
  225000 |                                  0     350     0.00    99.15
  255000 |                                  0     350     0.00    99.15
  285000 |                                  0     350     0.00    99.15
  315000 |                                  2     352     0.57    99.72
  345000 |                                  1     353     0.28   100.00
         |
          ----+---+---+---+---+---+---
             20  40  60  80 100 120
                   Frequency

The SAS System   34   17:34 Saturday, November 10, 2007

--------------------------------- SEX=F --------------------------------

Variable    N          Mean       Std Dev       Minimum       Maximum
---------------------------------------------------------------------
SHINTYOU  121   159.0223140     5.3386560   145.0000000   171.0000000
TAIJYUU    85    48.6941176     4.6722675    35.0000000    60.0000000
KYOUI      44    82.9318182     3.8843618    70.0000000    90.0000000
KODUKAI   117      47846.15      44343.95             0     300000.00
TSUUWA     62       6640.06       4331.96    80.0000000      25000.00
---------------------------------------------------------------------

The SAS System   35   17:34 Saturday, November 10, 2007

--------------------------------- SEX=M --------------------------------

Variable    N          Mean       Std Dev       Minimum       Maximum
---------------------------------------------------------------------
SHINTYOU  244   172.1655738     5.3743909   156.0000000   186.0000000
TAIJYUU   244    62.1754098     7.9271146    46.0000000   100.0000000
KYOUI      71    88.0985915     9.6852685    46.0000000   112.0000000
KODUKAI   234      48350.43      52359.24             0     350000.00
TSUUWA     90       6367.76       4494.19             0      30000.00
---------------------------------------------------------------------

The SAS System   54   17:34 Saturday, November 10, 2007

-------------------------------- SEX=F ---------------------------------

Univariate Procedure
Variable=SHINTYOU

                 Moments

N                121  Sum Wgts         121
Mean        159.0223  Sum          19241.7
Std Dev     5.338656  Variance    28.50125
Skewness    -0.22403  Kurtosis    -0.31354
USS          3063280  CSS          3420.15
CV          3.357174  Std Mean    0.485332
T:Mean=0    327.6565  Pr>|T|        0.0001
Num ^= 0         121  Num > 0          121
M(Sign)         60.5  Pr>=|M|       0.0001
Sgn Rank      3690.5  Pr>=|S|       0.0001

The SAS System   56   17:34 Saturday, November 10, 2007

-------------------------------- SEX=F ---------------------------------

Univariate Procedure
Variable=SHINTYOU

           Quantiles(Def=5)

100% Max       171       99%       170
 75% Q3      162.4       95%       167
 50% Med       160       90%       166
 25% Q1        156       10%       152
  0% Min       145        5%       150
                          1%     146.7

Range           26
Q3-Q1          6.4
Mode           156

The SAS System   59   17:34 Saturday, November 10, 2007

-------------------------------- SEX=F ---------------------------------

Univariate Procedure
Variable=SHINTYOU

 Stem Leaf                                         #  Boxplot
   17 001                                          3     |
   16 555555666666677778                          18     |
   16 00000000000000011111222222222222333344444   41  +-----+
   15 55556666666666666677778888889999999         35  +--+--+
   15 001122223333333444                          18     |
   14 578899                                       6     0
      ----+----+----+----+----+----+----+----+-
      Multiply Stem.Leaf by 10**+1

The SAS System   82   17:34 Saturday, November 10, 2007

-------------------------------- SEX=M ---------------------------------

Univariate Procedure
Variable=SHINTYOU

                 Moments

N                244  Sum Wgts         244
Mean        172.1656  Sum          42008.4
Std Dev     5.374391  Variance    28.88408
Skewness     -0.0719  Kurtosis    0.175064
USS          7239419  CSS         7018.831
CV          3.121641  Std Mean     0.34406
T:Mean=0    500.3939  Pr>|T|        0.0001
Num ^= 0         244  Num > 0          244
M(Sign)          122  Pr>=|M|       0.0001
Sgn Rank       14945  Pr>=|S|       0.0001

The SAS System   84   17:34 Saturday, November 10, 2007

-------------------------------- SEX=M ---------------------------------

Univariate Procedure
Variable=SHINTYOU

           Quantiles(Def=5)

100% Max       186       99%       185
 75% Q3        175       95%       181
 50% Med       172       90%     179.9
 25% Q1        169       10%       166
  0% Min       156        5%       163
                          1%       160

Range           30
Q3-Q1            6
Mode           170

The SAS System   87   17:34 Saturday, November 10, 2007

-------------------------------- SEX=M ---------------------------------

Univariate Procedure
Variable=SHINTYOU

                Histogram                     #  Boxplot
 187.5+*                                      3     0
      .*******                               21     |
      .******************                    54  +-----+
 172.5+*********************************     98  *--+--*
      .*****************                     51  +-----+
      .*****                                 15     |
 157.5+*                                      2     0
       ----+----+----+----+----+----+---
       * may represent up to 3 counts

The SAS System   110   17:34 Saturday, November 10, 2007

Univariate Procedure
Schematic Plots
Variable=SHINTYOU
[Side-by-side box plots by SEX (F, M); vertical axis from 140 to 200]

The SAS System   111   17:34 Saturday, November 10, 2007

Univariate Procedure
Schematic Plots
Variable=TAIJYUU
[Side-by-side box plots by SEX (F, M); vertical axis from 0 to 100]

The SAS System   112   17:34 Saturday, November 10, 2007

Univariate Procedure
Schematic Plots
Variable=KYOUI
[Side-by-side box plots by SEX (F, M); vertical axis from 0 to 150]

The SAS System   113   17:34 Saturday, November 10, 2007

Univariate Procedure
Schematic Plots
Variable=KODUKAI
[Side-by-side box plots by SEX (F, M); vertical axis from 0 to 400000]

The SAS System   117   17:34 Saturday, November 10, 2007

-------------------------------- SEX=F ---------------------------------

SHINTYOU                                          Cum.              Cum.
Midpoint                                  Freq    Freq  Percent  Percent
         |
     144 |*                                  1       1     0.83     0.83
     147 |***                                3       4     2.48     3.31
     150 |******                             6      10     4.96     8.26
     153 |**************                    14      24    11.57    19.83
     156 |**********************            22      46    18.18    38.02
     159 |****************************      28      74    23.14    61.16
     162 |*********************             21      95    17.36    78.51
     165 |******************                18     113    14.88    93.39
     168 |*****                              5     118     4.13    97.52
     171 |***                                3     121     2.48   100.00
         |
          -----+----+----+----+----+---
               5   10   15   20   25
                    Frequency

The SAS System   122   17:34 Saturday, November 10, 2007

-------------------------------- SEX=M ---------------------------------

SHINTYOU                                          Cum.              Cum.
Midpoint                                  Freq    Freq  Percent  Percent
         |
     156 |*                                  2       2     0.82     0.82
     159 |**                                 5       7     2.05     2.87
     162 |***                                8      15     3.28     6.15
     165 |*****                             13      28     5.33    11.48
     168 |****************                  40      68    16.39    27.87
     171 |**************************        65     133    26.64    54.51
     174 |********************              51     184    20.90    75.41
     177 |***********                       28     212    11.48    86.89
     180 |********                          21     233     8.61    95.49
     183 |***                                8     241     3.28    98.77
     186 |*                                  3     244     1.23   100.00
         |

The SAS System   128   17:34 Saturday, November 10, 2007

SEX SHINTYOU                          Cum.              Cum.
    Midpoint                  Freq    Freq  Percent  Percent
           |
       146 |                     0       0     0.00     0.00
       150 |                     0       0     0.00     0.00
       154 |                     0       0     0.00     0.00
       158 |                     0       0     0.00     0.00
       162 |                     0       0     0.00     0.00
       166 |                     0       0     0.00     0.00
       170 |                     1       1     0.27     0.27
       174 |                     0       1     0.00     0.27
       178 |                     0       1     0.00     0.27
       182 |                     0       1     0.00     0.27
       186 |                     0       1     0.00     0.27
           |
F      146 |                     2       3     0.55     0.82
       150 |**                   9      12     2.46     3.28
       154 |***                 17      29     4.64     7.92
       158 |******              32      61     8.74    16.67
       162 |*******             35      96     9.56    26.23
       166 |****                22     118     6.01    32.24
       170 |*                    4     122     1.09    33.33
       174 |                     0     122     0.00    33.33
       178 |                     0     122     0.00    33.33
       182 |                     0     122     0.00    33.33
       186 |                     0     122     0.00    33.33
           |
M      146 |                     0     122     0.00    33.33
       150 |                     0     122     0.00    33.33
       154 |                     0     122     0.00    33.33
       158 |                     2     124     0.55    33.88
       162 |***                 13     137     3.55    37.43
       166 |*****               26     163     7.10    44.54
       170 |**************      72     235    19.67    64.21
       174 |**************      71     306    19.40    83.61
       178 |*******             36     342     9.84    93.44
       182 |****                19     361     5.19    98.63
       186 |*                    5     366     1.37   100.00
           |
            ----+---+---+--
               20  40  60
              Frequency

The SAS System   137   17:34 Saturday, November 10, 2007

SEX KODUKAI                                Cum.              Cum.
    Midpoint                       Freq    Freq  Percent  Percent
           |
     15000 |                          1       1     0.28     0.28
     45000 |                          0       1     0.00     0.28
     75000 |                          0       1     0.00     0.28
    105000 |                          1       2     0.28     0.57
    135000 |                          0       2     0.00     0.57
    165000 |                          0       2     0.00     0.57
    195000 |                          0       2     0.00     0.57
    225000 |                          0       2     0.00     0.57
    255000 |                          0       2     0.00     0.57
    285000 |                          0       2     0.00     0.57
    315000 |                          0       2     0.00     0.57
    345000 |                          0       2     0.00     0.57
           |
F    15000 |********                 39      41    11.05    11.61
     45000 |********                 42      83    11.90    23.51
     75000 |****                     22     105     6.23    29.75
    105000 |*                         6     111     1.70    31.44
    135000 |*                         3     114     0.85    32.29
    165000 |                          2     116     0.57    32.86
    195000 |                          2     118     0.57    33.43
    225000 |                          0     118     0.00    33.43
    255000 |                          0     118     0.00    33.43
    285000 |                          0     118     0.00    33.43
    315000 |                          1     119     0.28    33.71
    345000 |                          0     119     0.00    33.71
           |
M    15000 |*******************      95     214    26.91    60.62
     45000 |**************           69     283    19.55    80.17
     75000 |*****                    23     306     6.52    86.69
    105000 |****                     18     324     5.10    91.78
    135000 |**                       11     335     3.12    94.90
    165000 |***                      15     350     4.25    99.15
    195000 |                          1     351     0.28    99.43
    225000 |                          0     351     0.00    99.43
    255000 |                          0     351     0.00    99.43
    285000 |                          0     351     0.00    99.43
    315000 |                          1     352     0.28    99.72
    345000 |                          1     353     0.28   100.00
           |
            ----+---+---+---+---
               20  40  60  80
                Frequency
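As a cross-check on the listing above, the following worked arithmetic recomputes a few of the printed values from other printed values. It assumes the standard definitions CV = 100 × Std Dev / Mean and Std Mean = Std Dev / √N, which the lesson does not spell out here but which reproduce the printed figures. All inputs are copied from the output. The subscripts F, M and "all" are labels introduced only for this sketch.

```latex
\begin{align*}
\mathrm{CV}_{\text{SHINTYOU}} &= 100 \times \frac{8.17815}{167.809} \approx 4.873 \\
\mathrm{CV}_{\text{KODUKAI}}  &= 100 \times \frac{49720.96}{48257.79} \approx 103.03 \\
\text{Std Mean}_{\text{SHINTYOU}} &= \frac{8.17815}{\sqrt{366}} \approx 0.4275 \\
\text{Sum}_{F} + \text{Sum}_{M} &= 19241.7 + 42008.4 = 61250.1 \\
\text{Sum}_{\text{all}} - \left(\text{Sum}_{F} + \text{Sum}_{M}\right) &= 61418.1 - 61250.1 = 168.0 \\
N_{F} + N_{M} &= 121 + 244 = 365 = 366 - 1
\end{align*}
```

The CV of KODUKAI exceeds 100 while that of SHINTYOU is below 5, which matches the long right tail in the KODUKAI histogram (mean 48257.79 against a median of 30000). The leftover sum of about 168.0 and the count difference of 1 point to a single SHINTYOU observation with no SEX value, consistent with the one count in the unlabeled group of the group=sex chart.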