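In the previous lessons we described several indices for understanding the characteristics of a distribution, and pointed out how to use them and what to watch out for. We also explained that splitting the data into groups is useful. This lesson introduces frequency tabulation and cross tabulation, two methods that are commonly used for simple tabulation.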
/* Lesson 09-1 */
/*    File Name = les0901.sas   06/14/07   */

data gakusei;
  infile 'all07ae.prn' firstobs=2;
  input sex $ shintyou taijyuu kyoui jitaku $ kodukai carryer $ tsuuwa;

proc print data=gakusei(obs=5);
run;

proc freq data=gakusei;              /* compute frequencies               */
  tables sex jitaku carryer;         /* one variable at a time            */
run;

proc freq data=gakusei;              /* compute frequencies               */
  tables sex*jitaku;                 /* for combinations of two variables */
  tables sex*carryer;
  tables jitaku*carryer;
run;
                              SAS System                              1
                                        12:44 Wednesday, June 13, 2007

OBS   SEX   SHINTYOU   TAIJYUU   KYOUI   JITAKU   KODUKAI   CARRYER    TSUUWA

  1    F      145.0       38        .      J        10000                   .
  2    F      146.7       41       85      J        10000   Vodafone     6000
  3    F      148.0       42        .      J        50000                   .
  4    F      148.0       43       80      J        50000   DoCoMo       4000
  5    F      148.9        .        .      J        60000                   .


                               Cumulative  Cumulative
SEX     Frequency   Percent    Frequency    Percent
-------------------------------------------------
F            124      33.5          124       33.5
M            246      66.5          370      100.0

              Frequency Missing = 5


                                  Cumulative  Cumulative
JITAKU     Frequency   Percent    Frequency    Percent
----------------------------------------------------
G               120      37.4          120       37.4
J               201      62.6          321      100.0

              Frequency Missing = 54


                                    Cumulative  Cumulative
CARRYER      Frequency   Percent    Frequency    Percent
------------------------------------------------------
DDIp                 2       1.4            2        1.4
DoCoMo              60      40.8           62       42.2
J-PHONE             10       6.8           72       49.0
KDDI                 1       0.7           73       49.7
No                   5       3.4           78       53.1
Vodafone            20      13.6           98       66.7
Willcom              1       0.7           99       67.3
au                  39      26.5          138       93.9
au+willc             1       0.7          139       94.6
docomo               5       3.4          144       98.0
docomo+w             1       0.7          145       98.6
softbank             1       0.7          146       99.3
vodafone             1       0.7          147      100.0

              Frequency Missing = 228


TABLE OF SEX BY JITAKU

SEX       JITAKU

Frequency|
Percent  |
Row Pct  |
Col Pct  |G       |J       |  Total
---------+--------+--------+
F        |     36 |     70 |    106
         |  11.29 |  21.94 |  33.23
         |  33.96 |  66.04 |
         |  30.25 |  35.00 |
---------+--------+--------+
M        |     83 |    130 |    213
         |  26.02 |  40.75 |  66.77
         |  38.97 |  61.03 |
         |  69.75 |  65.00 |
---------+--------+--------+
Total         119      200      319
            37.30    62.70   100.00

Frequency Missing = 56


TABLE OF SEX BY CARRYER

SEX       CARRYER

Frequency|
Percent  |
Row Pct  |
Col Pct  |DDIp    |DoCoMo  |J-PHONE |KDDI    |No      |  Total
---------+--------+--------+--------+--------+--------+
F        |      1 |     25 |      4 |      0 |      1 |     56
         |   0.68 |  17.12 |   2.74 |   0.00 |   0.68 |  38.36
         |   1.79 |  44.64 |   7.14 |   0.00 |   1.79 |
         |  50.00 |  41.67 |  44.44 |   0.00 |  20.00 |
---------+--------+--------+--------+--------+--------+
M        |      1 |     35 |      5 |      1 |      4 |     90
         |   0.68 |  23.97 |   3.42 |   0.68 |   2.74 |  61.64
         |   1.11 |  38.89 |   5.56 |   1.11 |   4.44 |
         |  50.00 |  58.33 |  55.56 | 100.00 |  80.00 |
---------+--------+--------+--------+--------+--------+
Total           2       60        9        1        5      146
             1.37    41.10     6.16     0.68     3.42   100.00
(Continued)

TABLE OF SEX BY CARRYER

SEX       CARRYER

Frequency|
Percent  |
Row Pct  |
Col Pct  |Vodafone|Willcom |au      |au+willc|docomo  |  Total
---------+--------+--------+--------+--------+--------+
F        |      9 |      1 |     12 |      1 |      1 |     56
         |   6.16 |   0.68 |   8.22 |   0.68 |   0.68 |  38.36
         |  16.07 |   1.79 |  21.43 |   1.79 |   1.79 |
         |  45.00 | 100.00 |  30.77 | 100.00 |  20.00 |
---------+--------+--------+--------+--------+--------+
M        |     11 |      0 |     27 |      0 |      4 |     90
         |   7.53 |   0.00 |  18.49 |   0.00 |   2.74 |  61.64
         |  12.22 |   0.00 |  30.00 |   0.00 |   4.44 |
         |  55.00 |   0.00 |  69.23 |   0.00 |  80.00 |
---------+--------+--------+--------+--------+--------+
Total          20        1       39        1        5      146
            13.70     0.68    26.71     0.68     3.42   100.00
(Continued)

TABLE OF SEX BY CARRYER

SEX       CARRYER

Frequency|
Percent  |
Row Pct  |
Col Pct  |docomo+w|softbank|vodafone|  Total
---------+--------+--------+--------+
F        |      0 |      1 |      0 |     56
         |   0.00 |   0.68 |   0.00 |  38.36
         |   0.00 |   1.79 |   0.00 |
         |   0.00 | 100.00 |   0.00 |
---------+--------+--------+--------+
M        |      1 |      0 |      1 |     90
         |   0.68 |   0.00 |   0.68 |  61.64
         |   1.11 |   0.00 |   1.11 |
         | 100.00 |   0.00 | 100.00 |
---------+--------+--------+--------+
Total           1        1        1      146
             0.68     0.68     0.68   100.00

Frequency Missing = 229


TABLE OF JITAKU BY CARRYER

JITAKU    CARRYER

Frequency|
Percent  |
Row Pct  |
Col Pct  |DDIp    |DoCoMo  |J-PHONE |KDDI    |No      |  Total
---------+--------+--------+--------+--------+--------+
G        |      1 |     21 |      4 |      1 |      0 |     47
         |   0.79 |  16.67 |   3.17 |   0.79 |   0.00 |  37.30
         |   2.13 |  44.68 |   8.51 |   2.13 |   0.00 |
         | 100.00 |  41.18 |  44.44 | 100.00 |   0.00 |
---------+--------+--------+--------+--------+--------+
J        |      0 |     30 |      5 |      0 |      4 |     79
         |   0.00 |  23.81 |   3.97 |   0.00 |   3.17 |  62.70
         |   0.00 |  37.97 |   6.33 |   0.00 |   5.06 |
         |   0.00 |  58.82 |  55.56 |   0.00 | 100.00 |
---------+--------+--------+--------+--------+--------+
Total           1       51        9        1        4      126
             0.79    40.48     7.14     0.79     3.17   100.00
(Continued)

TABLE OF JITAKU BY CARRYER

JITAKU    CARRYER

Frequency|
Percent  |
Row Pct  |
Col Pct  |Vodafone|Willcom |au      |au+willc|docomo  |  Total
---------+--------+--------+--------+--------+--------+
G        |      4 |      0 |     12 |      0 |      2 |     47
         |   3.17 |   0.00 |   9.52 |   0.00 |   1.59 |  37.30
         |   8.51 |   0.00 |  25.53 |   0.00 |   4.26 |
         |  23.53 |      . |  35.29 |   0.00 |  40.00 |
---------+--------+--------+--------+--------+--------+
J        |     13 |      0 |     22 |      1 |      3 |     79
         |  10.32 |   0.00 |  17.46 |   0.79 |   2.38 |  62.70
         |  16.46 |   0.00 |  27.85 |   1.27 |   3.80 |
         |  76.47 |      . |  64.71 | 100.00 |  60.00 |
---------+--------+--------+--------+--------+--------+
Total          17        0       34        1        5      126
            13.49     0.00    26.98     0.79     3.97   100.00
(Continued)

TABLE OF JITAKU BY CARRYER

JITAKU    CARRYER

Frequency|
Percent  |
Row Pct  |
Col Pct  |docomo+w|softbank|vodafone|  Total
---------+--------+--------+--------+
G        |      1 |      1 |      0 |     47
         |   0.79 |   0.79 |   0.00 |  37.30
         |   2.13 |   2.13 |   0.00 |
         | 100.00 | 100.00 |   0.00 |
---------+--------+--------+--------+
J        |      0 |      0 |      1 |     79
         |   0.00 |   0.00 |   0.79 |  62.70
         |   0.00 |   0.00 |   1.27 |
         |   0.00 |   0.00 | 100.00 |
---------+--------+--------+--------+
Total           1        1        1      126
             0.79     0.79     0.79   100.00

Frequency Missing = 249
/* (earlier part omitted) */
  if carryer="au+willc" then carryer="au+Willc";
  if carryer="docomo"   then carryer="DoCoMo";
  if carryer="docomo+w" then carryer="DoCoMo+W";
  if carryer="vodafone" then carryer="Vodafone";
/* (rest omitted) */
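The IF statements above correct each spelling variant by hand. As a rough sketch of an alternative (an assumption, not the method used in this lesson), the labels could be folded to a single case with the UPCASE function before tabulating, so that spellings differing only in capitalization are counted together. The data set name gakusei2 and the variable name carrier_u are hypothetical names introduced only for this illustration.

```sas
/* Sketch only: fold carrier labels to upper case before counting.   */
/* gakusei2 and carrier_u are hypothetical names, not in the lesson. */
data gakusei2;
  set gakusei;
  length carrier_u $ 8;
  carrier_u = upcase(carryer);   /* "docomo" and "DoCoMo" both become "DOCOMO" */
run;

proc freq data=gakusei2;
  tables carrier_u;
run;
```

This only merges variants that differ in letter case. Labels that are spelled differently still need explicit IF statements like the ones in the lesson.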
/* (earlier part omitted) */
proc freq data=gakusei order=freq;   /* in descending order of frequency */
  tables sex jitaku carryer;
run;

proc freq data=gakusei order=freq;   /* in descending order of frequency */
  tables sex*jitaku;
  tables sex*carryer;
  tables jitaku*carryer;
run;
/* (rest omitted) */
/* Lesson 09-4 */
/*    File Name = les0904.sas   06/14/07   */

data gakusei;
  infile 'all07ae.prn' firstobs=2;
  input sex $ shintyou taijyuu kyoui jitaku $ kodukai carryer $ tsuuwa;

proc format;                           /* define classes; "clshint" = class shintyou */
  value clshint low-<150=' -149'       /* class definition 1 */
                150-<160='150-159'     /*                  2 */
                160-<170='160-169'     /*                  3 */
                170-<180='170-179'     /*                  4 */
                180-high='180- '       /*                  5 */
                other   ='missing';    /*                  6 */
run;

proc print data=gakusei(obs=5);
run;

proc freq data=gakusei;                /* compute frequencies                         */
  tables shintyou;                     /* one variable at a time                      */
  format shintyou clshint.;            /* group the continuous variable into classes  */
run;

proc freq data=gakusei;                /* compute frequencies                         */
  tables sex*shintyou;                 /* for a combination of two variables          */
  format shintyou clshint.;            /* group the continuous variable into classes  */
run;

proc sort data=gakusei;                /* the same thing with the method used so far  */
  by sex;
run;
proc freq data=gakusei;
  tables shintyou;
  format shintyou clshint.;            /* group the continuous variable into classes  */
  by sex;                              /* separately for each sex                     */
run;
                              SAS System                              2
                                        12:44 Wednesday, June 13, 2007

                                    Cumulative  Cumulative
SHINTYOU     Frequency   Percent    Frequency    Percent
------------------------------------------------------
 -149                6       1.7            6        1.7
150-159             54      15.0           60       16.7
160-169            123      34.2          183       50.8
170-179            153      42.5          336       93.3
180-                24       6.7          360      100.0

              Frequency Missing = 15


TABLE OF SEX BY SHINTYOU

SEX       SHINTYOU

Frequency|
Percent  |
Row Pct  |
Col Pct  | -149   |150-159 |160-169 |170-179 |180-    |  Total
---------+--------+--------+--------+--------+--------+
F        |      6 |     52 |     57 |      2 |      0 |    117
         |   1.67 |  14.48 |  15.88 |   0.56 |   0.00 |  32.59
         |   5.13 |  44.44 |  48.72 |   1.71 |   0.00 |
         | 100.00 |  96.30 |  46.72 |   1.31 |   0.00 |
---------+--------+--------+--------+--------+--------+
M        |      0 |      2 |     65 |    151 |     24 |    242
         |   0.00 |   0.56 |  18.11 |  42.06 |   6.69 |  67.41
         |   0.00 |   0.83 |  26.86 |  62.40 |   9.92 |
         |   0.00 |   3.70 |  53.28 |  98.69 | 100.00 |
---------+--------+--------+--------+--------+--------+
Total           6       54      122      153       24      359
             1.67    15.04    33.98    42.62     6.69   100.00

Frequency Missing = 16


------------------------------- SEX=' ' --------------------------------

                                    Cumulative  Cumulative
SHINTYOU     Frequency   Percent    Frequency    Percent
------------------------------------------------------
160-169              1     100.0            1      100.0

              Frequency Missing = 4


-------------------------------- SEX=F ---------------------------------

                                    Cumulative  Cumulative
SHINTYOU     Frequency   Percent    Frequency    Percent
------------------------------------------------------
 -149                6       5.1            6        5.1
150-159             52      44.4           58       49.6
160-169             57      48.7          115       98.3
170-179              2       1.7          117      100.0

              Frequency Missing = 7


-------------------------------- SEX=M ---------------------------------

                                    Cumulative  Cumulative
SHINTYOU     Frequency   Percent    Frequency    Percent
------------------------------------------------------
150-159              2       0.8            2        0.8
160-169             65      26.9           67       27.7
170-179            151      62.4          218       90.1
180-                24       9.9          242      100.0

              Frequency Missing = 4
/* Lesson 09-5 */
/*    File Name = les0905.sas   06/14/07   */

data gakusei;
  infile 'all07ae.prn' firstobs=2;
  input sex $ shintyou taijyuu kyoui jitaku $ kodukai carryer $ tsuuwa;

proc format;
  value clshint low-<150=' -149'
                150-<160='150-159'
                160-<170='160-169'
                170-<180='170-179'
                180-high='180- '
                other   ='missing';
run;

proc print data=gakusei(obs=5);
run;

proc tabulate data=gakusei;                    /* build a table of summary statistics        */
  class sex jitaku;                            /* declare the classification variables       */
  var kodukai;                                 /* variable to be summarized                  */
  tables kodukai*(n mean std),sex*jitaku;      /* what to display, classification variables  */
run;

proc tabulate data=gakusei;
  class shintyou sex;
  var taijyuu;
  tables taijyuu*(n mean std),shintyou*sex;
  format shintyou clshint.;                    /* group the continuous variable into classes */
run;
                              SAS System                              2
                                        12:44 Wednesday, June 13, 2007

----------------------------------------------------------------------
|                |                        SEX                        |
|                |---------------------------------------------------|
|                |            F            |            M            |
|                |-------------------------+-------------------------|
|                |         JITAKU          |         JITAKU          |
|                |-------------------------+-------------------------|
|                |     G      |     J      |     G      |     J      |
|----------------+------------+------------+------------+------------|
|KODUKAI|N       |       34.00|       68.00|       82.00|      126.00|
|       |--------+------------+------------+------------+------------|
|       |MEAN    |    77647.06|    35110.29|    86256.10|    25777.78|
|       |--------+------------+------------+------------+------------|
|       |STD     |    58390.47|    31307.91|    59470.88|    32858.21|
----------------------------------------------------------------------


----------------------------------------------------------------------
|                |                     SHINTYOU                      |
|                |---------------------------------------------------|
|                |    -149    |         150-159         |  160-169   |
|                |------------+-------------------------+------------|
|                |    SEX     |           SEX           |    SEX     |
|                |------------+-------------------------+------------|
|                |     F      |     F      |     M      |     F      |
|----------------+------------+------------+------------+------------|
|TAIJYUU|N       |        5.00|       40.00|        2.00|       38.00|
|       |--------+------------+------------+------------+------------|
|       |MEAN    |       41.80|       47.31|       54.50|       51.07|
|       |--------+------------+------------+------------+------------|
|       |STD     |        2.59|        4.61|        9.19|        3.47|
----------------------------------------------------------------------
(CONTINUED)

----------------------------------------------------------------------
|                |                     SHINTYOU                      |
|                |---------------------------------------------------|
|                |  160-169   |         170-179         |    180-    |
|                |------------+-------------------------+------------|
|                |    SEX     |           SEX           |    SEX     |
|                |------------+-------------------------+------------|
|                |     M      |     F      |     M      |     M      |
|----------------+------------+------------+------------+------------|
|TAIJYUU|N       |       65.00|        0.00|      151.00|       24.00|
|       |--------+------------+------------+------------+------------|
|       |MEAN    |       58.50|           .|       63.10|       67.56|
|       |--------+------------+------------+------------+------------|
|       |STD     |        7.35|           .|        7.54|        7.38|
----------------------------------------------------------------------
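data mon2007;
  infile 'd:\home\mon05d.csv'
         dlm=','            /* comma-delimited file                              */
         dsd                /* consecutive delimiters mean a missing value       */
         firstobs=2         /* skip the header line                              */
         truncover;         /* a short record does not read on into the next line */
  input No      $
        Univ    : $30.
        SName   : $40.
        Faculty : $50.
        Dept    : $50.
        Center1 : $8.
        Center2 : $8.
        Sel1    : $8.
        Sel2    : $8.
        Book1   : $10.
        Book2   : $10.
        Vol0 VolS VolT
        ZenKou  $
        ScoreS ScoreT
        KoKouSi
  ;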
data mon2007;
  infile 'd:\home\mon05e.txt' dlm='09'x firstobs=2 truncover;   /* '09'x is the tab character: tab-delimited file */
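data math;
  infile 'foo.dat' lrecl=230;              /* logical record length of 230 */

data math;
  infile 'foo.dat' lrecl=230 truncover;    /* a short record does not read on into the next line */

  input kamoku   $ 2                       /* column input: variable name, then column position(s) */
        kesseki  $ 3
        k_code   $ 10-11
        t_score    12-14
        s_scor01   103-104
        s_scor02   105-106
        s_scor03   107-108
        s_scor04   109-110
  ;

data math;
  infile 'foo.dat' firstobs=4;             /* start reading at the 4th line of the file */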