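In the previous lessons we covered several statistics for describing the shape of a distribution, together with how to use them and what to watch out for, and we saw that splitting the data into groups is useful. This lesson introduces frequency tables and cross tabulations, two of the most commonly used forms of simple tabulation.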
/* Lesson 09-1 */
/* File Name = les0901.sas 06/14/07 */
data gakusei;
infile 'all07ae.prn'
firstobs=2;
input sex $ shintyou taijyuu kyoui
jitaku $ kodukai carryer $ tsuuwa;
proc print data=gakusei(obs=5);
run;
:
proc freq data=gakusei; : compute frequencies
tables sex jitaku carryer; : one variable at a time
run; :
proc freq data=gakusei; : compute frequencies
tables sex*jitaku; : for pairs of variables
tables sex*carryer; :
tables jitaku*carryer; :
run; :
The SAS System 1
12:44 Wednesday, June 13, 2007
OBS  SEX  SHINTYOU  TAIJYUU  KYOUI  JITAKU  KODUKAI  CARRYER   TSUUWA
  1  F       145.0       38      .  J         10000                 .
  2  F       146.7       41     85  J         10000  Vodafone    6000
  3  F       148.0       42      .  J         50000                 .
  4  F       148.0       43     80  J         50000  DoCoMo      4000
  5  F       148.9        .      .  J         60000                 .
                                Cumulative  Cumulative
SEX       Frequency   Percent    Frequency     Percent
------------------------------------------------------
F               124      33.5          124        33.5
M               246      66.5          370       100.0
Frequency Missing = 5
                                Cumulative  Cumulative
JITAKU    Frequency   Percent    Frequency     Percent
------------------------------------------------------
G               120      37.4          120        37.4
J               201      62.6          321       100.0
Frequency Missing = 54
                                Cumulative  Cumulative
CARRYER   Frequency   Percent    Frequency     Percent
------------------------------------------------------
DDIp              2       1.4            2         1.4
DoCoMo           60      40.8           62        42.2
J-PHONE          10       6.8           72        49.0
KDDI              1       0.7           73        49.7
No                5       3.4           78        53.1
Vodafone         20      13.6           98        66.7
Willcom           1       0.7           99        67.3
au               39      26.5          138        93.9
au+willc          1       0.7          139        94.6
docomo            5       3.4          144        98.0
docomo+w          1       0.7          145        98.6
softbank          1       0.7          146        99.3
vodafone          1       0.7          147       100.0
Frequency Missing = 228
TABLE OF SEX BY JITAKU
SEX       JITAKU
Frequency|
Percent  |
Row Pct  |
Col Pct  |G       |J       |  Total
---------+--------+--------+
F        |     36 |     70 |    106
         |  11.29 |  21.94 |  33.23
         |  33.96 |  66.04 |
         |  30.25 |  35.00 |
---------+--------+--------+
M        |     83 |    130 |    213
         |  26.02 |  40.75 |  66.77
         |  38.97 |  61.03 |
         |  69.75 |  65.00 |
---------+--------+--------+
Total         119      200      319
            37.30    62.70   100.00
Frequency Missing = 56
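Each cell of the cross tabulation above holds four numbers: the count, the percentage of the grand total, the percentage of the row total, and the percentage of the column total. As a small check, the following sketch recomputes the three percentages for the F/G cell from the counts printed in the table (36 in the cell, 106 in row F, 119 in column G, 319 overall). The variable names are made up for this illustration.

```sas
/* Recompute Percent, Row Pct and Col Pct for the cell SEX=F, JITAKU=G. */
/* All counts are copied from the table above; the names are illustrative. */
data _null_;
  n_cell  = 36;    /* frequency of the cell           */
  n_row   = 106;   /* row total for SEX=F             */
  n_col   = 119;   /* column total for JITAKU=G       */
  n_total = 319;   /* number of non-missing pairs     */
  percent = 100 * n_cell / n_total;   /* share of the whole table */
  row_pct = 100 * n_cell / n_row;     /* share within the row     */
  col_pct = 100 * n_cell / n_col;     /* share within the column  */
  put percent= 6.2 row_pct= 6.2 col_pct= 6.2;   /* written to the log */
run;
```

The three values written to the log should agree with the 11.29, 33.96 and 30.25 shown in the F/G cell.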
TABLE OF SEX BY CARRYER
SEX       CARRYER
Frequency|
Percent  |
Row Pct  |
Col Pct  |DDIp    |DoCoMo  |J-PHONE |KDDI    |No      |  Total
---------+--------+--------+--------+--------+--------+
F        |      1 |     25 |      4 |      0 |      1 |     56
         |   0.68 |  17.12 |   2.74 |   0.00 |   0.68 |  38.36
         |   1.79 |  44.64 |   7.14 |   0.00 |   1.79 |
         |  50.00 |  41.67 |  44.44 |   0.00 |  20.00 |
---------+--------+--------+--------+--------+--------+
M        |      1 |     35 |      5 |      1 |      4 |     90
         |   0.68 |  23.97 |   3.42 |   0.68 |   2.74 |  61.64
         |   1.11 |  38.89 |   5.56 |   1.11 |   4.44 |
         |  50.00 |  58.33 |  55.56 | 100.00 |  80.00 |
---------+--------+--------+--------+--------+--------+
Total           2       60        9        1        5      146
             1.37    41.10     6.16     0.68     3.42   100.00
(Continued)
TABLE OF SEX BY CARRYER
SEX       CARRYER
Frequency|
Percent  |
Row Pct  |
Col Pct  |Vodafone|Willcom |au      |au+willc|docomo  |  Total
---------+--------+--------+--------+--------+--------+
F        |      9 |      1 |     12 |      1 |      1 |     56
         |   6.16 |   0.68 |   8.22 |   0.68 |   0.68 |  38.36
         |  16.07 |   1.79 |  21.43 |   1.79 |   1.79 |
         |  45.00 | 100.00 |  30.77 | 100.00 |  20.00 |
---------+--------+--------+--------+--------+--------+
M        |     11 |      0 |     27 |      0 |      4 |     90
         |   7.53 |   0.00 |  18.49 |   0.00 |   2.74 |  61.64
         |  12.22 |   0.00 |  30.00 |   0.00 |   4.44 |
         |  55.00 |   0.00 |  69.23 |   0.00 |  80.00 |
---------+--------+--------+--------+--------+--------+
Total          20        1       39        1        5      146
            13.70     0.68    26.71     0.68     3.42   100.00
(Continued)
TABLE OF SEX BY CARRYER
SEX       CARRYER
Frequency|
Percent  |
Row Pct  |
Col Pct  |docomo+w|softbank|vodafone|  Total
---------+--------+--------+--------+
F        |      0 |      1 |      0 |     56
         |   0.00 |   0.68 |   0.00 |  38.36
         |   0.00 |   1.79 |   0.00 |
         |   0.00 | 100.00 |   0.00 |
---------+--------+--------+--------+
M        |      1 |      0 |      1 |     90
         |   0.68 |   0.00 |   0.68 |  61.64
         |   1.11 |   0.00 |   1.11 |
         | 100.00 |   0.00 | 100.00 |
---------+--------+--------+--------+
Total           1        1        1      146
             0.68     0.68     0.68   100.00
Frequency Missing = 229
TABLE OF JITAKU BY CARRYER
JITAKU    CARRYER
Frequency|
Percent  |
Row Pct  |
Col Pct  |DDIp    |DoCoMo  |J-PHONE |KDDI    |No      |  Total
---------+--------+--------+--------+--------+--------+
G        |      1 |     21 |      4 |      1 |      0 |     47
         |   0.79 |  16.67 |   3.17 |   0.79 |   0.00 |  37.30
         |   2.13 |  44.68 |   8.51 |   2.13 |   0.00 |
         | 100.00 |  41.18 |  44.44 | 100.00 |   0.00 |
---------+--------+--------+--------+--------+--------+
J        |      0 |     30 |      5 |      0 |      4 |     79
         |   0.00 |  23.81 |   3.97 |   0.00 |   3.17 |  62.70
         |   0.00 |  37.97 |   6.33 |   0.00 |   5.06 |
         |   0.00 |  58.82 |  55.56 |   0.00 | 100.00 |
---------+--------+--------+--------+--------+--------+
Total           1       51        9        1        4      126
             0.79    40.48     7.14     0.79     3.17   100.00
(Continued)
TABLE OF JITAKU BY CARRYER
JITAKU    CARRYER
Frequency|
Percent  |
Row Pct  |
Col Pct  |Vodafone|Willcom |au      |au+willc|docomo  |  Total
---------+--------+--------+--------+--------+--------+
G        |      4 |      0 |     12 |      0 |      2 |     47
         |   3.17 |   0.00 |   9.52 |   0.00 |   1.59 |  37.30
         |   8.51 |   0.00 |  25.53 |   0.00 |   4.26 |
         |  23.53 |      . |  35.29 |   0.00 |  40.00 |
---------+--------+--------+--------+--------+--------+
J        |     13 |      0 |     22 |      1 |      3 |     79
         |  10.32 |   0.00 |  17.46 |   0.79 |   2.38 |  62.70
         |  16.46 |   0.00 |  27.85 |   1.27 |   3.80 |
         |  76.47 |      . |  64.71 | 100.00 |  60.00 |
---------+--------+--------+--------+--------+--------+
Total          17        0       34        1        5      126
            13.49     0.00    26.98     0.79     3.97   100.00
(Continued)
TABLE OF JITAKU BY CARRYER
JITAKU    CARRYER
Frequency|
Percent  |
Row Pct  |
Col Pct  |docomo+w|softbank|vodafone|  Total
---------+--------+--------+--------+
G        |      1 |      1 |      0 |     47
         |   0.79 |   0.79 |   0.00 |  37.30
         |   2.13 |   2.13 |   0.00 |
         | 100.00 | 100.00 |   0.00 |
---------+--------+--------+--------+
J        |      0 |      0 |      1 |     79
         |   0.00 |   0.00 |   0.79 |  62.70
         |   0.00 |   0.00 |   1.27 |
         |   0.00 |   0.00 | 100.00 |
---------+--------+--------+--------+
Total           1        1        1      126
             0.79     0.79     0.79   100.00
Frequency Missing = 249
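The "Frequency Missing" lines in the output above count the observations that PROC FREQ left out because at least one of the tabulated variables was missing. If those observations should appear in the table, or if only the raw counts are wanted, the TABLES statement accepts options after a slash. The following is a minimal sketch, assuming the data set gakusei created by the program above is still available.

```sas
/* A sketch of common TABLES options; assumes data set gakusei exists. */
proc freq data=gakusei;
  /* MISSING: count missing values as a category of their own */
  tables carryer / missing;
  /* NOPERCENT, NOROW, NOCOL: print only the cell frequencies */
  tables sex*jitaku / nopercent norow nocol;
run;
```

With MISSING, the percentages are computed over all observations, so they will differ from the ones shown above.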
(preceding lines omitted)
if carryer="au+willc" then carryer="au+Willc";
if carryer="docomo" then carryer="DoCoMo";
if carryer="docomo+w" then carryer="DoCoMo+W";
if carryer="vodafone" then carryer="Vodafone";
(following lines omitted)
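The IF statements above merge spellings that differ only in capitalization, so that for example docomo and DoCoMo are counted together. A possible variation, shown here only as a sketch, compares the value in upper case so that any capitalization is caught by one branch. It assumes the data set gakusei already exists; the output name gakusei2 is made up for this example.

```sas
/* Sketch: normalize carrier names regardless of capitalization.    */
/* gakusei2 is a hypothetical name; gakusei comes from the lesson.  */
data gakusei2;
  set gakusei;
  select (upcase(carryer));            /* compare in upper case      */
    when ("DOCOMO")   carryer = "DoCoMo";
    when ("DOCOMO+W") carryer = "DoCoMo+W";
    when ("VODAFONE") carryer = "Vodafone";
    when ("AU+WILLC") carryer = "au+Willc";
    otherwise;                         /* leave other values as is   */
  end;
run;

proc freq data=gakusei2;
  tables carryer;                      /* check the merged levels    */
run;
```

The replacement strings are at most eight characters long, which matches the default length of a character variable read with list input as in the lesson's INPUT statement.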
(preceding lines omitted)
proc freq data=gakusei order=freq; : in descending order of frequency
tables sex jitaku carryer; :
run; :
:
proc freq data=gakusei order=freq; : in descending order of frequency
tables sex*jitaku; :
tables sex*carryer; :
tables jitaku*carryer; :
run; :
(following lines omitted)
/* Lesson 09-4 */
/* File Name = les0904.sas 06/14/07 */
data gakusei;
infile 'all07ae.prn'
firstobs=2;
input sex $ shintyou taijyuu kyoui
jitaku $ kodukai carryer $ tsuuwa;
proc format; : define the classes ("clshint" = class of shintyou)
value clshint low-<150=' -149' : class definition 1
150-<160='150-159' : 2
160-<170='160-169' : 3
170-<180='170-179' : 4
180-high='180- ' : 5
other ='missing'; : 6
run; :
proc print data=gakusei(obs=5);
run;
proc freq data=gakusei; : compute frequencies
tables shintyou; : one variable at a time
format shintyou clshint.; : group the continuous variable with the format
run; :
:
proc freq data=gakusei; : compute frequencies
tables sex*shintyou; : for a pair of variables
format shintyou clshint.; : group the continuous variable with the format
run; :
:
proc sort data=gakusei; : the same thing with the approach used so far
by sex; :
run; :
proc freq data=gakusei; :
tables shintyou; :
format shintyou clshint.; : group the continuous variable with the format
by sex; : separately for each sex
run; :
The SAS System 2
12:44 Wednesday, June 13, 2007
                                Cumulative  Cumulative
SHINTYOU  Frequency   Percent    Frequency     Percent
------------------------------------------------------
   -149           6       1.7            6         1.7
150-159          54      15.0           60        16.7
160-169         123      34.2          183        50.8
170-179         153      42.5          336        93.3
180-             24       6.7          360       100.0
Frequency Missing = 15
TABLE OF SEX BY SHINTYOU
SEX       SHINTYOU
Frequency|
Percent  |
Row Pct  |
Col Pct  |   -149 |150-159 |160-169 |170-179 |180-    |  Total
---------+--------+--------+--------+--------+--------+
F        |      6 |     52 |     57 |      2 |      0 |    117
         |   1.67 |  14.48 |  15.88 |   0.56 |   0.00 |  32.59
         |   5.13 |  44.44 |  48.72 |   1.71 |   0.00 |
         | 100.00 |  96.30 |  46.72 |   1.31 |   0.00 |
---------+--------+--------+--------+--------+--------+
M        |      0 |      2 |     65 |    151 |     24 |    242
         |   0.00 |   0.56 |  18.11 |  42.06 |   6.69 |  67.41
         |   0.00 |   0.83 |  26.86 |  62.40 |   9.92 |
         |   0.00 |   3.70 |  53.28 |  98.69 | 100.00 |
---------+--------+--------+--------+--------+--------+
Total           6       54      122      153       24      359
             1.67    15.04    33.98    42.62     6.69   100.00
Frequency Missing = 16
------------------------------- SEX=' ' --------------------------------
                                Cumulative  Cumulative
SHINTYOU  Frequency   Percent    Frequency     Percent
------------------------------------------------------
160-169           1     100.0            1       100.0
Frequency Missing = 4
-------------------------------- SEX=F ---------------------------------
                                Cumulative  Cumulative
SHINTYOU  Frequency   Percent    Frequency     Percent
------------------------------------------------------
   -149           6       5.1            6         5.1
150-159          52      44.4           58        49.6
160-169          57      48.7          115        98.3
170-179           2       1.7          117       100.0
Frequency Missing = 7
-------------------------------- SEX=M ---------------------------------
                                Cumulative  Cumulative
SHINTYOU  Frequency   Percent    Frequency     Percent
------------------------------------------------------
150-159           2       0.8            2         0.8
160-169          65      26.9           67        27.7
170-179         151      62.4          218        90.1
180-             24       9.9          242       100.0
Frequency Missing = 4
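In the program above the FORMAT statement groups shintyou only for the duration of each procedure; the data set itself still holds the original heights. When the height class is needed as a variable of its own, one option is to store the formatted value with the PUT function. The sketch below assumes that the format clshint and the data set gakusei from the program above exist in the current session; the names gakusei2 and hclass are made up for this example.

```sas
/* Sketch: keep the height class as a character variable.            */
/* gakusei2 and hclass are hypothetical names used for illustration. */
data gakusei2;
  set gakusei;
  length hclass $ 7;        /* the labels defined above fit in 7 chars */
  /* Leave hclass blank when height is missing; otherwise the label   */
  /* 'missing' would be stored and counted as an ordinary category.   */
  if shintyou ne . then hclass = put(shintyou, clshint.);
run;

proc freq data=gakusei2;
  tables sex*hclass;        /* same grouping as sex*shintyou above    */
run;
```

The trade-off is an extra variable in the data set in exchange for not having to repeat the FORMAT statement in every procedure.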
/* Lesson 09-5 */
/* File Name = les0905.sas 06/14/07 */
data gakusei;
infile 'all07ae.prn'
firstobs=2;
input sex $ shintyou taijyuu kyoui
jitaku $ kodukai carryer $ tsuuwa;
proc format;
value clshint low-<150=' -149'
150-<160='150-159'
160-<170='160-169'
170-<180='170-179'
180-high='180- '
other ='missing';
run;
proc print data=gakusei(obs=5);
run;
proc tabulate data=gakusei; : build a table of summary statistics
class sex jitaku; : declare the classification variables
var kodukai; : variable to be summarized
tables kodukai*(n mean std),sex*jitaku; : statistics to show, classification variables
run; :
proc tabulate data=gakusei; :
class shintyou sex; :
var taijyuu; :
tables taijyuu*(n mean std),shintyou*sex; :
format shintyou clshint.; : group the continuous variable with the format
run; :
The SAS System 2
12:44 Wednesday, June 13, 2007
----------------------------------------------------------------------
|                |                        SEX                        |
|                |---------------------------------------------------|
|                |            F            |            M            |
|                |-------------------------+-------------------------|
|                |         JITAKU          |         JITAKU          |
|                |-------------------------+-------------------------|
|                |     G      |     J      |     G      |     J      |
|----------------+------------+------------+------------+------------|
|KODUKAI|N       |       34.00|       68.00|       82.00|      126.00|
|       |--------+------------+------------+------------+------------|
|       |MEAN    |    77647.06|    35110.29|    86256.10|    25777.78|
|       |--------+------------+------------+------------+------------|
|       |STD     |    58390.47|    31307.91|    59470.88|    32858.21|
----------------------------------------------------------------------
----------------------------------------------------------------------
|                |                     SHINTYOU                      |
|                |---------------------------------------------------|
|                |    -149    |         150-159         |  160-169   |
|                |------------+-------------------------+------------|
|                |    SEX     |           SEX           |    SEX     |
|                |------------+-------------------------+------------|
|                |     F      |     F      |     M      |     F      |
|----------------+------------+------------+------------+------------|
|TAIJYUU|N       |        5.00|       40.00|        2.00|       38.00|
|       |--------+------------+------------+------------+------------|
|       |MEAN    |       41.80|       47.31|       54.50|       51.07|
|       |--------+------------+------------+------------+------------|
|       |STD     |        2.59|        4.61|        9.19|        3.47|
----------------------------------------------------------------------
(CONTINUED)
----------------------------------------------------------------------
|                |                     SHINTYOU                      |
|                |---------------------------------------------------|
|                |  160-169   |         170-179         |    180-    |
|                |------------+-------------------------+------------|
|                |    SEX     |           SEX           |    SEX     |
|                |------------+-------------------------+------------|
|                |     M      |     F      |     M      |     M      |
|----------------+------------+------------+------------+------------|
|TAIJYUU|N       |       65.00|        0.00|      151.00|       24.00|
|       |--------+------------+------------+------------+------------|
|       |MEAN    |       58.50|           .|       63.10|       67.56|
|       |--------+------------+------------+------------+------------|
|       |STD     |        7.35|           .|        7.54|        7.38|
----------------------------------------------------------------------
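The same group-wise counts, means and standard deviations can also be obtained without PROC TABULATE. As a point of comparison, here is a minimal sketch using PROC MEANS with a CLASS statement; it assumes the data set gakusei from the program above and prints the statistics as a list rather than as a boxed table.

```sas
/* Sketch: N, mean and standard deviation of kodukai by sex and jitaku. */
/* Assumes data set gakusei exists; MAXDEC=2 limits the decimal places. */
proc means data=gakusei n mean std maxdec=2;
  class sex jitaku;     /* classification variables, as in PROC TABULATE */
  var kodukai;          /* variable to be summarized                     */
run;
```

PROC TABULATE is the better fit when the layout of rows and columns matters; PROC MEANS is shorter when only the numbers are needed.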
data mon2007;
infile 'd:\home\mon05d.csv' dlm=','
firstobs=2
truncover /* or missover; use one or the other */
dsd
;
input No $ Univ : $30. SName : $40. Faculty : $50. Dept : $50.
Center1 : $8. Center2 : $8. Sel1 : $8. Sel2 : $8.
Book1 : $10. Book2 : $10.
Vol0 VolS VolT
ZenKou $ ScoreS ScoreT KoKouSi
;
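The INFILE options used for the CSV file above are easier to understand on a tiny example. The sketch below reads three made-up records from in-stream data instead of an external file; the data set name demo, the variable names and the records themselves are all invented for this illustration.

```sas
/* Sketch: effect of DSD and TRUNCOVER on comma-separated input.   */
/* demo, id, name and score are hypothetical; the data are made up. */
data demo;
  infile datalines dlm=',' dsd truncover;
  input id $ name : $20. score;
  datalines;
1,"Tokyo, Japan",80
2,,70
3,Osaka
;

proc print data=demo;
run;
```

With DSD, the comma inside the quoted string in the first record is kept as part of the value and the quotation marks are removed, and the two consecutive commas in the second record produce a missing value for name. TRUNCOVER makes the short third record yield a missing score instead of letting the INPUT statement continue onto the next line.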
data mon2007;
infile 'd:\home\mon05e.txt' dlm='09'x
firstobs=2
truncover;
data math; infile 'foo.dat' lrecl=230;
data math; infile 'foo.dat' lrecl=230 truncover;
input
kamoku $ 2
kesseki $ 3
k_code $ 10-11
t_score 12-14
s_scor01 103-104
s_scor02 105-106
s_scor03 107-108
s_scor04 109-110
;
data math; infile 'foo.dat' firstobs=4;