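The basic idea of design of experiments is to compare the variation between the levels of a factor with the variation due to disturbances (error). If the between-level variation is large relative to the error variation, we judge that the difference in the outcome (its amount) is a meaningful, that is, statistically significant, difference caused by the factor. The key device is that the total variation can be decomposed arithmetically into the between-level variation and the error variation:
\[ S_T = S_A + S_e \]
Because variation is measured by the variance, the judgment is made by an analysis of variance (ANOVA). The two sources of variation are compared through their ratio, and the test uses the fact that this variance ratio follows an F distribution under the null hypothesis H0: there is no difference between the factor levels.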
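As a worked sketch of this decomposition, assume a one-way layout with a levels and n replicates per level, and write y_ij for the j-th observation at level i. This notation, and the symbols V_A and V_e for the mean squares, are assumptions introduced here for illustration and are not taken from the notes.

```latex
\begin{align*}
S_T &= \sum_{i=1}^{a}\sum_{j=1}^{n} \left(y_{ij}-\bar{y}_{..}\right)^2, \\
S_A &= n\sum_{i=1}^{a} \left(\bar{y}_{i.}-\bar{y}_{..}\right)^2, \\
S_e &= \sum_{i=1}^{a}\sum_{j=1}^{n} \left(y_{ij}-\bar{y}_{i.}\right)^2, \\
S_T &= S_A + S_e, \\
F &= \frac{V_A}{V_e} = \frac{S_A/(a-1)}{S_e/\bigl(a(n-1)\bigr)}
   \sim F_{a-1,\;a(n-1)} \quad \text{under } H_0 .
\end{align*}
```

With the values reported in the PROC GLM listing for Lesson 15-01 (a = 4, n = 5), the decomposition reads 5.28 = 3.10 + 2.18, and F = (3.10/3)/(2.18/16) = 1.0333/0.13625, which is about 7.58 on (3, 16) degrees of freedom.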
/* Lesson 15-01 */
/* File Name = les1501.sas 02/02/21 */
options nocenter linesize=78 pagesize=30;
options locale='en_US';
/* options locale='ja_JP'; */
proc printto print = 'StatM20/les1501-Results.txt' new;
ods listing gpath='StatM20/SAS_ODS15';
data polymer;
infile 'StatM20/table811.csv'
firstobs=2
dlm=',' dsd
encoding=sjis termstr=crlf
;
input A R Y;
proc print data=polymer;
run;
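proc glm data=polymer;      /* design of experiments: one-way ANOVA */
class A;                    /* classification variable (factor levels) */
model Y = A;                /* model: response Y explained by factor A */
means A / tukey;            /* pairwise comparison of levels (multiple comparison) */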
run;
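The sums of squares that PROC GLM reports can also be recomputed by hand, which makes the decomposition S_T = S_A + S_e visible. The following is a minimal sketch, not part of the original lesson: the data set names polymer_demo and parts and the column names level_mean and grand_mean are hypothetical, and the 20 observations are typed in from the PROC PRINT listing rather than read from the CSV file.

```sas
/* Sketch only: recompute the one-way ANOVA sums of squares by hand.      */
/* polymer_demo, parts, level_mean and grand_mean are illustrative names. */
data polymer_demo;
  input A R Y @@;            /* @@ reads several observations per line */
  datalines;
1 1 10.8  1 2  9.9  1 3 10.7  1 4 10.4  1 5  9.7
2 1 10.7  2 2 10.6  2 3 11.0  2 4 10.8  2 5 10.9
3 1 11.9  3 2 11.2  3 3 11.0  3 4 11.1  3 5 11.3
4 1 11.4  4 2 10.7  4 3 10.9  4 4 11.3  4 5 11.7
;
run;

proc sql;
  /* attach the level mean and the grand mean to every observation */
  create table parts as
  select p.A, p.Y, m.level_mean, g.grand_mean
  from polymer_demo as p,
       (select A, mean(Y) as level_mean from polymer_demo group by A) as m,
       (select mean(Y) as grand_mean from polymer_demo) as g
  where p.A = m.A;

  /* total, between-level and error sums of squares */
  select sum((Y - grand_mean)**2)          as S_T format=8.2,
         sum((level_mean - grand_mean)**2) as S_A format=8.2,
         sum((Y - level_mean)**2)          as S_e format=8.2
  from parts;
quit;
```

For these 20 observations the three sums should agree with the Corrected Total, Model and Error rows of the ANOVA table (5.28, 3.10 and 2.18).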
Monday, February 1, 2021 06:21:52 PM 63
Obs A R Y
1 1 1 10.8
2 1 2 9.9
3 1 3 10.7
4 1 4 10.4
5 1 5 9.7
6 2 1 10.7
7 2 2 10.6
8 2 3 11.0
9 2 4 10.8
10 2 5 10.9
11 3 1 11.9
12 3 2 11.2
13 3 3 11.0
14 3 4 11.1
15 3 5 11.3
16 4 1 11.4
17 4 2 10.7
18 4 3 10.9
19 4 4 11.3
20 4 5 11.7
The GLM Procedure
Class Level Information
Class Levels Values
A 4 1 2 3 4
Number of Observations Read 20
Number of Observations Used 20
Dependent Variable: Y
                                Sum of
Source              DF         Squares     Mean Square    F Value    Pr > F
Model                3      3.10000000      1.03333333       7.58    0.0022
Error               16      2.18000000      0.13625000
Corrected Total     19      5.28000000

R-Square     Coeff Var      Root MSE        Y Mean
0.587121      3.386427      0.369121      10.90000

Source              DF       Type I SS     Mean Square    F Value    Pr > F
A                    3      3.10000000      1.03333333       7.58    0.0022
Source              DF     Type III SS     Mean Square    F Value    Pr > F
A                    3      3.10000000      1.03333333       7.58    0.0022
Tukey's Studentized Range (HSD) Test for Y
NOTE: This test controls the Type I experimentwise error rate, but it
generally has a higher Type II error rate than REGWQ.
Alpha 0.05
Error Degrees of Freedom 16
Error Mean Square 0.13625
Critical Value of Studentized Range 4.04606
Minimum Significant Difference 0.6679
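The minimum significant difference can be reproduced by hand. The sketch below assumes the equal-replication form of Tukey's HSD with n = 5 observations per level and writes V_e for the Error Mean Square. The level means do not appear in this part of the listing; they were computed here from the 20 observations printed by PROC PRINT.

```latex
\begin{align*}
\mathrm{MSD} &= q_{0.05}(4,16)\,\sqrt{\frac{V_e}{n}}
  = 4.04606 \times \sqrt{\frac{0.13625}{5}} \approx 0.6679, \\
\bar{y}_{1.} &= 10.30,\quad \bar{y}_{2.} = 10.80,\quad
\bar{y}_{3.} = 11.30,\quad \bar{y}_{4.} = 11.20, \\
|\bar{y}_{3.}-\bar{y}_{1.}| &= 1.00 > \mathrm{MSD}, \qquad
|\bar{y}_{4.}-\bar{y}_{1.}| = 0.90 > \mathrm{MSD}, \\
|\bar{y}_{2.}-\bar{y}_{1.}| &= 0.50,\quad
|\bar{y}_{3.}-\bar{y}_{2.}| = 0.50,\quad
|\bar{y}_{4.}-\bar{y}_{2.}| = 0.40,\quad
|\bar{y}_{3.}-\bar{y}_{4.}| = 0.10 \quad (\text{all} < \mathrm{MSD}).
\end{align*}
```

Under these assumptions, only level 1 differs significantly from levels 3 and 4 at the 5% level; no other pair of levels is separated.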
/* Lesson 15-02 */
/* File Name = les1502.sas 02/02/21 */
options nocenter linesize=78 pagesize=30;
options locale='en_US';
/* options locale='ja_JP'; */
proc printto print = 'StatM20/les1502-Results.txt' new;
ods listing gpath='StatM20/SAS_ODS15';
data polymer;
infile 'StatM20/table821.csv'
firstobs=2
dlm=',' dsd
encoding=sjis termstr=crlf
;
input A R Y;
proc print data=polymer;
run;
proc glm data=polymer;
class A;
model Y = A;
means A / tukey;
run;
/* Lesson 15-03 */
/* File Name = les1503.sas 02/02/21 */
options nocenter linesize=78 pagesize=30;
options locale='en_US';
/* options locale='ja_JP'; */
proc printto print = 'StatM20/les1503-Results.txt' new;
ods listing gpath='StatM20/SAS_ODS15';
data gakusei;
infile 'StatM20/StudAll20e.csv'
firstobs=8 dlm=',' dsd missover
encoding=sjis termstr=crlf;
input sex $ shintyou taijyuu kyoui
jitaku : $10. kodukai carryer $ tsuuwa;
if missing(shintyou) or missing(taijyuu) or missing(kyoui) then delete;   /* drop rows with a missing measurement */
if kyoui<60 then delete;
if taijyuu>85 then delete;
proc print data=gakusei(obs=5);
run;
proc means data=gakusei;
run;
/* Write the results to out_clust and request 2 clusters */
proc fastclus data=gakusei out=out_clust maxclusters=2;
var shintyou taijyuu kyoui;                    /* variables used for clustering */
run;
proc plot data=out_clust;
plot shintyou*taijyuu=cluster;                 /* plot each point as its cluster number */
plot taijyuu*kyoui=cluster;
plot kyoui*shintyou=cluster;
run;
proc print data=out_clust(obs=20);             /* print the results (format 1) */
run;
/* Output to text file: write the results to a text file (format 2) */
data _null_;                                   /* write to a file; no data set is created */
set out_clust;                                 /* data set to write out */
file 'StatM20/les1503-OutValue.txt';           /* output file name */
put shintyou taijyuu kyoui cluster distance;   /* variables to write */
run;
/* Output to CSV file: write the results to a CSV file (format 3) */
proc export data=out_clust                     /* data set to write out */
outfile= "StatM20/les1503-OutCSV.csv"          /* output file name */
dbms=CSV replace;
run;
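Once out_clust exists, the clusters can be summarized with ordinary procedures. The following is a small sketch, not part of the original program: it assumes out_clust was created by the PROC FASTCLUS step above, so that it holds the input variables together with the cluster variable.

```sas
/* Sketch: summarize the clusters stored in out_clust. */
proc means data=out_clust n mean std maxdec=2;
  class cluster;                               /* one block of rows per cluster */
  var shintyou taijyuu kyoui;
run;

proc freq data=out_clust;
  tables sex*cluster / norow nocol nopercent;  /* cell counts only */
run;
```

The per-cluster means from PROC MEANS should correspond to the Cluster Means table in the FASTCLUS listing, and the cross-tabulation shows how the two clusters relate to sex, a variable that was not used in the clustering.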
Monday, February 1, 2021 05:27:17 PM 18
The MEANS Procedure
Variable N Mean Std Dev Minimum Maximum
--------------------------------------------------------------------------
shintyou 149 168.6973154 8.3515005 146.7000000 185.0000000
taijyuu 149 59.1489933 8.9139400 35.0000000 80.0000000
kyoui 149 86.2590604 6.4126671 63.0000000 110.0000000
kodukai 138 47083.33 54619.87 0 300000.00
tsuuwa 74 6511.46 4026.49 350.0000000 25000.00
--------------------------------------------------------------------------
The FASTCLUS Procedure
Replace=FULL Radius=0 Maxclusters=2 Maxiter=1
Initial Seeds
Cluster shintyou taijyuu kyoui
-------------------------------------------------------------
1 178.0000000 78.0000000 110.0000000
2 152.0000000 35.0000000 77.0000000
Criterion Based on Final Seeds = 5.4963
Cluster Summary
                                   Maximum Distance
                        RMS Std           from Seed      Radius    Nearest
Cluster    Frequency  Deviation      to Observation    Exceeded    Cluster
-----------------------------------------------------------------------------
      1           84     5.2983             23.4776                      2
      2           65     5.7051             20.9336                      1
Cluster Summary (continued)
            Distance Between
Cluster    Cluster Centroids
-----------------------------
      1              20.1845
      2              20.1845
Statistics for Variables
Variable Total STD Within STD R-Square RSQ/(1-RSQ)
------------------------------------------------------------------
shintyou 8.35150 5.60318 0.552909 1.236682
taijyuu 8.91394 5.50325 0.621423 1.641470
kyoui 6.41267 5.32740 0.314498 0.458785
OVER-ALL 7.96509 5.47913 0.530001 1.127665
Pseudo F Statistic = 165.77
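The R-Square and Pseudo F values in this table can be checked by hand. The formulas below are a sketch that is consistent with the printed numbers; n = 149 observations and c = 2 clusters are taken from the listing, while the symbols s_W (Within STD) and s_T (Total STD) are notation introduced here.

```latex
\begin{align*}
R^2 &= 1 - \frac{n-c}{n-1}\left(\frac{s_W}{s_T}\right)^2,
 \qquad
 R^2_{\text{shintyou}} = 1 - \frac{147}{148}\left(\frac{5.60318}{8.35150}\right)^2 \approx 0.5529, \\
\text{Pseudo } F &= \frac{R^2/(c-1)}{(1-R^2)/(n-c)}
 = \frac{R^2}{1-R^2}\cdot\frac{n-c}{c-1}
 = 1.127665 \times \frac{147}{1} \approx 165.77 .
\end{align*}
```

The Pseudo F statistic is therefore the OVER-ALL value of RSQ/(1-RSQ) rescaled by the degrees of freedom, which has the same form as the F ratio of a one-way ANOVA with the clusters as levels.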
Approximate Expected Over-All R-Squared = 0.32147
Cubic Clustering Criterion = 12.325
WARNING: The two values above are invalid for correlated variables.
Cluster Means
Cluster shintyou taijyuu kyoui
-------------------------------------------------------------
1 174.1416667 65.3095238 89.4119048
2 161.6615385 51.1876923 82.1846154
Cluster Standard Deviations
Cluster shintyou taijyuu kyoui
-------------------------------------------------------------
1 4.896426244 5.472661938 5.503830096
2 6.404629485 5.542661468 5.089487657
[Plot of shintyou*taijyuu. Symbol is value of CLUSTER. Character plot not reproduced: vertical axis shintyou (140 to 200), horizontal axis taijyuu (30 to 80). NOTE: 51 obs hidden.]
[Plot of taijyuu*kyoui. Symbol is value of CLUSTER. Character plot not reproduced: vertical axis taijyuu (20 to 80), horizontal axis kyoui (60 to 110). NOTE: 61 obs hidden.]
[Plot of kyoui*shintyou. Symbol is value of CLUSTER. Character plot not reproduced: vertical axis kyoui (60 to 120), horizontal axis shintyou (140 to 190). NOTE: 44 obs hidden.]
Obs  sex  shintyou  taijyuu  kyoui  jitaku  kodukai  carryer   tsuuwa  CLUSTER  DISTANCE
  1  F       146.7     41.0     85  自宅生     10000  Vodafone    6000        2   19.9233
  2  F       148.0     43.0     80  自宅生     50000  DoCoMo      4000        2   17.6836
  3  F       150.0     46.0     86             40000                 .        2   14.8815
  4  F       151.7     41.5     80  自宅生     35000                 .        2   15.6260
  5  F       152.0     35.0     77  自宅生     60000  DoCoMo      2000        2   20.9336
  6  F       153.0     46.5     87  下宿生     10000                 .        2   12.4492
  7  F       153.0     55.0     78  自宅生     30000                 .        2   11.2980
  8  F       154.4     44.0     75  自宅生      9000  au          2000        2   13.8010
  9  F       155.0     48.0     83  下宿生    180000                 .        2    9.0430
 10  F       156.0     42.0     85  自宅生         0  DoCoMo     15000        2   12.5706
 11  F       156.0     46.0     82  自宅生     10000  Vodafone    7000        2    9.2728
 12  F       156.0     48.0     70  自宅生     30000                 .        2   14.6260
 13  F       156.0     49.0     85  自宅生     25000                 .        2    8.1843
 14  F       156.0     50.0     82  自宅生     40000  Vodafone   10000        2    7.3459
 15  M       156.0     61.0     90  自宅生         0                 .        2   13.8662
 16  F       156.5     45.0     80  下宿生     60000  au         10000        2    9.8403
 17  F       157.0     53.0     84  自宅生     30000                 .        2    6.4154
 18  F       158.0     46.0     80             27500  Willcom     3000        2    8.1473
(jitaku values: 自宅生 = student living at home, 下宿生 = student living in lodgings. Blank cells are missing character values; "." is a missing numeric value.)
/* Lesson 14-05 */
/* File Name = les1405.sas 01/26/21 */
options nocenter linesize=78 pagesize=30;
options locale='en_US';
/* options locale='ja_JP'; */
proc printto print = 'StatM20/les1405-Results.txt' new;
ods listing gpath='StatM20/SAS_ODS14';
title "Sashelp.iris --- Fisher's Iris Data (1936)";
proc contents data=sashelp.iris varnum;        /* show the variable information of the data set */
ods select position;                           /* note also how the data set is specified (sashelp.iris) */
run;
title "The First Five Observations Out of 150";
proc print data=sashelp.iris(obs=5) noobs;     /* print the first 5 observations */
run;
title "The Species Variable";
proc freq data=sashelp.iris;                   /* frequency table */
tables Species;
run;
proc fastclus data=sashelp.iris out=out_clust maxclusters=3;   /* cluster analysis */
var SepalLength SepalWidth PetalLength PetalWidth;
run;
proc plot data=out_clust;
plot SepalLength*SepalWidth=cluster;
plot SepalLength*PetalLength=cluster;
plot SepalLength*PetalWidth=cluster;
plot SepalWidth*PetalLength=cluster;
plot SepalWidth*PetalWidth=cluster;
plot PetalLength*PetalWidth=cluster;
run;
title "Scatterplot Matrix for Iris Data";
proc sgscatter data=sashelp.iris;              /* [extra 1] scatter plot matrix */
matrix SepalLength SepalWidth PetalLength PetalWidth
/ group=Species;
run;
title "Scatterplot Matrix with histogram for Iris Data";
proc sgscatter data=sashelp.iris;              /* [extra 2] scatter plot matrix with histograms */
matrix SepalLength SepalWidth PetalLength PetalWidth
/ group=Species diagonal=(kernel histogram);
run;
title;
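Because the iris data carry the true species label, the three clusters found by PROC FASTCLUS can be compared with it directly. The following is a minimal sketch, not part of the original program: it assumes out_clust is the OUT= data set created by the PROC FASTCLUS step above, which keeps the Species variable from sashelp.iris alongside the cluster variable.

```sas
/* Sketch: cross-tabulate the known species against the assigned cluster. */
proc freq data=out_clust;
  tables Species*cluster / norow nocol nopercent;   /* cell counts only */
run;
```

A species that falls almost entirely into one column is well recovered by the clustering, while a species spread over two columns is one that the four measurements separate less cleanly.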
Sashelp.iris --- Fisher's Iris Data (1936) 242
Monday, January 25, 2021 06:31:16 PM
The CONTENTS Procedure
Variables in Creation Order
# Variable Type Len Label
1 Species Char 10 Iris Species
2 SepalLength Num 8 Sepal Length (mm)
3 SepalWidth Num 8 Sepal Width (mm)
4 PetalLength Num 8 Petal Length (mm)
5 PetalWidth Num 8 Petal Width (mm)
The First Five Observations Out of 150
Sepal Sepal Petal Petal
Species Length Width Length Width
Setosa 50 33 14 2
Setosa 46 34 14 3
Setosa 46 36 10 2
Setosa 51 33 17 5
Setosa 55 35 13 2
The Species Variable
The FREQ Procedure
Iris Species
Cumulative Cumulative
Species Frequency Percent Frequency Percent
---------------------------------------------------------------
Setosa 50 33.33 50 33.33
Versicolor 50 33.33 100 66.67
Virginica 50 33.33 150 100.00
The FASTCLUS Procedure
Replace=FULL Radius=0 Maxclusters=3 Maxiter=1
Initial Seeds
Cluster SepalLength SepalWidth PetalLength PetalWidth
---------------------------------------------------------------------------
1 77.00000000 38.00000000 67.00000000 22.00000000
2 57.00000000 44.00000000 15.00000000 4.00000000
3 49.00000000 25.00000000 45.00000000 17.00000000
Criterion Based on Final Seeds = 3.7097
Cluster Summary
                                   Maximum Distance
                        RMS Std           from Seed      Radius    Nearest
Cluster    Frequency  Deviation      to Observation    Exceeded    Cluster
-----------------------------------------------------------------------------
      1           33     3.8831             12.9226                      3
      2           50     2.7803             12.4803                      3
      3           67     4.1797             18.5320                      1
Cluster Summary (continued)
            Distance Between
Cluster    Cluster Centroids
-----------------------------
      1              18.3409
      2              34.2516
      3              18.3409
Statistics for Variables
Variable Total STD Within STD R-Square RSQ/(1-RSQ)
---------------------------------------------------------------------
SepalLength 8.28066 4.48242 0.710915 2.459187
SepalWidth 4.35866 3.24819 0.452092 0.825123
PetalLength 17.65298 4.29764 0.941527 16.101961
PetalWidth 7.62238 2.38707 0.903243 9.335201
OVER-ALL 10.69224 3.70171 0.881751 7.456709
Pseudo F Statistic = 548.07
Approximate Expected Over-All R-Squared = 0.62728
Cubic Clustering Criterion = 24.559
WARNING: The two values above are invalid for correlated variables.
Cluster Means
Cluster SepalLength SepalWidth PetalLength PetalWidth
---------------------------------------------------------------------------
1 69.00000000 30.96969697 58.27272727 21.27272727
2 50.06000000 34.28000000 14.62000000 2.46000000
3 59.47761194 27.61194030 44.52238806 14.53731343
Cluster Standard Deviations
Cluster SepalLength SepalWidth PetalLength PetalWidth
---------------------------------------------------------------------------
1 5.012484414 2.909948974 4.577613511 2.401467354
2 3.524896872 3.790643691 1.736639965 1.053855894
3 4.831582365 2.953966126 5.360795421 3.011736428
[Plot of SepalLength*SepalWidth. Symbol is value of CLUSTER. Character plot not reproduced: vertical axis Sepal Length (mm) (36 to 84), horizontal axis Sepal Width (mm) (20 to 45). NOTE: 64 obs hidden.]
[Plot of SepalLength*PetalLength. Symbol is value of CLUSTER. Character plot not reproduced: vertical axis Sepal Length (mm) (36 to 84), horizontal axis Petal Length (mm) (10 to 70). NOTE: 53 obs hidden.]
[Plot of SepalLength*PetalWidth. Symbol is value of CLUSTER. Character plot not reproduced: vertical axis Sepal Length (mm) (36 to 84), horizontal axis Petal Width (mm) (0 to 25). NOTE: 74 obs hidden.]
[Plot of SepalWidth*PetalLength. Symbol is value of CLUSTER. Character plot not reproduced: vertical axis Sepal Width (mm) (20 to 50), horizontal axis Petal Length (mm) (10 to 70). NOTE: 39 obs hidden.]
[Plot of SepalWidth*PetalWidth. Symbol is value of CLUSTER. Character plot not reproduced: vertical axis Sepal Width (mm) (20 to 50), horizontal axis Petal Width (mm) (0 to 25). NOTE: 69 obs hidden.]
[Plot of PetalLength*PetalWidth. Symbol is value of CLUSTER. Character plot not reproduced: vertical axis Petal Length (mm) (0 to 72), horizontal axis Petal Width (mm) (0 to 25). NOTE: 88 obs hidden.]