SAS output for exemplar 2  

This file was produced using the SAS html output with the minimal style. You can also view the program that created this output.

Some comments have been added to the output in blue, preceded by ****

Links in this page

Simple means and proportions Table 2.3
Effect of design aspects on precision of estimates table 2.4
Tables 2.6 onwards
Nice table layout from PROC TABULATE

The SAS System

The SURVEYMEANS Procedure

**** Simple means and proportions

Data Summary
Number of Strata 281
Number of Clusters 11937
Number of Observations 28685
Sum of Weights 28642.3332
 
Statistics
| Variable | Label | N | Mean | Std Error of Mean | Lower 95% CL for Mean | Upper 95% CL for Mean |
|---|---|---|---|---|---|---|
| intuse | Whether person uses the --internet | 28685 | 0.341564 | 0.003405 | 0.334891 | 0.348238 |
Hours of internet use for users

The SURVEYMEANS Procedure

Data Summary
Number of Strata 281
Number of Clusters 11937
Number of Observations 28685
Sum of Weights 28642.3332
 
Class Level Information
| Class Variable | Label | Levels | Values |
|---|---|---|---|
| RC5 | Time spent using internet each week | 5 | Up to 1 hour a week; Over 1 hour, up to 5 hours; Over 5 hours up to 10 hours; Over 10 hours up to 20 hours; Over 20 hours |
 
Statistics
| Variable | N | Mean | Std Error of Mean | Lower 95% CL for Mean | Upper 95% CL for Mean |
|---|---|---|---|---|---|
| RC5=Up to 1 hour a week | 3664 | 0.408287 | 0.006082 | 0.396363 | 0.420210 |
| RC5=Over 1 hour, up to 5 hours | 3581 | 0.406608 | 0.006038 | 0.394770 | 0.418445 |
| RC5=Over 5 hours up to 10 hours | 964 | 0.108295 | 0.003764 | 0.100916 | 0.115673 |
| RC5=Over 10 hours up to 20 hours | 428 | 0.050780 | 0.002755 | 0.045378 | 0.056181 |
| RC5=Over 20 hours | 225 | 0.026031 | 0.001975 | 0.022160 | 0.029902 |

**** This gives proportions in each category when a formatted variable is used in SURVEYMEANS

internet use by sex

The SURVEYMEANS Procedure

Data Summary
Number of Strata 281
Number of Clusters 11937
Number of Observations 28685
Sum of Weights 28642.3332
 
Statistics
| Variable | Label | N | Mean | Std Error of Mean | Lower 95% CL for Mean | Upper 95% CL for Mean |
|---|---|---|---|---|---|---|
| intuse | Whether person uses the --internet | 28685 | 0.341564 | 0.003405 | 0.334891 | 0.348238 |
 
Domain Analysis: sex
| sex | Variable | N | Mean | Std Error of Mean | Lower 95% CL for Mean | Upper 95% CL for Mean |
|---|---|---|---|---|---|---|
| male | intuse | 12174 | 0.385159 | 0.005124 | 0.375116 | 0.395203 |
| female | intuse | 16511 | 0.307053 | 0.004330 | 0.298567 | 0.315540 |

 

hours used by sex

The SURVEYMEANS Procedure

Data Summary
Number of Strata 281
Number of Clusters 11937
Number of Observations 28685
Sum of Weights 28642.3332
 
Class Level Information
| Class Variable | Label | Levels | Values |
|---|---|---|---|
| RC5 | Time spent using internet each week | 5 | 1 2 3 4 5 |
 
Statistics
| Variable | N | Mean | Std Error of Mean | Lower 95% CL for Mean | Upper 95% CL for Mean |
|---|---|---|---|---|---|
| RC5=1 | 3664 | 0.408287 | 0.006082 | 0.396363 | 0.420210 |
| RC5=2 | 3581 | 0.406608 | 0.006038 | 0.394770 | 0.418445 |
| RC5=3 | 964 | 0.108295 | 0.003764 | 0.100916 | 0.115673 |
| RC5=4 | 428 | 0.050780 | 0.002755 | 0.045378 | 0.056181 |
| RC5=5 | 225 | 0.026031 | 0.001975 | 0.022160 | 0.029902 |
 
Domain Analysis: sex
| sex | Variable | N | Mean | Std Error of Mean | Lower 95% CL for Mean | Upper 95% CL for Mean |
|---|---|---|---|---|---|---|
| male | RC5=1 | 1553 | 0.356047 | 0.008349 | 0.339680 | 0.372414 |
| male | RC5=2 | 1770 | 0.401499 | 0.008478 | 0.384879 | 0.418119 |
| male | RC5=3 | 590 | 0.132755 | 0.005789 | 0.121405 | 0.144104 |
| male | RC5=4 | 300 | 0.073166 | 0.004620 | 0.064108 | 0.082224 |
| male | RC5=5 | 155 | 0.036533 | 0.003345 | 0.029975 | 0.043091 |
| female | RC5=1 | 2111 | 0.460160 | 0.008680 | 0.443144 | 0.477176 |
| female | RC5=2 | 1811 | 0.411680 | 0.008559 | 0.394900 | 0.428461 |
| female | RC5=3 | 374 | 0.084006 | 0.004766 | 0.074663 | 0.093350 |
| female | RC5=4 | 128 | 0.028550 | 0.002867 | 0.022930 | 0.034170 |
| female | RC5=5 | 70 | 0.015603 | 0.002091 | 0.011504 | 0.019702 |

Results for the effect of different designs here

clustering no stratification

**** These analyses compare the effect that different designs would have had on the precision of estimating mean internet use. SAS does not have an option to calculate design effects, but for a simple proportion they can easily be calculated by hand. See below.

The SURVEYMEANS Procedure

Data Summary
Number of Clusters 11937
Number of Observations 28685
Sum of Weights 28642.3332
 
Statistics
| Variable | Label | N | Mean | Std Error of Mean | Lower 95% CL for Mean | Upper 95% CL for Mean |
|---|---|---|---|---|---|---|
| intuse | Whether person uses the --internet | 28685 | 0.341564 | 0.003774 | 0.334166 | 0.348962 |

**** The design effect is calculated as follows. The variance of a simple proportion estimated from a simple random sample of size 28685 is (0.341564)(1-0.341564)/28685. Its square root, 0.00280, would be the s.e. under simple random sampling. So the design factor is 0.003774/0.002800 = 1.3478 and the design effect (its square) is 1.8167. This agrees with output from Stata and R. You can do similar calculations for the other designs here.

stratification no clustering

The SURVEYMEANS Procedure

Data Summary
Number of Strata 281
Number of Observations 28685
Sum of Weights 28642.3332
 
Statistics
| Variable | Label | N | Mean | Std Error of Mean | Lower 95% CL for Mean | Upper 95% CL for Mean |
|---|---|---|---|---|---|---|
| intuse | Whether person uses the --internet | 28685 | 0.341564 | 0.003161 | 0.335369 | 0.347759 |

**** The design effect is calculated in the same way. The variance of a simple proportion estimated from a simple random sample of size 28685 is (0.341564)(1-0.341564)/28685. Its square root, 0.00280, would be the s.e. under simple random sampling.

Here the design factor is 0.003161/0.002800 = 1.13 and the design effect (its square) is 1.27.

no stratification or clustering

The SURVEYMEANS Procedure

Data Summary
Number of Observations 28685
Sum of Weights 28642.3332
 
Statistics
| Variable | Label | N | Mean | Std Error of Mean | Lower 95% CL for Mean | Upper 95% CL for Mean |
|---|---|---|---|---|---|---|
| intuse | Whether person uses the --internet | 28685 | 0.341564 | 0.003267 | 0.335161 | 0.347968 |

**** The design effect is calculated in the same way. The variance of a simple proportion estimated from a simple random sample of size 28685 is (0.341564)(1-0.341564)/28685. Its square root, 0.00280, would be the s.e. under simple random sampling.

Here the design factor is 0.003267/0.002800 = 1.17 and the design effect (its square) is 1.36.
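The hand calculations for the three designs can be checked with a few lines of Python. This is only a sketch of the arithmetic described in the comments, not part of the SAS program; the function name `design_effect` and the dictionary of design labels are made up for illustration, and the standard errors are copied from the SURVEYMEANS output shown in this section.

```python
import math

def design_effect(p, n, se_design):
    """Return (SRS standard error, design factor, design effect) for a proportion.

    p         -- estimated proportion
    n         -- number of observations
    se_design -- standard error reported under the survey design
    """
    srs_se = math.sqrt(p * (1 - p) / n)   # s.e. under simple random sampling
    deft = se_design / srs_se             # design factor
    return srs_se, deft, deft ** 2        # design effect is the square of the factor

p, n = 0.341564, 28685

# Standard errors of the mean of intuse, copied from the three outputs above.
designs = {
    "clustering, no stratification": 0.003774,
    "stratification, no clustering": 0.003161,
    "no stratification or clustering": 0.003267,
}

results = {name: design_effect(p, n, se) for name, se in designs.items()}

for name, (srs_se, deft, deff) in results.items():
    print(f"{name}: SRS s.e. = {srs_se:.5f}, "
          f"design factor = {deft:.2f}, design effect = {deff:.2f}")
```

Because the code keeps the unrounded simple random sampling s.e. rather than 0.002800, the results can differ from the hand calculations in the last decimal place.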

Now results for Tables .

tables with wrong chi square results

The FREQ Procedure

Weighted frequency tables are OK, but the tests are wrong and the totals are sums of weights, not counts of actual respondents.

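Table of intuse by sex (cells show Frequency and Row Pct)

| intuse (Whether person uses the --internet) | Statistic | sex 1 | sex 2 | Total |
|---|---|---|---|---|
| 0 | Frequency | 7781.1 | 11078 | 18859 |
| 0 | Row Pct | 41.26 | 58.74 | |
| 1 | Frequency | 4874.4 | 4908.8 | 9783.2 |
| 1 | Row Pct | 49.82 | 50.18 | |
| Total | Frequency | 12655.5 | 15986.9 | 28642.3 |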

Statistics for Table of intuse by sex

 
| Statistic | DF | Value | Prob |
|---|---|---|---|
| Chi-Square | 1 | 191.6082 | <.0001 |
| Likelihood Ratio Chi-Square | 1 | 191.1100 | <.0001 |
| Continuity Adj. Chi-Square | 1 | 191.2610 | <.0001 |
| Mantel-Haenszel Chi-Square | 1 | 191.6015 | <.0001 |
| Phi Coefficient | | -0.0818 | |
| Contingency Coefficient | | 0.0815 | |
| Cramer's V | | -0.0818 | |
 
Fisher's Exact Test
Cell (1,1) Frequency (F) 7781
Left-sided Pr <= F .
Right-sided Pr >= F .
   
Table Probability (P) .
Two-sided Pr <= P .

Sample Size = 28642.333183
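
To see why these tests should not be trusted, it may help to note that the Pearson chi-square reported above can be reproduced, to within the rounding of the displayed cells, from the weighted cell totals alone. The sketch below is plain Python, not SAS, and simply treats the sums of weights as if they were counts from a simple random sample; nothing about strata or clusters enters the calculation. The variable names are illustrative.

```python
# Weighted cell totals copied from the table of intuse by sex
# (rows: intuse = 0, 1; columns: sex = 1, 2).
table = [
    [7781.1, 11078.0],
    [4874.4, 4908.8],
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
total = sum(row_totals)   # a sum of weights, not a count of respondents

# Ordinary Pearson chi-square: sum over cells of (observed - expected)^2 / expected.
chi_square = 0.0
for i, row in enumerate(table):
    for j, observed in enumerate(row):
        expected = row_totals[i] * col_totals[j] / total
        chi_square += (observed - expected) ** 2 / expected

print(f"total = {total:.1f}, chi-square = {chi_square:.2f}")
```

The statistic depends only on the weighted totals, so it takes no account of the stratification or clustering of the sample.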

PROC TABULATE output: nice table layout

Time spent using internet each week (%)

| | no internet use | Up to 1 hour a week | Over 1 hour, up to 5 hours | Over 5 hours up to 10 hours | Over 10 hours up to 20 hours | Over 20 hours | All | base |
|---|---|---|---|---|---|---|---|---|
| Missing data | 65.4 | 8.8 | 16.6 | 9.2 | . | . | 100 | 46 |
| Urban settlements of over 125,000 pop | 65.4 | 12.7 | 14.8 | 4.1 | 1.9 | 1.1 | 100 | 10298 |
| Other urban | 68.0 | 14.1 | 12.1 | 3.4 | 1.7 | 0.8 | 100 | 8352 |
| Small access towns,3-10k | 63.8 | 15.9 | 14.6 | 3.8 | 1.1 | 0.7 | 100 | 2937 |
| Small remote towns, pop 3-10k | 67.2 | 13.7 | 13.1 | 3.1 | 2.3 | 0.7 | 100 | 1290 |
| Accessible rural, pop<3k, drive<30 | 63.8 | 15.3 | 15.0 | 3.5 | 2.0 | 0.4 | 100 | 3264 |
| Remote rural, pop<3k, drive>30 | 65.0 | 16.0 | 13.8 | 2.8 | 1.4 | 1.0 | 100 | 2498 |