Stata commands are shown in white, output in green and yellow, and warnings in red.
Comments on the interpretation of the output are in blue.
For comments on running the analyses, go to the commented code file.
. svyset [pwei=ind_wt],psu(psu) strata(stratum)
pweight is ind_wt
strata is stratum
psu is psu
. svyprop intuse
------------------------------------------------------------------------
pweight:  ind_wt                            Number of obs    =     28685
Strata:   stratum                           Number of strata =       281
PSU:      psu                               Number of PSUs   =     11937
                                            Population size  = 28642.333
------------------------------------------------------------------------

Here the weights sum approximately to the sample size, because they are not grossing-up weights. The population total is therefore misleading, but the results are OK.

Survey proportions estimation

+-------------------------------------------+
|   intuse    Obs    Est. Prop.   Std. Err. |
|-------------------------------------------|
|        0  19823      0.658436    0.003405 |
|        1   8862      0.341564    0.003405 |
+-------------------------------------------+

. svymean intuse , ci deff

To get design effects, we need the mean of the 0/1 variable.

Survey mean estimation

pweight:  ind_wt                            Number of obs    =     28685
Strata:   stratum                           Number of strata =       281
PSU:      psu                               Number of PSUs   =     11937
                                            Population size  = 28642.333

------------------------------------------------------------------------
    Mean |   Estimate    Std. Err.   [95% Conf. Interval]        Deff
---------+--------------------------------------------------------------
  intuse |   .3415642    .0034046    .3348905    .3482378    1.478395
------------------------------------------------------------------------

. svymean intuse, deff deft by(sex)

Survey mean estimation

pweight:  ind_wt                            Number of obs    =     28685
Strata:   stratum                           Number of strata =       281
PSU:      psu                               Number of PSUs   =     11937
                                            Population size  = 28642.333

-----------------------------------------------------------------
Mean   Subpop. |   Estimate    Std. Err.        Deff        Deft
---------------+-------------------------------------------------
intuse         |
          male |   .3851593    .0051239    1.405104    1.185371
        female |   .3070535    .0043296    1.410531    1.187658
-----------------------------------------------------------------

. svymean rc5
stratum with only one PSU detected

This failure happens because restricting the analysis to internet users leaves at least one stratum with a single (lonely) PSU. The next command shows which strata are affected. It would be tiresome to sort them all out.

. svydes if intuse==1

pweight:  ind_wt
Strata:   stratum
PSU:      psu
                                      #Obs per PSU
 Strata                       ----------------------------
 stratum     #PSUs     #Obs       min      mean       max
--------  --------  --------  --------  --------  --------
    100A       110       110         1       1.0         1
    100B        60        60         1       1.0         1
(lines omitted here)
    180E        10        10         1       1.0         1
    180F        18        18         1       1.0         1
    180G        17        17         1       1.0         1
    180H        1*         1         1       1.0         1
    180I        63        63         1       1.0         1
(more lines omitted here; a few more lonely PSUs appear in them)

A * indicates a lonely PSU.

. svyset [pweight=ind_wt], clear( strata psu pweight )
pweight is ind_wt

. svymean intuse , ci deff

Survey mean estimation

pweight:  ind_wt                            Number of obs    =     28685
Strata:   <one>                             Number of strata =         1
PSU:      <observations>                    Number of PSUs   =     28685
                                            Population size  = 28642.333

------------------------------------------------------------------------
    Mean |   Estimate    Std. Err.   [95% Conf. Interval]        Deff
---------+--------------------------------------------------------------
  intuse |   .3415642    .0032671    .3351606    .3479678    1.361344
------------------------------------------------------------------------

. svyset ,psu(psu)
pweight is ind_wt
psu is psu

. svymean intuse , ci deff

Survey mean estimation

pweight:  ind_wt                            Number of obs    =     28685
Strata:   <one>                             Number of strata =         1
PSU:      psu                               Number of PSUs   =     11937
                                            Population size  = 28642.333

------------------------------------------------------------------------
    Mean |   Estimate    Std. Err.   [95% Conf. Interval]        Deff
---------+--------------------------------------------------------------
  intuse |   .3415642    .0037742     .334166    .3489623    1.816821
------------------------------------------------------------------------

. svyset, clear(psu) strata(stratum)
pweight is ind_wt
strata is stratum
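The Deff column in the three svymean runs so far can be roughly reproduced by hand. The sketch below is not Stata's algorithm: it assumes the simple-random-sampling standard error can be approximated by sqrt(p(1-p)/n), and takes the design effect as the squared ratio of the reported standard error to that value. The variable names are illustrative; the figures are copied from the output above.

```python
import math

# Figures copied from the svymean output above.
n = 28685        # number of observations
p = 0.3415642    # estimated proportion using the internet

# Assumption: approximate the simple-random-sampling standard error by
# sqrt(p(1-p)/n). Stata's own calculation may differ in small details.
se_srs = math.sqrt(p * (1 - p) / n)

# design label -> (reported Std. Err., reported Deff)
designs = {
    "weights, strata and PSUs": (0.0034046, 1.478395),
    "weights only": (0.0032671, 1.361344),
    "weights and PSUs": (0.0037742, 1.816821),
}

approx_deff = {}
for label, (se, reported) in designs.items():
    # Design effect as the ratio of design variance to SRS variance.
    approx_deff[label] = (se / se_srs) ** 2
    print(f"{label:26s} approx deff = {approx_deff[label]:.3f}"
          f"  (reported {reported:.3f})")
```

Under this approximation the weights alone already push the ratio above 1, and adding the PSUs raises it further. The full design falls between the two.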
. svymean intuse , ci deff

Survey mean estimation

pweight:  ind_wt                            Number of obs    =     28685
Strata:   stratum                           Number of strata =       281
PSU:      <observations>                    Number of PSUs   =     28685
                                            Population size  = 28642.333

------------------------------------------------------------------------
    Mean |   Estimate    Std. Err.   [95% Conf. Interval]        Deff
---------+--------------------------------------------------------------
  intuse |   .3415642    .0031608    .3353689    .3477594    1.274203
------------------------------------------------------------------------

. svyset,psu(psu)
pweight is ind_wt
strata is stratum
psu is psu

This restores the actual, correct design.

. svytab intuse sex, count row percent

pweight:  ind_wt                            Number of obs    =     28685
Strata:   stratum                           Number of strata =       281
PSU:      psu                               Number of PSUs   =     11937
                                            Population size  = 28642.333

-------------------------------------
Whether   |
person    |
uses the  |
internet  |           sex
          |    male   female    Total
----------+--------------------------
        0 |    7781  1.1e+04  1.9e+04
          |   41.26    58.74      100
          |
        1 |    4874     4909     9783
          |   49.82    50.18      100
          |
    Total | 1.3e+04  1.6e+04  2.9e+04
          |   44.18    55.82      100
-------------------------------------
  Key:  weighted counts
        row percentages

  Pearson:
    Uncorrected   chi2(1)         =  191.8936
    Design-based  F(1, 11656)     =  143.8122     P = 0.0000

The counts in this table are weighted counts, i.e. the sum of the weights in each cell. This is not very helpful, especially since we do not have grossing-up weights here. The actual base numbers in each row would be much more useful. The format of the counts is also poor, but it is easily fixed with the format() option (see the next table).

. svytab intuse rc5,
stratum with only one PSU detected
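The "stratum with only one PSU detected" failure can be anticipated by counting the distinct PSUs left in each stratum after subsetting, which is essentially what the svydes listing reports. A minimal sketch follows. The function name is hypothetical, and the toy records borrow stratum labels from the svydes listing but use invented PSU identifiers.

```python
from collections import defaultdict

def lonely_strata(records):
    """Return the strata that contain exactly one distinct PSU.

    records: iterable of (stratum, psu) pairs, one per observation kept
    in the analysis (for example, internet users only).
    """
    psus = defaultdict(set)
    for stratum, psu in records:
        psus[stratum].add(psu)
    return sorted(s for s, ids in psus.items() if len(ids) == 1)

# Hypothetical toy data: stratum labels echo the svydes listing, but the
# PSU identifiers are invented for illustration.
sample = [
    ("180G", 1), ("180G", 2),
    ("180H", 7),
    ("180I", 3), ("180I", 4), ("180I", 4),
]
print(lonely_strata(sample))
```

On this toy input only stratum 180H is flagged, because every other stratum keeps at least two distinct PSUs.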
. svytab sex groc, count row percent format(%10.2f)

pweight:  ind_wt                            Number of obs    =     28685
Strata:   stratum                           Number of strata =       281
PSU:      psu                               Number of PSUs   =     11937
                                            Population size  = 28642.333

----------------------------------------
          |            groc
      sex |        0         1     Total
----------+-----------------------------
     male | 12310.71    344.77  12655.48
          |    97.28      2.72    100.00
          |
   female | 15513.40    473.45  15986.86
          |    97.04      2.96    100.00
          |
    Total | 27824.11    818.22  28642.33
          |    97.14      2.86    100.00
----------------------------------------
  Key:  weighted counts
        row percentages

  Pearson:
    Uncorrected   chi2(1)         =    1.4349
    Design-based  F(1, 11656)     =    1.0726     P = 0.3004

. svymean intuse, ci deff by(council)

Survey mean estimation

pweight:  ind_wt                            Number of obs    =     28685
Strata:   stratum                           Number of strata =       281
PSU:      psu                               Number of PSUs   =     11937
                                            Population size  = 28642.333

------------------------------------------------------------------------------
Mean   Subpop. |   Estimate    Std. Err.   [95% Conf. Interval]        Deff
---------------+--------------------------------------------------------------
intuse         |
      Aberdeen |   .4259058     .015494    .3955349    .4562766    1.285795
      Clackman |    .300511    .0301624    .2413875    .3596344    1.158457
      Orkney_I |   .3083178    .0301204    .2492768    .3673588    .4625155
      PerthKin |   .3762066    .0225993    .3319081     .420505    1.634168
      Renfrews |   .3161091    .0174258    .2819515    .3502666    1.417949
(only a few rows shown)
------------------------------------------------------------------------------

. svymean intuse, ci deff by(council) srssubpop

This gives design effects with respect to simple random sampling within the subgroups.

Survey mean estimation

pweight:  ind_wt                            Number of obs    =     28685
Strata:   stratum                           Number of strata =       281
PSU:      psu                               Number of PSUs   =     11937
                                            Population size  = 28642.333

------------------------------------------------------------------------------
Mean   Subpop. |   Estimate    Std. Err.   [95% Conf. Interval]        Deff
---------------+--------------------------------------------------------------
intuse         |
      Aberdeen |   .4259058     .015494    .3955349    .4562766    1.141851
      Clackman |    .300511    .0301624    .2413875    .3596344    2.246256
      Orkney_I |   .3083178    .0301204    .2492768    .3673588    2.612066
      PerthKin |   .3762066    .0225993    .3319081     .420505    1.466841
      Renfrews |   .3161091    .0174258    .2819515    .3502666    1.261359
(only a few rows shown)
------------------------------------------------------------------------------

. tabulate groupinc, generate(groupinc)

This creates the dummy variables needed for the regression.

   groupinc |      Freq.     Percent        Cum.
------------+-----------------------------------
    missing |        756        2.64        2.64
  under 10K |      8,840       30.82       33.45
     10-20K |     10,206       35.58       69.03
     20-30k |      5,472       19.08       88.11
     30-50k |      2,876       10.03       98.13
       50K+ |        535        1.87      100.00
------------+-----------------------------------
      Total |     28,685      100.00

. svylogit intuse groupinc3-groupinc6 if groupinc>0,prob deff deft or

Survey logistic regression

pweight:  ind_wt                            Number of obs    =     28685
Strata:   stratum                           Number of strata =       281
PSU:      psu                               Number of PSUs   =     11937
                                            Population size  = 28642.333
                                            F(   4,  11653)  =    686.78
                                            Prob > F         =    0.0000

------------------------------------------------------------------------------
      intuse | Odds Ratio   Std. Err.       t    P>|t|        Deff        Deft
-------------+----------------------------------------------------------------
   groupinc3 |     2.0628   .0916902    16.29    0.000    1.245006    1.115799
   groupinc4 |   5.224951   .2463328    35.07    0.000    1.322352    1.149936
   groupinc5 |   13.08222   .7537283    44.63    0.000    1.420008    1.191641
   groupinc6 |   22.63186   2.864421    24.65    0.000    1.587204    1.259843
------------------------------------------------------------------------------
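The columns of the odds-ratio table are linked by simple arithmetic, which is worth checking when reading such output. The sketch below assumes the usual delta-method relationship, se(log OR) = se(OR) / OR, so that the t statistic is log(OR) divided by that quantity. It also checks that Deft is the square root of Deff. The helper name is illustrative; the figures are copied from the table above.

```python
import math

# variable -> (odds ratio, std. err. of the odds ratio, reported t,
#              reported Deff, reported Deft), copied from the table above
rows = {
    "groupinc3": (2.0628, 0.0916902, 16.29, 1.245006, 1.115799),
    "groupinc4": (5.224951, 0.2463328, 35.07, 1.322352, 1.149936),
    "groupinc5": (13.08222, 0.7537283, 44.63, 1.420008, 1.191641),
    "groupinc6": (22.63186, 2.864421, 24.65, 1.587204, 1.259843),
}

def t_from_odds_ratio(odds_ratio, se_odds_ratio):
    """Recover the t statistic on the log-odds scale.

    Assumes the delta method: se(log OR) = se(OR) / OR.
    """
    return math.log(odds_ratio) / (se_odds_ratio / odds_ratio)

for name, (odds_ratio, se, t_reported, deff, deft) in rows.items():
    t = t_from_odds_ratio(odds_ratio, se)
    print(f"{name}: t = {t:.2f} (reported {t_reported}), "
          f"sqrt(Deff) = {math.sqrt(deff):.6f} (reported Deft {deft})")
```

The test is carried out on the coefficient (log-odds) scale, which is why the t values cannot be obtained by dividing the odds ratio by its own standard error.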
. logistic intuse groupinc3-groupinc6 if groupinc>0, or

This is a simple unweighted regression.

Logistic regression                               Number of obs   =      28685
                                                  LR chi2(4)      =    4898.43
                                                  Prob > chi2     =     0.0000
Log likelihood = -15285.327                       Pseudo R2       =     0.1381

------------------------------------------------------------------------------
      intuse | Odds Ratio   Std. Err.       z    P>|z|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
   groupinc3 |   2.549186   .0980338    24.33    0.000    2.364106    2.748756
   groupinc4 |   6.627238   .2735814    45.81    0.000    6.112148    7.185737
   groupinc5 |   15.67939   .7972235    54.13    0.000     14.1922    17.32243
   groupinc6 |    26.5441   2.922164    29.78    0.000    21.39251    32.93626
------------------------------------------------------------------------------

. tabulate shs_6cla, generate (rural)

            Urban/rural classification |      Freq.     Percent        Cum.
--------------------------------------+-----------------------------------
Urban settlements of over 125,000 pop |     10,298       35.96       35.96
                          Other urban |      8,352       29.16       65.12
             Small access towns,3-10k |      2,937       10.26       75.38
        Small remote towns, pop 3-10k |      1,290        4.50       79.88
    Accessible rural, pop<3k, drive>0 |      3,264       11.40       91.28
       Remote rural, pop<3k, drive>30 |      2,498        8.72      100.00
--------------------------------------+-----------------------------------
                                Total |     28,639      100.00

. svylogit intuse groupinc3-groupinc6 rural2-rural6 if groupinc>0,prob or

Survey logistic regression

pweight:  ind_wt                            Number of obs    =     28639
Strata:   stratum                           Number of strata =       281
PSU:      psu                               Number of PSUs   =     11922
                                            Population size  = 28593.511
                                            F(   9,  11633)  =    306.37
                                            Prob > F         =    0.0000

------------------------------------------------------------------------------
      intuse | Odds Ratio   Std. Err.       t    P>|t|
-------------+----------------------------------------------------------------
   groupinc3 |    2.08292   .0927013    16.49    0.000
   groupinc4 |   5.308005   .2510165    35.30    0.000
   groupinc5 |   13.30205   .7706116    44.67    0.000
   groupinc6 |   22.71765   2.889008    24.56    0.000
      rural2 |   .8277699   .0339711    -4.61    0.000
      rural3 |   .9343237   .0535642    -1.18    0.236
      rural4 |   .9006713   .0927712    -1.02    0.310
      rural5 |   .8625718   .0452299    -2.82    0.005
      rural6 |   1.017198   .0685998     0.25    0.800
------------------------------------------------------------------------------

. bspline,x(age) power(3) gen(bs)
unrecognized command: bspline

You will get this error if you have not installed the spline package from the internet via the help.

. bspline,x(age) power(3) gen(bs)
(1 missing value generated)

This approach could be extended to produce more complicated fits like the one shown on the exemplar main page. We show only a simple spline fit here.

. svymlogit intuse bs1-bs4,noconst deft deff

Survey multinomial logistic regression

pweight:  ind_wt                            Number of obs    =     28684
Strata:   stratum                           Number of strata =       281
PSU:      psu                               Number of PSUs   =     11937
                                            Population size  = 28641.023
                                            F(   4,  11653)  =    748.87
                                            Prob > F         =    0.0000

------------------------------------------------------------------------------
      intuse |      Coef.   Std. Err.        Deff        Deft
-------------+----------------------------------------------------------------
         bs1 |   3.267213   1.556268    1.487014    1.219432
         bs2 |  -.5498629     .41734    1.361043    1.166638
         bs3 |   .6622839    .447941    1.317879    1.147989
         bs4 |  -26.89027   2.000127    1.239484    1.113321
------------------------------------------------------------------------------
(Outcome intuse==0 is the comparison group)

. predict pr0 pr1
(option p assumed; predicted probabilities)
(1 missing value generated)

. plot pr1 age

[Text-mode plot of pr1, labelled Pr(intuse==1), against age. Vertical axis runs from .0158 to .57159; horizontal axis runs from age 16 to 80.]
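With only two outcomes and intuse==0 as the comparison group, a multinomial logit has the same form as an ordinary logit, so pr1 is the inverse-logit of the spline linear predictor and pr0 is its complement. The sketch below shows that relationship and converts the two extremes of the plotted axis back to the linear-predictor scale. The function names are illustrative, and the spline basis values themselves are not reproduced here.

```python
import math

def prob_from_xb(xb):
    """Pr(intuse==1) for linear predictor xb when outcome 0 is the base."""
    return 1.0 / (1.0 + math.exp(-xb))

def xb_from_prob(p):
    """Inverse of prob_from_xb: the log-odds corresponding to p."""
    return math.log(p / (1.0 - p))

# Extremes of pr1 read from the axis of the plot above.
for p1 in (0.57159, 0.0158):
    xb = xb_from_prob(p1)
    p0 = 1.0 - prob_from_xb(xb)   # probability of the comparison outcome
    print(f"pr1 = {p1}: linear predictor = {xb:.3f}, pr0 = {p0:.5f}")
```

A linear predictor of zero corresponds to a probability of one half, so the upper end of the plotted range sits just above zero on the log-odds scale and the lower end is strongly negative.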