A survey analysis package has recently been added to R. It has
been written by Thomas Lumley of the University of Washington.
He has been very helpful in adapting the package following our
suggestions.
The survey package seems convenient from the point of view
of an experienced R user. We developed the exemplars with
the version that was current in the first half of 2004. Version 3.0
was released in May 2005. Details of improvements over earlier versions
can be found at the NEWS link on the survey package home
page. Further updates are expected.
The newer version 3.2 is further improved, easier to use, and includes
generalised calibration methods.
The software covers a wide range of survey analysis methods.
In particular:
- means, quantiles, variances, tables, ratios, totals
- generalised linear models (e.g. linear regression, logistic
regression, Poisson models)
- proportional hazards models
There is also a new set of procedures for non-response weighting,
but we have not had time to investigate them yet.
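As a rough illustration of the estimators listed above, the sketch below declares a design and then requests a few descriptive statistics and a regression. It assumes a recent version of the survey package is installed and uses the `api` example data that ships with it; the choice of variables (`api00`, `enroll`, `ell`, `meals`) is ours, made only for illustration, and is not taken from the exemplars.

```r
# Sketch only: assumes the survey package is installed and uses its
# bundled 'api' example data rather than any of the exemplar data sets.
library(survey)
data(api)

# One-stage cluster sample of school districts:
#   dnum = cluster identifier, pw = sampling weight,
#   fpc  = number of districts in the population.
dclus1 <- svydesign(ids = ~dnum, weights = ~pw, fpc = ~fpc, data = apiclus1)

# Descriptive estimates, each returned with a design-based standard error
m <- svymean(~api00, dclus1)                          # mean
tot <- svytotal(~enroll, dclus1)                      # total
q <- svyquantile(~api00, dclus1,
                 quantiles = c(0.25, 0.5, 0.75))      # quantiles
r <- svyratio(~api.stu, ~enroll, dclus1)              # ratio of two totals
tab <- svytable(~stype, dclus1)                       # weighted table

# Generalised linear model; the default family gives linear regression.
# Passing a family argument (e.g. quasibinomial() or quasipoisson())
# would give logistic or Poisson-type models instead.
fit <- svyglm(api00 ~ ell + meals, design = dclus1)

print(m)
print(summary(fit))
```

Proportional hazards models follow the same pattern through `svycoxph`, with the design object passed as the `design` argument.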
These methods are implemented with standard errors from either
- Taylor linearisation, in functions such as svydesign,
svymean and svyglm
- or replication methods, such as balanced repeated replication
and the jackknife, available through functions such as svrepdesign,
svrepmean and svrepglm
In earlier versions these were separate functions. More recent
versions (2.9 and later) use the same functions for both methods,
and R recognises which one to use from the properties of the design
you have declared.
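The sketch below illustrates this dispatch on the design object. It assumes a recent version of the survey package and again uses the bundled `api` example data, which is our choice and not one of the exemplars. The same call to `svymean` is made on a linearisation design and on a jackknife replicate-weight design derived from it.

```r
# Sketch only: assumes the survey package is installed.
library(survey)
data(api)

# Design declared for Taylor linearisation
dclus1 <- svydesign(ids = ~dnum, weights = ~pw, fpc = ~fpc, data = apiclus1)

# The same design converted to jackknife replicate weights
# (type = "JK1" is the unstratified delete-one-cluster jackknife)
rclus1 <- as.svrepdesign(dclus1, type = "JK1")

# Identical function call; the method used depends on the design class
m_lin <- svymean(~api00, dclus1)
m_rep <- svymean(~api00, rclus1)

print(class(dclus1))
print(class(rclus1))
print(SE(m_lin))   # linearisation standard error
print(SE(m_rep))   # jackknife standard error
```

The point estimates agree because both designs carry the same sampling weights; only the standard errors are computed differently.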
(Click here for more info on standard
errors when designing surveys.)
The main source is Thomas Lumley's
home page.
Manual pages for the survey package are available while you work
and can be read online here.
A paper describing the package is Lumley T. (2004), "Analysis
of complex survey samples", Journal of Statistical Software
9(8), available here.