Methods for Age Standardizing Survey Estimates
Age-adjusted estimates from the OHS, CCHS or other complex survey data can be calculated using the same basic method shown in Direct Standardization (SRATES), with slight modification. For any data source, the following elements are needed to calculate a directly standardized rate and its confidence limits:
1) The size of the standard population in each age group.
2) The rate (point estimate) within each age group, within each sub-population being compared.
3) The variance of each rate (point estimate) within each age group, within each sub-population being compared.
For complex survey data, the following must also be true:
1) The age-specific rates (point estimates) must be correct, descriptive estimates (i.e., weighted and accounting for missing data).
2) The age-group specific variance estimates must account for the complex survey design or design effect. The variance formula from Direct Standardization (SRATES) does not apply.
These variance estimates can be obtained using any of the available tools including the CV look-up tables and utilities, and the Statistics Canada bootstrapping programs. Once these elements have been assembled, the calculations are straightforward (see example below).
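When the variance comes from a CV look-up table, the coefficient of variation first has to be converted to a variance. The sketch below assumes the CV is reported as a percentage of the estimate (CV = 100 * standard error / estimate); the function name is hypothetical and the numbers are made up for illustration.

```python
def variance_from_cv(rate, cv_percent):
    """Convert a coefficient of variation (in percent) to a variance.

    Assumes CV = 100 * standard error / estimate, so that
    standard error = cv_percent / 100 * rate.
    """
    standard_error = cv_percent / 100.0 * rate
    return standard_error ** 2


# Hypothetical age-specific rate of 30% with a CV of 10%:
# the standard error is about 3 percentage points.
print(variance_from_cv(30.0, 10.0))
```

The resulting variance is in the same (squared) units as the rate, which is the form needed for the standardized variance calculation.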
Data users who have access to the full sampling design information (e.g., sample design variables suppressed in the public use datafiles) may use specialized software packages, such as SUDAAN, Stata or WesVar, to calculate the stratum-specific variance estimates. These packages may also be used to calculate the standardized rate and confidence limits in one pass, although some additional programming is required.
The bootstrap variance programs provided by Statistics Canada for use with the CCHS, NPHS and OHS may also be modified to include a macro that calculates age-standardized rates. In this case, the standardization macro is inserted into the StatCan program alongside the existing macros that produce estimates for ratios, numbers, regression models, etc. The user then modifies the full program to call this macro as needed. As with the other types of estimates, the whole process of calculating the standardized rate is bootstrapped (repeated) 500 times and the summary results are reported. This approach requires some advanced programming in SAS or SPSS and considerable computer time per run.
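The idea of bootstrapping the whole standardization can be sketched in a few lines. This is not the Statistics Canada macro: the function names are hypothetical, the age-specific rates for each replicate are assumed to have been computed already with the corresponding bootstrap weight, and the variance formula (mean squared deviation of the replicate estimates around their mean) is an assumption that should be checked against the documentation of the program actually used.

```python
from statistics import mean


def standardized_rate(std_pop, rates):
    """Directly standardized rate: weighted average of age-specific rates."""
    return sum(w * r for w, r in zip(std_pop, rates)) / sum(std_pop)


def bootstrap_variance(std_pop, replicate_rates):
    """Variance of the standardized rate across bootstrap replicates.

    replicate_rates holds one list of age-specific rates per replicate.
    """
    estimates = [standardized_rate(std_pop, rates) for rates in replicate_rates]
    centre = mean(estimates)
    return sum((e - centre) ** 2 for e in estimates) / len(estimates)


# Three made-up replicates for illustration; a real run would use 500.
std_pop = [400, 350, 250]
replicates = [[29, 51, 70], [31, 49, 71], [30, 50, 69]]
print(bootstrap_variance(std_pop, replicates))
```

Because the standardized rate is recomputed inside every replicate, no separate variance formula for the adjusted rate is needed.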
Effects of Applying Standardization to OHS Estimates
The OHS 1996/97 sharing file is weighted to 1996 Census figures for the population included in the survey (i.e., non-institutionalized residents rather than all Ontario residents). Standardization using the recommended 1991 Census population as the standard will change estimates for the Province and regions. Even if 1996 Census data for all Ontario residents are used as the standard population, the age-adjusted estimates will differ somewhat from the original weighted estimates, because the original survey weights are based on the target population of the survey (non-institutionalized individuals) rather than the entire population. For comparisons within the dataset, the weighted sample itself (or the portion with non-missing data) might be used as the standard population to compare sub-regions. A useful feature of this choice is that the crude (weighted) and age-standardized rates for the whole Province are the same.
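The last point can be checked numerically. The sketch below uses made-up weighted counts and rates for three age groups (not OHS figures) and a hypothetical helper function; when the weighted sample itself is the standard population, the crude and standardized rates for the whole sample coincide.

```python
def standardized_rate(std_pop, rates):
    """Directly standardized rate: weighted average of age-specific rates."""
    return sum(w * r for w, r in zip(std_pop, rates)) / sum(std_pop)


# Hypothetical weighted sample counts and rates (%) by age group.
weighted_n = [500.0, 300.0, 200.0]
rates = [30.0, 50.0, 70.0]

# Crude rate: total weighted cases divided by total weighted sample.
weighted_cases = [n * r / 100.0 for n, r in zip(weighted_n, rates)]
crude_rate = 100.0 * sum(weighted_cases) / sum(weighted_n)

# Standardized rate using the weighted sample itself as the standard.
adjusted_rate = standardized_rate(weighted_n, rates)

print(crude_rate, adjusted_rate)
```

For a sub-region, whose age distribution differs from that of the whole sample, the two rates will generally differ.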
Standardization will not entirely account for differences between surveys such as the 1990 and 1996/97 OHS. Method effects can affect comparisons, as can the proportion and characteristics of the population living in and outside of institutions. Finally, other methods may be used to adjust for age when analysing OHS data. In many cases, regression modelling would be easier to calculate and interpret, and would take advantage of the opportunities within the OHS surveys to introduce other control variables.
EXAMPLE: CALCULATION OF DIRECTLY STANDARDIZED ESTIMATES IN COMPLEX SURVEY DATA FROM OHS 1996/97
|Stratum i (age group a) ||Number in standard pop’n (wi) ||Population 1: rate b (%) for age group i ||Population 1: variance c of rate (var[rate i]) ||Population 2: rate b (%) for age group i ||Population 2: variance c of rate (var[rate i]) |
|1 ||400 ||30 ||1 ||40 ||12 |
|2 ||350 ||50 ||2 ||60 ||15 |
|3 ||250 ||70 ||3 ||80 ||23 |
|Total ||1000 || || || || |
a) Just three age groups are shown for simplicity – five-year age groups are recommended.
b) The age-specific rates must be based on weighted data.
c) Variance estimates must account for the complex survey design (e.g., using CV look-up tools, bootstrap programs etc.).
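Standardized (Adjusted) Rates:
Adjusted rate = sum of (wi * rate i) / sum of (wi)
Population 1:
= [400(30) + 350(50) + 250(70)] / 1000
= (12,000 + 17,500 + 17,500) / 1000
= 47
Population 2:
= [400(40) + 350(60) + 250(80)] / 1000
= (16,000 + 21,000 + 20,000) / 1000
= 57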
Variance Estimates for Standardized Rates:
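Variance of adjusted rate = sum of (wi² * var[rate i]) / (sum of wi)²
Population 1:
= [400²(1) + 350²(2) + 250²(3)] / 1000²
= (160,000 + 245,000 + 187,500) / 1,000,000
= 0.5925 ~= 0.593
Population 2:
= [400²(12) + 350²(15) + 250²(23)] / 1000²
= (1,920,000 + 1,837,500 + 1,437,500) / 1,000,000
= 5.195 ~= 5.19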
95% Confidence Limits for Standardized Estimates:
Adjusted rate ± 1.96 * square root of (var of adjusted rate)
Population 1:
= 47 ± 1.96 * square root of (0.593)
= 47 ± 1.51
~= (45.5, 48.5)
Population 2:
= 57 ± 1.96 * square root of (5.19)
= 57 ± 4.47
~= (52.5, 61.5)
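The calculations in this example can be reproduced with a short script. This is a minimal sketch, not part of any OHS or Statistics Canada tool: the function name is hypothetical, and the age-specific variances are assumed to already account for the complex survey design.

```python
from math import sqrt


def direct_standardization(std_pop, rates, variances, z=1.96):
    """Return the adjusted rate, its variance and its confidence limits.

    std_pop   -- standard population size in each age group (wi)
    rates     -- weighted age-specific rates (%)
    variances -- design-based variances of the age-specific rates
    """
    total = sum(std_pop)
    adj_rate = sum(w * r for w, r in zip(std_pop, rates)) / total
    adj_var = sum(w ** 2 * v for w, v in zip(std_pop, variances)) / total ** 2
    half_width = z * sqrt(adj_var)
    return adj_rate, adj_var, (adj_rate - half_width, adj_rate + half_width)


# Inputs taken from the example table above.
std_pop = [400, 350, 250]
pop1 = direct_standardization(std_pop, [30, 50, 70], [1, 2, 3])
pop2 = direct_standardization(std_pop, [40, 60, 80], [12, 15, 23])

for label, (rate, var, (low, high)) in (("Population 1", pop1), ("Population 2", pop2)):
    print(f"{label}: rate={rate:.1f} var={var:.3f} CI=({low:.1f}, {high:.1f})")
```

With five-year age groups, only the three input lists grow longer; the function itself does not change.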