A Report on the 2001 Post-enumeration Survey

Chapter 2: About the post-enumeration survey

2.1 Objectives

The 2001 PES was a sample survey of individuals in permanent private dwellings and was undertaken two weeks after the 2001 Census of Population and Dwellings (held on 6 March). The objective of the PES was to measure the level of coverage (undercount/overcount) in the 2001 Census. The survey did not aim to check the general accuracy or quality of the responses to specific questions in the census.

2.2 Scope – who made up the survey population?

The 2001 PES was based on a stratified sample of about 11,000 permanent private dwellings. The survey population consisted of New Zealand residents, either usually resident in a New Zealand private dwelling, or staying at one during the survey period. For practical reasons, temporary and non-private dwellings were excluded from the survey, as were dwellings in remote areas. In line with international statistical practices, the following population subgroups were therefore excluded:

  • overseas visitors
  • people living in non-private dwellings (eg hospitals, prisons, hotels)
  • people living in temporary private dwellings (eg tents, caravans, yachts)
  • people who died after census night
  • babies born after census night
  • overseas diplomats, their families and people living with them
  • people on offshore islands (except Waiheke Island, which was included).

2.3 Sample design

The 2001 PES adopted the sample design of Statistics New Zealand’s Household Labour Force Survey (HLFS). The main reasons for choosing the HLFS design included:

  • reduced costs for enumeration and field collection:
      • interviewers were already working in and familiar with the geographic areas used in the sample
      • fieldwork made use of existing maps and street listings
  • minimisation of respondent burden by controlling the overlap between the PES and other household surveys.

The sampling process involved several stages (Figure 1). The geographical framework of New Zealand consists of 36,946 meshblocks. In urban areas a meshblock is usually a residential block of about 40 dwellings bounded by streets; in rural areas a meshblock covers a much wider area because dwellings are sparsely spread. For the purposes of the HLFS, these meshblocks are aggregated into 19,102 Primary Sampling Units (PSUs). To improve sampling efficiency, these PSUs are stratified into 120 groups (or strata) based on region, urban/rural mix, Māori population, and other socio-economic variables (income, employment status, age 65+ population). Each stratum consists of about 160 PSUs on average.

Across the 120 strata, 1,760 PSUs have been randomly selected for the HLFS. The PES randomly selected PSUs from among these, using sampling fractions dependent on the stratum characteristics:

  • 7/8 of PSUs from strata with high numbers of Māori, Pacific and Asian residents
  • 5/8 of PSUs from other strata in the Auckland Region or with high numbers of Māori residents
  • 1/2 of PSUs for all other strata.

Overseas studies suggest that ethnic minorities and young persons are more likely to be missed by the census (Australian Bureau of Statistics 1997, US Bureau of the Census 1996). The higher sample ratios for ethnically diverse areas and for Auckland were thus designed to help increase the accuracy of the undercount estimates for subgroups of the population by reducing their sample errors.
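The stratum-dependent sub-selection described above can be sketched in a few lines. This is a minimal illustration, not the procedure Statistics New Zealand actually ran: the stratum-class labels, the function name `select_pes_psus`, the fixed seed and the rounding rule are all assumptions made for the example. Only the three fractions come from the text.

```python
import random
from fractions import Fraction

# Sampling fractions quoted in the text, keyed by hypothetical stratum-class labels.
SAMPLING_FRACTIONS = {
    "high_maori_pacific_asian": Fraction(7, 8),
    "auckland_or_high_maori": Fraction(5, 8),
    "other": Fraction(1, 2),
}


def select_pes_psus(hlfs_psus_by_stratum, stratum_class, seed=2001):
    """Draw a simple random subsample of HLFS PSUs within each stratum.

    hlfs_psus_by_stratum: dict mapping stratum id -> list of PSU ids
    stratum_class: dict mapping stratum id -> key of SAMPLING_FRACTIONS
    Rounding to the nearest whole PSU is an assumption of this sketch.
    """
    rng = random.Random(seed)
    selected = {}
    for stratum, psus in hlfs_psus_by_stratum.items():
        fraction = SAMPLING_FRACTIONS[stratum_class[stratum]]
        n_take = round(fraction * len(psus))
        selected[stratum] = sorted(rng.sample(psus, n_take))
    return selected


# Three made-up strata with 16 HLFS PSUs each.
hlfs = {
    "A": list(range(16)),
    "B": list(range(100, 116)),
    "C": list(range(200, 216)),
}
classes = {
    "A": "high_maori_pacific_asian",
    "B": "auckland_or_high_maori",
    "C": "other",
}
sample = select_pes_psus(hlfs, classes)
print({stratum: len(psus) for stratum, psus in sample.items()})
# prints {'A': 14, 'B': 10, 'C': 8}
```

With 16 PSUs per stratum the three fractions yield 14, 10 and 8 selected PSUs, which shows how the higher fractions concentrate sample in the strata where undercount is expected to be greater.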

Each PSU in the HLFS comprises at least seven panels, and each panel consists of about nine randomly selected private dwellings. Most of the panels within a PSU are used by the HLFS on a rotational basis with one panel being used for each survey quarter in a year. A spare panel from each of the PSUs was used to make up the PES sample frame for sample selection.

The 2001 PES sample comprised 999 PSU panels containing about 11,000 dwellings (or about 0.7 percent of total permanent private dwellings in New Zealand).
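As a rough consistency check, the counts quoted in this section can be combined into a few derived ratios. The sketch below uses only figures stated in the text; the variable names are illustrative, and the implied dwelling total is approximate because both of its inputs are rounded ("about 11,000", "about 0.7 percent").

```python
# Figures quoted in the text.
MESHBLOCKS = 36_946
PSUS = 19_102
STRATA = 120
HLFS_PSUS = 1_760
PES_PSUS = 999
PES_DWELLINGS = 11_000       # "about 11,000"
PES_DWELLING_SHARE = 0.007   # "about 0.7 percent"

meshblocks_per_psu = MESHBLOCKS / PSUS
psus_per_stratum = PSUS / STRATA
hlfs_share_of_psus = HLFS_PSUS / PSUS
pes_share_of_hlfs_psus = PES_PSUS / HLFS_PSUS
implied_total_dwellings = PES_DWELLINGS / PES_DWELLING_SHARE

print(f"Meshblocks per PSU:       {meshblocks_per_psu:.2f}")
print(f"PSUs per stratum:         {psus_per_stratum:.1f}")
print(f"HLFS share of all PSUs:   {hlfs_share_of_psus:.1%}")
print(f"PES share of HLFS PSUs:   {pes_share_of_hlfs_psus:.1%}")
print(f"Implied total dwellings:  {implied_total_dwellings:,.0f}")
```

The overall PES share of HLFS PSUs (999 of 1,760) falls between the lowest and highest sampling fractions of 1/2 and 7/8, as expected for a mix of strata, and the average of roughly 159 PSUs per stratum agrees with the "about 160" quoted above.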

Figure 1
2001 Post-enumeration Survey: Sampling Process

[Flowchart of the sampling process; image not reproduced.]

2.4 Data collection

The survey was carried out from 21 March to 3 April 2001, following the completion of census fieldwork. The survey period was chosen to avoid overlap of census enumerators and PES interviewers in the field, and to avoid a clash with the Easter holidays, while being close enough to census date (6 March) to assist respondent recall. Data was collected by 159 specially trained interviewers using a household questionnaire. Information on occupants of the dwelling who satisfied the scope and coverage criteria was collected through a face-to-face interview wherever practicable. Where this was not practicable, a proxy interview was conducted (ie details were obtained from another adult in the dwelling) and a follow-up interview was done over the telephone, unless the respondent insisted on a face-to-face interview. The actual number of responding dwellings was about 9,500 (containing about 25,000 persons).
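The fieldwork figures above support a few simple derived quantities. The sketch below is illustrative only: the ratio of responding to sampled dwellings is a crude figure, not an official response rate, because some sampled dwellings may have been unoccupied or out of scope. The variable names are assumptions.

```python
from datetime import date

CENSUS_DATE = date(2001, 3, 6)
SURVEY_START = date(2001, 3, 21)
SURVEY_END = date(2001, 4, 3)

SAMPLED_DWELLINGS = 11_000     # "about 11,000"
RESPONDING_DWELLINGS = 9_500   # "about 9,500"
RESPONDING_PERSONS = 25_000    # "about 25,000"

# Gap between census day and the first day of PES interviewing.
days_after_census = (SURVEY_START - CENSUS_DATE).days
# Length of the interviewing period, counting both end dates.
survey_length_days = (SURVEY_END - SURVEY_START).days + 1

crude_dwelling_response = RESPONDING_DWELLINGS / SAMPLED_DWELLINGS
persons_per_responding_dwelling = RESPONDING_PERSONS / RESPONDING_DWELLINGS

print(f"Days from census to survey start: {days_after_census}")
print(f"Survey length in days:            {survey_length_days}")
print(f"Crude dwelling response ratio:    {crude_dwelling_response:.1%}")
print(f"Persons per responding dwelling:  {persons_per_responding_dwelling:.2f}")
```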

Personal details sought on the PES questionnaire included: name, sex, date of birth or age, ethnicity and address. Besides usual address and census night address, the survey also collected information on any other addresses where the person might have been included on any other census form. This was to help increase the chances of finding and matching any individual census forms for a particular person, and to help identify multiple counts. A copy of the PES questionnaire is included in Appendix 1.

In order for the PES to achieve its objectives, the processes needed to be independent of the census. To ensure this independence, the PES:

  • used no census field staff
  • was conducted after the census fieldwork was completed to avoid contact between census enumerators and PES interviewers
  • used interviewers to gather information from occupants of the dwelling, whereas the census relied on individuals to fill in the census forms.

While it is possible that people who were missed in the census may also have been missed in the PES, it is generally accepted that the coverage of the PES was more complete. The 2001 PES used more tightly controlled collection procedures and more experienced and better trained field staff than the census.

2.5 Matching and searching

The objective of matching was to determine if a PES respondent was counted in the census at each address at which they stated that they had completed a census form, or at each address where a census form may have been completed for them (search address). This was achieved by comparing the information given by PES respondents with the information given on census forms.

Matching and searching were performed clerically, using the PES questionnaires and the images of census documents. The first part of the matching process involved locating the census dwelling/address that corresponded with the one given in the PES. This provided an estimate of the number of dwellings found in the survey but missed in the census.

Once a dwelling was located, separate searches were carried out for each person to locate their census form(s). Matching determined whether a person was counted or not counted in the census at each search address. The most important variables available for comparison were:

  • name
  • date of birth (or age)
  • sex.

Other information used included:

  • ethnic group
  • usual resident or visitor
  • household structure and relationships
  • where the respondent completed a census form.

The underlying policy for person matching was that, unless there was clear evidence that the person under consideration was not counted, they were regarded as counted at the address being examined. This basic assumption was important to ensure that the number of non-matches was not unduly increased because of inexact matches.

Matching a person across two data sources is not an exact science. There are always cases where the records do not agree exactly, yet still constitute a match. For example, a woman may have married between the census and the PES, and changed her maiden name to her married name. Her surname on the census form will therefore not match that on the PES form, but all other data will be the same (ie first name, date of birth, sex, ethnicity). In this case a conclusion that a match has been made seems justified.
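The "counted unless clearly not" policy can be illustrated with a small sketch. The PES matching was clerical, so this is not the procedure that was used: the record layout, the function names and, in particular, the threshold of one tolerated disagreement among the key variables are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class PersonRecord:
    """Hypothetical record holding the key matching variables."""
    first_name: str
    surname: str
    date_of_birth: Optional[date]
    sex: str


KEY_FIELDS = ("first_name", "surname", "date_of_birth", "sex")


def _normalise(value):
    # Compare text case-insensitively and ignore surrounding spaces.
    return value.strip().lower() if isinstance(value, str) else value


def disagreements(pes, census):
    """List the key fields present on both records whose values differ."""
    differing = []
    for field in KEY_FIELDS:
        a = _normalise(getattr(pes, field))
        b = _normalise(getattr(census, field))
        if a is not None and b is not None and a != b:
            differing.append(field)
    return differing


def regarded_as_counted(pes, census, max_disagreements=1):
    """Default to 'counted' unless disagreements exceed the tolerated number.

    Tolerating one disagreement is an illustrative assumption, not the
    clerical rule applied in the survey.
    """
    return len(disagreements(pes, census)) <= max_disagreements


# The marriage example from the text: only the surname differs.
pes_person = PersonRecord("Anna", "Jones", date(1975, 5, 2), "F")
census_person = PersonRecord("Anna", "Smith", date(1975, 5, 2), "F")
other_person = PersonRecord("John", "Brown", date(1960, 1, 1), "M")

print(disagreements(pes_person, census_person))        # prints ['surname']
print(regarded_as_counted(pes_person, census_person))  # prints True
print(regarded_as_counted(pes_person, other_person))   # prints False
```

A clerical matcher would also weigh the supporting information listed earlier (ethnic group, household structure and so on), which a fixed threshold cannot capture.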

2.6 Estimation

Data collected in the PES was used to estimate census undercoverage for both people and private dwellings. An estimate of the true census population is based on the following equation, which accounts for persons/dwellings either missed or counted more than once in the census:

X = (x / y) × Y + DLR

where:

X = PES estimate of the true census population
x = PES estimate of the population who should have been counted in the census less substitutes and late returns
y = PES estimate of the population who were counted in the census less substitutes and late returns
Y = Census count of the population in permanent private dwellings less substitutes and late returns
DLR = Census count of the population in private dwellings with substitutes and late returns

A substitute is a census form created for a person/dwelling when no form was received but contact had been made by the enumerator. A late return is a census form returned after PES interviews had started. Being selected for the PES may prompt some respondents to return their census forms if they had not already done so at the time of the PES. These people are excluded from the population adjustment calculations to avoid biasing the PES estimates.
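A short worked example may help show how the quantities defined above combine. The sketch assumes the ratio form X = (x / y) × Y + DLR, which is consistent with the definitions but is stated here as an assumption. The net undercount rate shown, (X minus census total) divided by X, is likewise an assumed definition, and every input value is hypothetical rather than a survey result.

```python
def estimate_true_population(x, y, census_count_excl, dlr):
    """Ratio estimate of the true census population (assumed form).

    x: PES estimate of people who should have been counted
    y: PES estimate of people who were counted
    census_count_excl: census count excluding substitutes and late returns (Y)
    dlr: census count on substitutes and late returns (DLR)
    """
    if y <= 0:
        raise ValueError("y must be positive")
    return (x / y) * census_count_excl + dlr


def net_undercount_rate(true_population, census_total):
    """Share of the estimated true population missing from the census count."""
    return (true_population - census_total) / true_population


# Hypothetical inputs, chosen only to keep the arithmetic easy to follow.
x, y = 1_020, 1_000
Y, DLR = 50_000, 500

X = estimate_true_population(x, y, Y, DLR)
rate = net_undercount_rate(X, Y + DLR)

print(f"Estimated true population: {X:,.0f}")   # prints 51,500
print(f"Net undercount rate:       {rate:.2%}")  # prints 1.94%
```

Because DLR is added outside the ratio, substitutes and late returns are carried through at face value and do not influence the adjustment factor x / y.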

More details on person and dwelling weights, and non-response adjustments are included in Appendix 2.

2.7 Sampling and non-sampling errors

The PES estimates are subject to sample errors because they are derived from responses from a sample of dwellings (0.7 percent of all permanent private dwellings), rather than all dwellings. The sample error estimated for each 'point estimate' is used to construct an 'interval estimate', which has a probability of 0.95 of containing the true value. PES undercount estimates with high sample errors should be used with caution.
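The relationship between a point estimate and its interval estimate can be sketched as follows. The sketch assumes a normal approximation, in which the 95 percent interval is the point estimate plus or minus 1.96 standard errors; the function names and the input values are hypothetical and are not PES results.

```python
def interval_estimate(point_estimate, standard_error, z=1.96):
    """Return (lower, upper) bounds of an approximate 95 percent interval."""
    half_width = z * standard_error
    return point_estimate - half_width, point_estimate + half_width


def relative_sample_error(point_estimate, standard_error, z=1.96):
    """Half-width of the interval as a proportion of the point estimate."""
    return z * standard_error / abs(point_estimate)


# Hypothetical undercount estimate of 2.0 percent with a standard error of 0.25.
lower, upper = interval_estimate(2.0, 0.25)
relative = relative_sample_error(2.0, 0.25)

print(f"Interval estimate: {lower:.2f} to {upper:.2f} percent")
print(f"Relative sample error: {relative:.1%}")
```

An estimate whose relative sample error is large, for example for a small population subgroup, is the kind that should be used with caution.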

The imprecision due to sampling variability should not be confused with non-sampling errors. Non-sampling errors can arise from: (a) imperfections in reporting by respondents, (b) errors in data collection, and (c) errors in data processing. Such errors are minimised by careful design of forms, training and supervision of interviewers, and efficient operating procedures.