TY - JOUR
T1 - Estimating Cross-Classified Population Counts of Multidimensional Tables
T2 - An Application to Regional Australia to Obtain Pseudo-Census Counts
AU - Suesse, Thomas
AU - Namazi-Rad, Mohammad Reza
AU - Mokhtarian, Payam
AU - Barthélemy, Johan
N1 - Publisher Copyright: © 2017 Thomas Suesse et al., published by De Gruyter Open 2017. Copyright 2017 Elsevier B.V., All rights reserved.
PY - 2017/12/1
Y1 - 2017/12/1
N2 - Estimating population counts for multidimensional tables based on a representative sample subject to known marginal population counts is not only important in survey sampling but is also an integral part of standard methods for simulating area-specific synthetic populations. In this article, several estimation methods are reviewed, with particular focus on the iterative proportional fitting procedure and the maximum likelihood method. The performance of these methods is investigated in a simulation study for multidimensional tables, as previous studies are limited to 2 by 2 tables. The data are generated under random sampling and also under misspecification models, in which sample and target populations differ systematically. The empirical results show that simple adjustments can lead to more efficient estimators, but generally at the expense of increased bias. The adjustments also generally improve coverage of the confidence intervals. The methods discussed in this article, along with standard error estimators, are made freely available in the R package mipfp. As an illustration, the methods are applied to the 2011 Australian census data available for the Illawarra Region in order to obtain estimates for the desired three-way table for age by sex by family type, with known marginal tables for age by sex and for family type.
KW - Census data
KW - count estimation
KW - IPFP
KW - Log-linear model
KW - model-based inference
KW - synthetic population
UR - http://www.scopus.com/inward/record.url?scp=85036530702&partnerID=8YFLogxK
U2 - 10.1515/jos-2017-0048
DO - 10.1515/jos-2017-0048
M3 - Article
AN - SCOPUS:85036530702
SN - 0282-423X
VL - 33
SP - 1021
EP - 1050
JO - Journal of Official Statistics
JF - Journal of Official Statistics
IS - 4
ER -