Estimating Cross-Classified Population Counts of Multidimensional Tables: An Application to Regional Australia to Obtain Pseudo-Census Counts

Thomas Suesse; Mohammad Reza Namazi-Rad; Payam Mokhtarian; Johan Barthélemy

doi:10.1515/jos-2017-0048

Research output: Contribution to journal › Article › peer-review

Abstract

Estimating population counts for multidimensional tables based on a representative sample subject to known marginal population counts is not only important in survey sampling but is also an integral part of standard methods for simulating area-specific synthetic populations. In this article, several estimation methods are reviewed, with particular focus on the iterative proportional fitting procedure and the maximum likelihood method. The performance of these methods is investigated in a simulation study for multidimensional tables, as previous studies are limited to 2 by 2 tables. The data are generated under random sampling but also under misspecification models, for which sample and target populations differ systematically. The empirical results show that simple adjustments can lead to more efficient estimators, but generally at the expense of increased bias. The adjustments also generally improve coverage of the confidence intervals. The methods discussed in this article, along with standard error estimators, are made freely available in the R package mipfp. As an illustration, the methods are applied to the 2011 Australian census data available for the Illawarra Region in order to obtain estimates for the desired three-way table for age by sex by family type, with known marginal tables for age by sex and for family type.
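
To make the iterative proportional fitting idea in the abstract concrete, the following is a minimal sketch in Python with NumPy. It is not the mipfp implementation and does not reproduce the article's estimators or standard errors. The function name `ipf_three_way`, the table dimensions, and all numbers are hypothetical and chosen only for illustration. The sketch assumes a three-way seed table indexed as age by sex by family type, a known two-way margin for age by sex, and a known one-way margin for family type, and it alternately rescales the table until both margins are matched.

```python
import numpy as np


def ipf_three_way(seed, target_age_sex, target_family, tol=1e-10, max_iter=1000):
    """Fit a 3-way table (age x sex x family type) to two known margins.

    Hypothetical helper for illustration only; not part of any package.
    """
    table = np.asarray(seed, dtype=float).copy()
    target_age_sex = np.asarray(target_age_sex, dtype=float)
    target_family = np.asarray(target_family, dtype=float)

    for _ in range(max_iter):
        # Step 1: rescale so the (age, sex) margin matches its target.
        current = table.sum(axis=2)
        ratio = np.divide(target_age_sex, current,
                          out=np.zeros_like(current), where=current > 0)
        table *= ratio[:, :, None]

        # Step 2: rescale so the family-type margin matches its target.
        current = table.sum(axis=(0, 1))
        ratio = np.divide(target_family, current,
                          out=np.zeros_like(current), where=current > 0)
        table *= ratio[None, None, :]

        # Step 2 may have disturbed the (age, sex) margin; stop once it holds.
        if np.abs(table.sum(axis=2) - target_age_sex).max() < tol:
            break

    return table


# Made-up example: 2 age groups, 2 sexes, 3 family types, uniform seed.
seed = np.ones((2, 2, 3))
target_age_sex = np.array([[10.0, 20.0], [30.0, 40.0]])
target_family = np.array([50.0, 30.0, 20.0])

fitted = ipf_three_way(seed, target_age_sex, target_family)
print(fitted.sum(axis=2))       # fitted age-by-sex margin
print(fitted.sum(axis=(0, 1)))  # fitted family-type margin
```

The two target margins must sum to the same population total for the procedure to be able to match both. In practice the seed would be the sample table rather than a table of ones, and zero cells in the seed remain zero in the fitted table.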

Original language	English
Pages (from-to)	1021-1050
Number of pages	30
Journal	Journal of Official Statistics
Volume	33
Issue number	4
DOIs	https://doi.org/10.1515/jos-2017-0048
Publication status	Published - 1 Dec 2017
Externally published	Yes

Keywords

Census data
count estimation
IPFP
Log-linear model
model-based inference
synthetic population

Access to Document

10.1515/jos-2017-0048 (Licence: CC BY-NC-ND)

article-p1021: Final published version, 854 KB (Licence: CC BY-NC-ND)

Cite this

@article{ed2441e0836645649e95986a7470d3e4,

title = "Estimating Cross-Classified Population Counts of Multidimensional Tables: An Application to Regional Australia to Obtain Pseudo-Census Counts",

abstract = "Estimating population counts for multidimensional tables based on a representative sample subject to known marginal population counts is not only important in survey sampling but is also an integral part of standard methods for simulating area-specific synthetic populations. In this article several estimation methods are reviewed, with particular focus on the iterative proportional fitting procedure and the maximum likelihood method. The performance of these methods is investigated in a simulation study for multidimensional tables, as previous studies are limited to 2 by 2 tables. The data are generated under random sampling but also under misspecification models, for which sample and target populations differ systematically. The empirical results show that simple adjustments can lead to more efficient estimators, but generally, at the expense of increased bias. The adjustments also generally improve coverage of the confidence intervals. The methods discussed in this article along with standard error estimators, are made freely available in the R package mipfp. As an illustration, the methods are applied to the 2011 Australian census data available for the Illawarra Region in order to obtain estimates for the desired three-way table for age by sex by family type with known marginal tables for age by sex and for family type.",

keywords = "Census data, count estimation, IPFP, Log-linear model, model-based inference, synthetic population",

author = "Thomas Suesse and Namazi-Rad, {Mohammad Reza} and Payam Mokhtarian and Johan Barth{\'e}lemy",

year = "2017",

month = dec,

day = "1",

doi = "10.1515/jos-2017-0048",

language = "English",

volume = "33",

pages = "1021--1050",

journal = "Journal of Official Statistics",

issn = "0282-423X",

publisher = "Statistiska",

number = "4",

}

TY - JOUR

T1 - Estimating Cross-Classified Population Counts of Multidimensional Tables

T2 - An Application to Regional Australia to Obtain Pseudo-Census Counts

AU - Suesse, Thomas

AU - Namazi-Rad, Mohammad Reza

AU - Mokhtarian, Payam

AU - Barthélemy, Johan

PY - 2017/12/1

Y1 - 2017/12/1

N2 - Estimating population counts for multidimensional tables based on a representative sample subject to known marginal population counts is not only important in survey sampling but is also an integral part of standard methods for simulating area-specific synthetic populations. In this article several estimation methods are reviewed, with particular focus on the iterative proportional fitting procedure and the maximum likelihood method. The performance of these methods is investigated in a simulation study for multidimensional tables, as previous studies are limited to 2 by 2 tables. The data are generated under random sampling but also under misspecification models, for which sample and target populations differ systematically. The empirical results show that simple adjustments can lead to more efficient estimators, but generally, at the expense of increased bias. The adjustments also generally improve coverage of the confidence intervals. The methods discussed in this article along with standard error estimators, are made freely available in the R package mipfp. As an illustration, the methods are applied to the 2011 Australian census data available for the Illawarra Region in order to obtain estimates for the desired three-way table for age by sex by family type with known marginal tables for age by sex and for family type.

KW - Census data

KW - count estimation

KW - IPFP

KW - Log-linear model

KW - model-based inference

KW - synthetic population

UR - http://www.scopus.com/inward/record.url?scp=85036530702&partnerID=8YFLogxK

U2 - 10.1515/jos-2017-0048

DO - 10.1515/jos-2017-0048

M3 - Article

AN - SCOPUS:85036530702

SN - 0282-423X

VL - 33

SP - 1021

EP - 1050

JO - Journal of Official Statistics

JF - Journal of Official Statistics

IS - 4

ER -
