A fast multivariate nearest neighbour imputation algorithm

N. Solomon; G. Oatley; K. McGarry

Conference paper

Open access

2007 International Conference of Computational Statistics and Data Engineering (ICCSDE'07) (London, UK, 02/07/2007–04/07/2007)

2007

Abstract

Imputation of missing data is important in many areas, such as reducing non-response bias in surveys and maintaining medical documentation. Nearest neighbour (NN) imputation algorithms replace the missing values within any particular observation by taking copies of the corresponding known values from the most similar observation found in the dataset. However, when NN algorithms are executed against large multivariate datasets the poor performance (program execution speed) of these algorithms can present major practical problems. We argue that these problems have not been sufficiently addressed, and we present a fast NN imputation algorithm that can employ any method for measuring the similarity between observations. The algorithm has been designed for the imputation of missing values in large multivariate datasets that contain many different missingness patterns with large proportions of missing data. The ideas underpinning the algorithm are explained in detail, and experiments are described which show that the algorithm delivers very good performance when it is used for imputation in both segmented and non-segmented datasets containing several million rows.
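The basic NN imputation step described in the abstract can be illustrated with a minimal brute-force sketch. This is not the fast algorithm presented in the paper: it is the naive baseline, which compares every incomplete observation against every other observation and so becomes slow on large datasets. The names `nn_impute` and `euclidean_on_shared` are hypothetical, missing values are assumed to be represented by `None`, and the choice of distance function, as well as the rule that a donor must know every value the recipient lacks, are illustrative assumptions only.

```python
import math
from typing import Callable, List, Optional, Sequence

Row = Sequence[Optional[float]]


def euclidean_on_shared(a: Row, b: Row) -> float:
    """Euclidean distance over columns known in both rows (inf if none are shared)."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared))


def nn_impute(
    rows: List[Row],
    distance: Callable[[Row, Row], float] = euclidean_on_shared,
) -> List[List[Optional[float]]]:
    """Fill each row's missing values with copies from its most similar donor row.

    Brute force: every incomplete row is compared against every other row.
    Donors are always taken from the original data, never from imputed values.
    """
    result = [list(r) for r in rows]
    for i, recipient in enumerate(rows):
        missing = [j for j, v in enumerate(recipient) if v is None]
        if not missing:
            continue
        best, best_d = None, math.inf
        for k, donor in enumerate(rows):
            # Assumption: a donor must know every value the recipient lacks.
            if k == i or any(donor[j] is None for j in missing):
                continue
            d = distance(recipient, donor)
            if d < best_d:
                best, best_d = donor, d
        if best is not None:
            for j in missing:
                result[i][j] = best[j]
    return result


data = [
    [1.0, 2.0, 3.0],
    [1.1, None, 3.1],
    [9.0, 8.0, 7.0],
    [None, 8.1, 7.2],
]
imputed = nn_impute(data)
```

Because `distance` is passed in as a parameter, any similarity measure can be substituted, which mirrors the abstract's statement that the algorithm can employ any method for measuring similarity. The nested loop over all pairs of rows is the cost that motivates a faster approach.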

Details

Title: A fast multivariate nearest neighbour imputation algorithm
Authors/Creators: N. Solomon; G. Oatley; K. McGarry
Conference: 2007 International Conference of Computational Statistics and Data Engineering (ICCSDE'07) (London, UK, 02/07/2007–04/07/2007)
Identifiers: 991005542965907891
Murdoch Affiliation: Murdoch University
Language: English
Resource Type: Conference paper
