This document discusses approaches for handling missing data in statistical analyses. It begins by contrasting the ideal scenario of a complete dataset with the real-world problem of missing values. It defines the three common missing-data mechanisms: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). It describes complete case analysis as biased unless the sample is large and little data is missing. It presents alternative approaches such as multiple imputation, along with their advantages and disadvantages. It provides an example that uses indicator variables to handle missing clinic data in a regression, and notes associated issues, such as the need to create a separate indicator for each covariate with missing values.
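The indicator-variable approach mentioned in the summary can be sketched as follows. This is a minimal illustration, not the document's own code: the data, the column names (`clinic_missing`, `age_missing`), the helper functions, and the choice of clinic "A" as the reference level are all hypothetical. A categorical covariate gets an extra "missing" level, and a numeric covariate gets a placeholder value plus a 0/1 flag.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def add_missing_indicator(
    values: Sequence[Optional[float]], fill: float = 0.0
) -> Tuple[List[float], List[int]]:
    """Return (filled, indicator) for one numeric covariate.

    filled[i] is the observed value, or `fill` where the value was missing.
    indicator[i] is 1 where the value was missing and 0 otherwise.
    """
    filled = [fill if v is None else v for v in values]
    indicator = [1 if v is None else 0 for v in values]
    return filled, indicator


def encode_clinic(
    clinics: Sequence[Optional[str]], reference: str
) -> Dict[str, List[int]]:
    """Dummy-code a categorical clinic variable with an extra 'missing' level.

    The reference clinic gets no column; rows with a missing clinic are
    flagged in 'clinic_missing' and are 0 in every other clinic column.
    """
    levels = sorted({c for c in clinics if c is not None and c != reference})
    columns = {
        f"clinic_{lvl}": [1 if c == lvl else 0 for c in clinics] for lvl in levels
    }
    columns["clinic_missing"] = [1 if c is None else 0 for c in clinics]
    return columns


# Hypothetical data: None marks a missing value.
clinic = ["A", "B", None, "A", None, "C"]
age = [34.0, None, 51.0, 47.0, 29.0, None]

# Build the covariate columns that would enter the regression.
design = encode_clinic(clinic, reference="A")
design["age"], design["age_missing"] = add_missing_indicator(age)

for name, column in design.items():
    print(name, column)
```

Each covariate with missing values needs its own indicator column, so the design matrix grows with the number of incomplete covariates, which is the issue the summary mentions. The sketch covers only the encoding step; fitting the regression and judging whether the resulting estimates are trustworthy are separate questions.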