Growth Mixture Models in Epidemiology and the Impact of an Incorrectly Specified Random Structure on Model Inferences

Wednesday, 20 August 2014: 11:15 AM
Tubughnenq 4 (Dena'ina Center)
Mark S Gilthorpe, PhD, University of Leeds, Leeds, United Kingdom
Darren L Dahly, PhD, University College Cork, Cork, Ireland
Yu-Kang Tu, PhD, National Taiwan University, Taipei, Taiwan
Laura D Kubzansky, PhD, Harvard School of Public Health, Boston, MA
Elizabeth Goodman, MD, Massachusetts General Hospital, Boston, MA
INTRODUCTION:  Lifecourse trajectories of clinical or anthropological attributes are useful for identifying how early-life experiences influence later-life morbidity and mortality.  Researchers often use growth mixture models (GMMs) to estimate such trajectories.  It is common to place constraints on the random part of the GMM to improve parsimony or to aid convergence, but this can induce an autoregressive structure in the residuals that distorts the nature of the mixtures and subsequent model interpretation.  This is especially true if changes in the outcome within individuals are gradual compared to the magnitude of differences between individuals.  This is not widely appreciated, nor is its impact well understood.
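
As a rough illustration of the mechanism described above, the following Python sketch simulates a single latent class (not a full GMM) and shows one way serial correlation can appear in the residuals when the random part is constrained. All names, sample sizes, and variance values are assumptions chosen for illustration; none come from the study. The constraint here is a slope variance fixed at zero, approximated by removing only the wave means and each person's own mean.

```python
import numpy as np


def lag1_residual_corr(slope_sd, n=2000, n_waves=5, seed=0):
    """Pooled lag-1 correlation of residuals after an intercept-only fit.

    slope_sd is the true between-person SD of the linear slope
    (a hypothetical value, not an estimate from any real data).
    """
    rng = np.random.default_rng(seed)
    t = np.arange(n_waves)

    # Simulated trajectories: person-specific intercept and slope plus noise.
    b0 = rng.normal(0.0, 2.0, size=(n, 1))
    b1 = rng.normal(0.0, slope_sd, size=(n, 1))
    e = rng.normal(0.0, 0.5, size=(n, n_waves))
    y = b0 + b1 * t + e

    # Constrained "fit": remove the mean at each wave (fixed trend) and then
    # each person's own mean (a crude stand-in for a random intercept).
    # Any between-person slope variation is left in the residuals.
    r = y - y.mean(axis=0, keepdims=True)
    r = r - r.mean(axis=1, keepdims=True)

    # Correlate each residual with the next one in the same person.
    return np.corrcoef(r[:, :-1].ravel(), r[:, 1:].ravel())[0, 1]


if __name__ == "__main__":
    print("slopes vary, constrained fit:", lag1_residual_corr(slope_sd=1.0))
    print("no slope variation:          ", lag1_residual_corr(slope_sd=0.0))
```

Under these assumptions the first correlation is clearly positive and the second is mildly negative, which suggests that the omitted slope variation, rather than the measurement noise, is what produces the autocorrelation.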

METHODS:  Using repeated measures of body mass index (BMI) for 1528 US adolescents, we estimated GMMs that required variance-covariance constraints in order to attain convergence.  We contrasted constrained models with and without an autocorrelation structure to assess the impact this had on the ideal number of latent classes and on their size and composition.  We also evaluated model options through simulation.
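
For orientation, one common way to write a GMM with an explicit within-class autocorrelation structure is sketched below. This is an assumed, generic formulation rather than the exact specification fitted here: the linear time trend, the first-order autoregressive form, and the symbols (class indicator c_i, class-specific coefficients \beta_{0k} and \beta_{1k}, random effects u_{0i} and u_{1i}, residual variance \sigma^2, autocorrelation \rho) are all illustrative choices.

```latex
\begin{align*}
y_{it} \mid (c_i = k) &= \beta_{0k} + \beta_{1k}\, t + u_{0i} + u_{1i}\, t + e_{it}, \\
(u_{0i}, u_{1i})^{\top} \mid (c_i = k) &\sim N(\mathbf{0}, \boldsymbol{\Sigma}_k), \\
\operatorname{Cov}(e_{is}, e_{it} \mid c_i = k) &= \sigma^2 \rho^{|s - t|}.
\end{align*}
```

In this notation, variance-covariance constraints correspond to restrictions such as \boldsymbol{\Sigma}_k = \boldsymbol{\Sigma} for all k or \operatorname{Var}(u_{1i}) = 0, and a model without an autocorrelation structure corresponds to fixing \rho = 0.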

RESULTS:  When the GMM variance-covariance structure was constrained, a within-class autocorrelation structure emerged.  When this structure was not modeled explicitly, model fit was poorer and the resulting models differed substantially in the ideal number of latent classes, as well as in class size and composition.

CONCLUSIONS:  Failure to consider carefully the random structure of data within a GMM framework may lead to erroneous model inferences, especially for outcomes with greater within-person than between-person homogeneity, such as BMI.  It is crucial to reflect upon the underlying data generation processes when building such models.