Summary of PSMG Meeting, UCLA February 13-15, 2002
The NIMH- and NIDA-sponsored UCLA meeting of the Prevention Science Methodology Group (PSMG) considered methodology needs, data and design needs, research dissemination, and research support mechanisms related to randomized preventive interventions and treatment studies. PSMG is directed by Dr. C. Hendricks Brown at the University of South Florida and Dr. Bengt Muthén at UCLA. Methodologists in attendance included Drs. Getachew Dagne, Paul Greenbaum, and Chen-Pin Wang at U South Florida, Booil Jo, Andreas Klein, Klaus Larsen, and Gitta Lubke at UCLA, Alka Indurkhya at Harvard, Mike Stoolmiller at OSLC, Dan Feaster at U Miami, Karen Bandeen-Roche at Johns Hopkins, Joe Schafer at Penn State, and Antonio Morgan-Lopez at Arizona State U.
Our group has developed extensive collaborations with a number of leading prevention researchers who have generously provided their time and their datasets. We gratefully acknowledge the support of Dr. Jane Pearson from the Child and Adolescent Services and Treatment Branch, NIMH and Dr. Elizabeth Roberson, Chief of the Prevention Branch at NIDA, as well as the collaboration of Dr. Sheppard Kellam as Director of the American Institutes for Research's Center for Integrating Education and Prevention Research in Schools, Dr. John Reid as Director of the Oregon Prevention Research Center at the Oregon Social Learning Center, Dr. Nick Ialongo as Director of the Prevention Research Center at the Bloomberg School of Public Health at Johns Hopkins, Dr. Ming Tsuang as Director of the Harvard Institute of Psychiatric Epidemiology & Genetics, and Dr. Jose Szapocznik as Director of the Family Research Institute, University of Miami. In addition, we acknowledge the collaboration and ongoing support of Drs. Tom Dishion, University of Oregon, Jim Snyder, Wichita State U, Charles Martinez, OSLC, George Howe, George Washington U, Hanno Petras, Johns Hopkins, Stephen Faraone, Harvard, Andrew Leuchter, UCLA, and Beth VanFossen, Towson University.
As we have examined issues of etiology (ranging from genetic to environmental factors) as well as prevention and treatment of conduct disorder, drug abuse, delinquency, violence and victimization, depression, ADHD, and schizophrenia, we have focused on a number of methodology issues.
The following brief summary describes existing methodologies, the need for new developments, and dissemination and research support mechanisms.
I. Existing Methodologies
Following is a summary of some basic issues in the methodological topics that were presented.
1. Growth modeling for understanding individual trajectory patterns
Conventional repeated measures (multilevel, growth model) analysis methods use "random effects" to describe individual heterogeneity in terms of differences in development over time. The analyses may be carried out in a mixed linear model framework, in a multilevel or hierarchical (HLM) framework, or in a latent variable or structural equation model (SEM) framework. Latent variable modeling has the advantage of easily analyzing growth models that condition on initial status, models of mediation, and models for growth in unobserved constructs that have multiple indicators. New latent variable techniques also allow for interactions between latent variables.
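As an illustration of the random effects formulation described above, a linear growth model for individual i at time point t can be sketched as follows. The notation is an assumption made for this summary rather than notation taken from the presentations: y is the outcome, a_t is the time score, w_i is a time-invariant covariate such as intervention status, the eta terms are the random intercept and slope, and the zeta and epsilon terms are residuals.

```latex
\begin{align}
y_{it} &= \eta_{0i} + \eta_{1i}\, a_t + \epsilon_{it}, \\
\eta_{0i} &= \alpha_0 + \gamma_0\, w_i + \zeta_{0i}, \\
\eta_{1i} &= \alpha_1 + \gamma_1\, w_i + \zeta_{1i}.
\end{align}
```

In this sketch the individual heterogeneity in development is carried by the continuous latent variables eta_{0i} and eta_{1i}. Conditioning growth on initial status would correspond to adding eta_{0i} as a predictor in the slope equation.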
The recently developed technique of general growth mixture modeling (Muthén et al., in press) is an extension of conventional growth modeling that has proven valuable for randomized trials. Growth mixture modeling is suitable for describing more fundamental individual heterogeneity of development through multiple distinct pathways, or latent trajectory classes, with normative and non-normative development that may have different antecedents, correlates, and consequences. Random effects are continuous latent (unobserved) variables and trajectory classes are categorical latent variables, with categorical latent variables corresponding to "mixture" models. Growth mixture modeling incorporates both types of latent variables. A major advantage is that growth mixture modeling responds directly to the question "for whom is the intervention effective" by allowing different intervention effects for individuals in different trajectory classes. Growth mixture modeling also incorporates modeling of non-adherence in trials. Furthermore, growth mixture modeling allows different trajectory classes to experience different dropout patterns. Examples of applications include preventive interventions aimed at reducing classroom aggressive behavior and clinical trials of depression medication in the presence of placebo effects.
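To make the idea of latent trajectory classes concrete, the following minimal Python sketch computes the posterior probability that one individual belongs to each class, given class-specific linear mean trajectories. It is a deliberate simplification and not the general growth mixture model itself: it assumes known parameters, independent normal residuals with a common standard deviation, and no within-class random effects. The function name, the parameter values, and the example trajectory are all hypothetical.

```python
import math

def class_posteriors(y, times, classes, sigma):
    """Posterior class probabilities for one individual's observed trajectory.

    classes is a list of (prior, intercept, slope) tuples, one per trajectory
    class. Residuals are assumed independent normal with standard deviation
    sigma (a simplifying assumption; no within-class random effects).
    """
    log_weights = []
    for prior, intercept, slope in classes:
        log_lik = math.log(prior)
        for y_t, t in zip(y, times):
            mu = intercept + slope * t  # class-specific mean at time t
            log_lik += (-0.5 * math.log(2 * math.pi * sigma ** 2)
                        - (y_t - mu) ** 2 / (2 * sigma ** 2))
        log_weights.append(log_lik)
    # Normalize on the log scale for numerical stability.
    largest = max(log_weights)
    weights = [math.exp(lw - largest) for lw in log_weights]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical classes: a low stable class (prior 0.8) and a high
# increasing class (prior 0.2).
classes = [(0.8, 1.0, 0.0), (0.2, 3.0, 0.5)]
times = [0, 1, 2, 3]
y = [3.1, 3.4, 4.2, 4.4]  # made-up trajectory close to the second class
posteriors = class_posteriors(y, times, classes, sigma=1.0)
print(posteriors)
```

In an actual analysis the class parameters would be estimated jointly with the class probabilities, and intervention effects would be allowed to differ by class.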
2. Multilevel modeling
A typical setting for a preventive intervention or treatment study involves individuals measured repeatedly over time within groups such as classrooms or clinics. The variation across time, individual, and group leads to multilevel modeling (3-level modeling). Observations are correlated across time because they are collected on the same individuals and are correlated within group due to sharing the same group characteristics. Existing multilevel analysis takes into account the resulting non-independence of observations. The analysis studies variation in individual development and variation across groups in individual-level relationships. Examples of applications include growth in child aggression related to neighborhood context.
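The non-independence described above can be illustrated with a small Python sketch. It assumes the simplest three-level model, with random intercepts only at the group and individual levels, and shows the correlations this implies between two observations. The function name and the variance values are hypothetical.

```python
def implied_correlations(var_group, var_individual, var_occasion):
    """Correlations implied by a three-level random intercept model.

    Returns a pair: the correlation between two occasions for the same
    individual, and the correlation between observations on two different
    individuals in the same group.
    """
    total = var_group + var_individual + var_occasion
    same_individual = (var_group + var_individual) / total
    same_group = var_group / total
    return same_individual, same_group

# Hypothetical variance components for classroom, student, and occasion.
within_person, within_group = implied_correlations(0.1, 0.5, 0.4)
print(within_person, within_group)
```

Even a modest group-level variance produces a nonzero within-group correlation, which is why analyses that ignore the grouping tend to understate standard errors.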
3. Missing data modeling
Data are typically missing for some individuals at some of the time points in a study. Except for special cases, it is unlikely that data are missing completely at random (MCAR) and current technology considers two more realistic alternatives: missing at random (MAR) and non-ignorable missingness. MAR allows missingness to be predicted by variables that are observed for the individual, for example, baseline background variables. Existing analysis alternatives under MAR include maximum-likelihood estimation and Bayesian analysis using multiple imputations. Certain types of non-ignorable missingness can be studied using a pattern-mixture approach where, for example, individuals in different dropout patterns are allowed to have different parameter values. Other non-ignorable missing data mechanisms can be modeled using multiple imputations.
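The practical consequence of MAR can be seen in a small simulation. In the following Python sketch, missingness in the outcome depends only on an observed baseline variable. The complete-case mean is then biased, while an estimate that uses the baseline variable is not. The regression-based estimate here is a simple stand-in for the maximum-likelihood and multiple-imputation methods mentioned above, not an implementation of them, and the data-generating values and function names are made up for illustration.

```python
import random

def simulate(n=20000, seed=0):
    """Simulate baseline x, outcome y, and a MAR observation indicator."""
    rng = random.Random(seed)
    xs, ys, observed = [], [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        y = 2.0 + x + rng.gauss(0, 1)      # true mean of y is 2.0
        p_missing = 0.8 if x > 0 else 0.2  # missingness depends on x only
        xs.append(x)
        ys.append(y)
        observed.append(rng.random() >= p_missing)
    return xs, ys, observed

def complete_case_mean(ys, observed):
    """Mean of y among individuals with observed outcomes."""
    kept = [y for y, o in zip(ys, observed) if o]
    return sum(kept) / len(kept)

def regression_mean(xs, ys, observed):
    """Fit y on x in observed cases, then average predictions over everyone."""
    xo = [x for x, o in zip(xs, observed) if o]
    yo = [y for y, o in zip(ys, observed) if o]
    mx, my = sum(xo) / len(xo), sum(yo) / len(yo)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xo, yo))
             / sum((x - mx) ** 2 for x in xo))
    intercept = my - slope * mx
    return sum(intercept + slope * x for x in xs) / len(xs)

xs, ys, observed = simulate()
cc_estimate = complete_case_mean(ys, observed)
reg_estimate = regression_mean(xs, ys, observed)
print(cc_estimate, reg_estimate)
```

Under non-ignorable missingness, where dropout depends on the unobserved outcome itself, neither estimate would be trustworthy without additional modeling.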
4. Modeling non-normal and time-to-event outcomes in longitudinal data
In mental health and addiction studies, outcome variables are often not normally distributed continuous variables. Instead, distributions are non-symmetric, skewed, and kurtotic, or categorical with ordered (ordinal) categories. Frequently, a large proportion of the sample is found in the lowest or highest response category. Sometimes, the outcomes are counts for rare events. In other situations, the timing of an event is recorded and the focus is on predicting the time to event, for example time to heroin use following completion of methadone treatment. Extensions of growth models for such outcomes build on random effects logistic regression models, zero-inflated Poisson (ZIP) models, semi-continuous models, and discrete- and continuous-time survival models.
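As a brief illustration of one of these building blocks, the zero-inflated Poisson distribution mixes a point mass at zero with an ordinary Poisson count. The Python sketch below gives its probability function; the function name and the parameter values are chosen for illustration only.

```python
import math

def zip_pmf(k, lam, pi):
    """Probability of count k under a zero-inflated Poisson distribution.

    pi is the probability of a structural zero and lam is the Poisson mean
    for the remaining individuals.
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        return pi + (1 - pi) * poisson
    return (1 - pi) * poisson

# With pi = 0.3 and lam = 2, zeros are more common than a Poisson allows.
probabilities = [zip_pmf(k, lam=2.0, pi=0.3) for k in range(60)]
zip_mean = sum(k * p for k, p in enumerate(probabilities))
print(probabilities[0], zip_mean)
```

The mean of the distribution is (1 - pi) * lam, so the excess zeros lower the mean relative to a plain Poisson with the same lam.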
II. The Need for New Developments
Following is a summary of measurement and design issues that need to be studied in future research and related statistical problems.
1. Measurement and Design Issues
Pre-intervention measures. The introduction of growth mixture modeling has an impact on measurement and design choices. To maximally benefit from the possibility of estimating intervention effects that vary across trajectory classes, it is important to have sufficient information before an intervention starts for the purpose of deducing likely class membership. This implies that it is useful to have at least two pre-intervention measurement occasions for the outcome to gain a notion of not only pre-intervention level but also trend. Other baseline information related to the classes is also important.
Participation and adherence measures. As a general term for an individual's level of participation in an intervention, adherence is only observed in those who are exposed to an intervention. Adherence can be, and often is, influenced by self-selection factors. In seeking to understand for whom an intervention works, we often wish to control for such individual differences. This leads us to consider adherence as a potentially observable measure, which would be observed if the individual is assigned to the intervention group. Thus for those randomized to a control setting, adherence can be considered as an always missing variable. We can thus treat adherence as a latent variable in the control group. As such, attempts should be made to measure this latent variable even if this can only be done with error. Researchers may seek indicators of adherence at baseline, asking subjects about their likelihood of participating in the study if invited to the intervention. Context-specific measurements may be considered.
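One simple way to see why latent adherence matters is the standard instrumental-variable estimate of the effect among adherers. The Python sketch below is not the latent variable approach described above. It assumes that controls cannot receive the intervention, that assignment affects outcomes only through participation, and that adherence is a binary status. The function name and the numbers are hypothetical.

```python
def adherer_effect(mean_assigned, mean_control, adherence_rate):
    """Intention-to-treat difference scaled by the adherence rate.

    Assumes no control-group access to the intervention and no effect of
    assignment on non-adherers (both are assumptions of this sketch).
    """
    if not 0 < adherence_rate <= 1:
        raise ValueError("adherence_rate must be in (0, 1]")
    intention_to_treat = mean_assigned - mean_control
    return intention_to_treat / adherence_rate

# Hypothetical trial: outcome means of 10.0 and 12.0, with half of those
# assigned to the intervention actually participating.
effect = adherer_effect(10.0, 12.0, 0.5)
print(effect)
```

Baseline indicators of likely adherence, as suggested above, would allow the latent adherence status of controls to be modeled directly rather than inferred only from the overall adherence rate.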
Missing data measures. In the missing data context, the analyses are complicated by non-ignorable missingness in the form of selective dropping out and different dropout rates for treatment and control groups. Here, attempts can be made to measure correlates of the reasons for dropping out of the study, perhaps by asking about the likelihood to stay in the study for one more occasion or by collecting information on background variables that may be related to dropout.
Designed changes in measurement instruments. Psychometrics considers multiple indicators of latent variables, for example an instrument providing teacher ratings of aggressive behavior in the classroom. With longitudinal data, there is a need to focus on "developmental psychometrics", studying items that are and are not sensitive to change over time. For example, an item may be a relevant measure at a certain age, but may indicate something different at another age. In the context of growth modeling of latent variable constructs repeatedly measured by multiple indicators, it is possible to let the same set of items vary across time in their measurement characteristics. What has not been fully utilized is to construct a design that anticipates systematic change of items over time, keeping a sufficient number of stable items the same for adjacent time points in order to study individual change on a well-defined scale.
New types of measures. Other important measures include behavioral observation data, medication information, information on clinical implications, neurobiological measures, and genetic information.
2. Statistical Issues
Multilevel growth mixture modeling. Preventive interventions and treatment studies call for further developments of growth modeling in cluster samples, i.e. three-level (and four-level) modeling extensions. As an example, growth mixture modeling currently exists for only two-level models and cannot fully explore the development of students measured over time within classrooms. General multilevel modeling also needs to allow for interactions involving latent variables. For example, aggressive behavior development of a student may be influenced by the initial status of the growth process, a latent variable, and this individual-level relationship may differ across different degrees of classroom-level aggression. This is a random slopes model of a new kind, representing an interaction involving both latent variables and variables observed on different levels.
Treatment studies. Multilevel extensions of growth mixture modeling are also useful in treatment studies, for example, with patients nested within physicians, clinics, or sites. Treatment studies present their own growth modeling challenges with staged treatments, taking into account factors such as patient and physician preferences.
Diagnostics for growth mixture models. The sophisticated growth mixture modeling approach relies on the construction of models that accurately reflect the data. Because these methods rely on unobserved or latent variables and classes, the examination of model fit is not straightforward. We have used a number of diagnostic methods to examine quality of fit of our models, including graphical methods based on empirical Bayes residuals and pseudoclasses.
Non-hierarchical data. Multilevel extensions of growth mixture modeling are also needed for data with crossed random effects and with changing cluster memberships, as when students move out of a school. Theories of mesosystems where different contextual systems interact also require new methods. Multilevel extensions of adherence modeling are needed for cluster-randomized studies.
Non-ignorable missing data. Non-ignorable missing data is probably the most common form of missing data, but its methodology is the least developed. This area needs considerable further study. For example, multilevel extensions of missing data modeling are needed for the common complication of differential attrition among treatment and control groups and among those who adhere and those who do not.
Low baserate disorders. Special methods are needed to examine preventive effect on low baserate disorders. There have been a number of important "finite sample" methods, such as Fisher's exact test, applied to data with small cell counts. These methods can be extended to situations where growth trajectories can be captured using latent variables and latent classes, as in the general growth mixture model described above. Exact tests then provide us with better testing of rare outcomes in these growth models. There is a more general problem, however. When outcomes are rare, the statistical power for examining preventive effect in a single study is often low, necessitating the combination of data across different trials. We are developing new methods to examine the combined impact and variation in impact of preventive interventions on completed suicide.
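For reference, the conventional finite sample method mentioned above can be sketched in a few lines of Python. The code computes the two-sided Fisher's exact test p-value for a 2x2 table by summing the hypergeometric probabilities of all tables that are no more likely than the observed one. It covers only the classical test, not the extensions to latent growth models, and the counts in the example are made up.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def table_probability(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = table_probability(a)
    # Sum tables at least as extreme; the small tolerance guards against
    # floating-point ties.
    return sum(p for p in (table_probability(x)
                           for x in range(lowest, highest + 1))
               if p <= p_observed * (1 + 1e-9))

# Made-up small table with all margins equal to 4.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(p_value)  # 34/70, about 0.486
```

With cell counts this small, a large-sample chi-square approximation would be unreliable, which is the motivation for exact methods with rare outcomes.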
Information-intensive longitudinal data. While many of our intervention trials collect repeated data on measures across an extensive part of the life course, some of the most important and complex data are collected over comparatively short periods. Micro-analytic behavioral observation data are used to study the interaction patterns between children and parents, teachers, or peers. Such data are critical in examining the immediate impact of an intervention; unlike the usual data on ratings by self, parents, teachers, or peers, all of whom participate in the interventions, behavioral observation data are far less sensitive to bias. So far researchers have not used the full power in behavioral observation data, often relying on z-test indices of individual departure from independence that have poor statistical properties. We have developed a number of methods that summarize a dyad's interactions into a random effect measure that can be used as either a predictor or response variable. Many methodological questions remain, including how to take into account event duration, use of longer sequences of behavior observation data, and design choices in collecting behavioral observation data. New methodologies are likely to draw on techniques from growth modeling, survival analysis, and latent variable techniques.
Genetic data. Longitudinal studies benefit from the inclusion of genetic information, for example, in studies of ADHD and alcohol problems. Here, new microarray expression data pose challenges for summarizing genetic information and relating it to the outcome in question. Analysis of microarray data also involves latent variable modeling, such as the McLachlan factor mixture analyzer.
III. Dissemination and Research Support Mechanisms
Methods dissemination and research support mechanisms were also discussed in the meeting.
1. Methods Dissemination
Many of the newer methodologies discussed in the meeting have had little exposure among substantive researchers. Relatively few articles in substantive journals use new methods such as growth mixture modeling and methods for low baserate disorders. Few didactic methods articles exist that explain the methods in practice. Such methods explications are necessary for the field to adopt the methods and should be published in appropriate journals read by prevention and treatment researchers.
2. Grants and Contracts
Research support is needed for methodology development that is closely connected with substantive research in preventive intervention and treatment studies. Funding sources are available for both purely statistical research and substantive studies, but what is needed is 1) support for methods research integrated with substantive work, and 2) training grants that bring together substantive researchers with methodologists around a common problem. These strategies are particularly important in light of a common dual problem. On the one hand, substantive research groups are in strong need of analytic help from individuals willing to immerse themselves in substantive problems. On the other hand, early-career methodologists need funding to support methods development that is inspired by substantive problems. To address these problems, different funding mechanisms should be explored, including post-doctoral positions, grant supplements, collaborative R01's, program projects, cooperative agreements, multidisciplinary centers, training grants, R03's, and K awards.
In addition to regular grants, the Small Business Innovation Research (SBIR) grant and contract mechanisms may be useful to develop and disseminate methodology. For example, SBIR's are useful to support the necessary software development of the new methods. SBIR's can also be useful for supporting efforts to disseminate new methods through much needed books describing computer analyses in practice and through web-based methods courses.
Some Recent PSMG References
Brown CH (2003). Design principles and their application in preventive field trials. In WJ Bukoski and Z Sloboda, Handbook of Drug Abuse Theory, Science, and Practice. New York: Plenum Press.
Brown CH and Faraone SV (in press). Prevention of Schizophrenia and Psychotic Behavior: Definitions and Methodologic Issues. To appear in WS Stone, SV Faraone, and MT Tsuang (Eds.), Early Clinical Intervention and Prevention in Schizophrenia, The Humana Press.
Carlin JB, Wolfe R, Brown CH, and Gelman A (in press). A case study on the choice, interpretation, and checking of multilevel models for longitudinal, binary outcomes. Accepted for publication in Biostatistics.
Dagne GA, Brown CH, and Howe GW (in press). Bayesian hierarchical modeling of heterogeneity in multiple contingency tables: An application to behavioral observation data. To appear in Journal of Educational and Behavioral Statistics.
Dagne GA, Howe GW, Brown CH, and Muthen BO (in press). Hierarchical modeling of sequential behavioral data: An Empirical Bayesian Approach. To appear in Psychological Methods.
Dishion TJ, Kavanagh K, Schneiger A, Nelson S, and Kaufman NK (2002). Preventing early adolescent substance use: A family-centered strategy for public middle school. Journal of Prevention Science, 3, 191-201.
Faraone SV, Brown CH, Glatt SJ, Tsuang MT (2002). Preventing schizophrenia and psychotic behaviour: definitions and methodological issues. Canadian Journal of Psychiatry, 47, 527-37.
Gluck J and Indurkhya A (2001). Assessing Changes in the Longitudinal Salience of Items Within Constructs. Journal of Adolescent Research, 16, 169-187.
Indurkhya A, Zayas, LH, and Buka, SL (in press). Sample size estimates for inter-rater agreement studies. To appear in Methods for Psychological Research.
Khoo, ST (2002). Assessing program effects in the presence of treatment-baseline interactions: A latent variable approach. Psychological Methods.
Muthen BO (2001). Beyond SEM: General Latent Variable Modeling. Behaviormetrika, 29, 81-117.
Muthén, B. (2002). Statistical and substantive checking in growth mixture modeling. Psychological Methods.
Muthén BO, Jo B, and Brown CH (in press). Assessment of treatment effects using latent variable modeling: Comments on the New York School Choice Study. Accepted for publication in Journal of the American Statistical Association.
Muthén, B., Brown, C.H., Masyn, K., Jo, B., Khoo, S.T., Yang, C.C., Wang, C.P., Kellam, S., Carlin, J., & Liao, J. (in press). General growth mixture modeling for randomized preventive interventions. Forthcoming in Biostatistics.
Snyder, J., & Stoolmiller, M. (2002). Reinforcement and coercion mechanisms in the development of antisocial behavior: The family. In J. B. Reid, G. R. Patterson, & J. Snyder (Eds.), Antisocial behavior in children and adolescents: A developmental analysis and model for intervention (pp. 65-100). Washington, DC: American Psychological Association.