Effectiveness study
The effectiveness of the four therapies was compared in the "intention-to-treat" (ITT) sample, which gives the clinical effect of the treatment policy. The data contained repeated measurements of the outcome variables. The primary analyses were based on the assumption of ignorable dropouts. In secondary analyses, missing values were replaced by multiple imputation (Rubin 1987). For continuous outcome variables, the statistical analyses were based on linear mixed models (Verbeke and Molenberghs 1997); for binary outcomes, they were based on logistic regression models and generalized estimating equations (GEE, Liang and Zeger 1986). Model-adjusted statistics using predictive margins were calculated for different design points (Lee 1981, Graubard and Korn 1999). Absolute means and their differences were estimated for continuous outcomes, and prevalences and relative risks/odds ratios for binary outcomes. The delta method was applied to calculate confidence intervals (Migon and Gamerman 1999). Statistical significance was tested with the Wald test.
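As a simplified illustration of the delta-method confidence interval and the Wald test, the following Python sketch computes an unadjusted relative risk from two raw proportions. It is not the model-adjusted predictive-margins calculation used in the study; the function name `relative_risk_ci` and the counts are hypothetical.

```python
import math

def relative_risk_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Relative risk of group A vs. group B with a delta-method interval.

    Hypothetical helper for illustration: unadjusted, two independent groups.
    """
    p_a = events_a / n_a
    p_b = events_b / n_b
    rr = p_a / p_b
    # Delta method: Var(log p_hat) is approximately (1 - p) / (n * p)
    # for a binomial proportion, and the two groups are independent.
    se_log_rr = math.sqrt((1 - p_a) / (n_a * p_a) + (1 - p_b) / (n_b * p_b))
    # Build the interval on the log scale, then transform back.
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    # Wald statistic for the null hypothesis log(RR) = 0.
    wald_z = math.log(rr) / se_log_rr
    return rr, lower, upper, wald_z

# Made-up counts: 30/100 events in group A, 20/100 in group B.
rr, lower, upper, wald_z = relative_risk_ci(30, 100, 20, 100)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}, Wald z = {wald_z:.2f}")
```

Working on the log scale keeps the interval positive and asymmetric around the point estimate, which is the usual choice for ratio measures.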
Cost-effectiveness analyses were performed using the incremental cost-effectiveness ratio (ICER), which is the difference in the mean costs of two therapies divided by the difference in their mean effectiveness (Drummond et al. 2005). The effectiveness of the therapies was estimated by calculating the area under the curve (AUC), which is the mean value of the outcome variable during the follow-up period (Pruessner et al. 2003). The confidence intervals for the ICERs were estimated using bootstrap methods. The log-transformed costs were modeled by a linear regression model with the treatment group as the independent variable. Multidimensional sensitivity analyses were performed.
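The ICER and AUC definitions above can be sketched in Python as follows. This is a minimal illustration, not the study's implementation: the percentile bootstrap is an assumption (the text says only "bootstrap methods"), the function names are hypothetical, and all cost and effect values are made up.

```python
import random
import statistics

def mean_auc(times, values):
    """Trapezoidal area under the outcome curve divided by follow-up length,
    i.e. the time-averaged value of the outcome."""
    area = sum((t1 - t0) * (v0 + v1) / 2
               for t0, t1, v0, v1 in zip(times, times[1:], values, values[1:]))
    return area / (times[-1] - times[0])

def icer(costs_a, effects_a, costs_b, effects_b):
    """Difference in mean costs divided by difference in mean effectiveness."""
    d_cost = statistics.mean(costs_a) - statistics.mean(costs_b)
    d_effect = statistics.mean(effects_a) - statistics.mean(effects_b)
    return d_cost / d_effect

def bootstrap_icer_ci(group_a, group_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval (an assumed variant).

    Each group is a list of (cost, effect) pairs, resampled patient by patient
    so that the cost-effect pairing is preserved.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        sample_a = rng.choices(group_a, k=len(group_a))
        sample_b = rng.choices(group_b, k=len(group_b))
        costs_a, effects_a = zip(*sample_a)
        costs_b, effects_b = zip(*sample_b)
        estimates.append(icer(costs_a, effects_a, costs_b, effects_b))
    estimates.sort()
    lower = estimates[int(alpha / 2 * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Made-up (cost, effect) pairs; every effect in group A exceeds every effect
# in group B, so the resampled effect difference is never zero.
group_a = [(5200, 9.1), (4800, 8.4), (6100, 10.2), (5500, 9.6)]
group_b = [(1900, 5.2), (2300, 6.1), (2100, 5.8), (1700, 4.9)]
print(bootstrap_icer_ci(group_a, group_b))
```

In real data the effect difference can be close to zero in some replicates, which makes the ratio unstable; that situation needs separate handling and is not covered by this sketch.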
Efficacy approximation
Proxy estimation of efficacy was carried out using "as-treated" (AT) analyses, taking into account compliance and use of auxiliary treatment. Also, Bayesian inference (Gelman et al. 1995) and dynamical models (Eerola et al. 2003, Commenges and Gégout-Petit 2009), which account for the dynamic interdependency between the outcome and auxiliary treatment processes during the follow-up, were applied.
Sufficiency study
The prevalence and incidence of auxiliary treatments were considered indicators of sufficiency of the therapies provided and were used as outcome variables in the effectiveness studies related to sufficiency. The comparison of prevalences of auxiliary treatment was carried out using the same methods as in the effectiveness study. The incidence of auxiliary treatment was modeled using Cox's regression (Cox 1972).
Suitability study
The possible differential prediction of the outcome of psychotherapies of different type or length based on certain patient-, therapist-, or therapy-related factors, measured at baseline or during the therapy process, was studied using the same methods as in the effectiveness study, with an interaction between the therapy group and the factor of interest as a predictor. Since the patients were not randomized with respect to these factors, potential confounding factors need to be adjusted for in the models. The results can be compared with evidence from the current literature using meta-analysis, in which measures of the strength of the association between the predictor and the outcome, such as the correlation coefficient, relative risk, and odds ratio, are pooled using random effects models (DerSimonian and Laird 1986, Knekt et al. 2004). When prediction based on patient-, therapist-, or therapy-related factors has been comprehensively studied, the relative importance of these factors can be compared using a Population Attributable Fraction (PAF) measure, which assesses the proportion of the psychotherapy outcome attributable to different factors (Laaksonen et al. 2010).
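The random effects pooling referred to above can be illustrated with the DerSimonian and Laird moment estimator. The Python sketch below assumes each study contributes an effect estimate on a common scale (for example a log odds ratio) together with its within-study variance; the function name and the input values are hypothetical.

```python
import math

def dersimonian_laird(estimates, variances):
    """Pool study-level estimates with the DerSimonian-Laird estimator.

    Returns the pooled estimate, its standard error, and the estimated
    between-study variance tau^2.
    """
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, estimates)) / sum_w
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    c = sum_w - sum(w * w for w in weights) / sum_w
    # Moment estimate of the between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c)
    # Random-effects weights add tau^2 to each within-study variance.
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, estimates)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    return pooled, se, tau2

# Made-up log odds ratios and variances from three hypothetical studies.
pooled, se, tau2 = dersimonian_laird([0.20, 0.55, 0.35], [0.04, 0.09, 0.02])
print(f"pooled = {pooled:.3f}, SE = {se:.3f}, tau^2 = {tau2:.3f}")
```

When the studies agree closely, tau^2 is truncated to zero and the result coincides with the fixed-effect inverse-variance estimate.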
Quality control
In the analysis of quality-control data, the strength of agreement between measurements and the repeatability of measurements were estimated as intraclass correlation coefficients: the kappa coefficient was used for categorical data (Fleiss 1981) and the reliability coefficient for continuous data (Winer 1971).
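As a simplified illustration of chance-corrected agreement for categorical ratings, the Python sketch below computes Cohen's kappa for two raters. This two-rater form is a swapped-in simplification, not necessarily the variant used in the study, which may have involved more raters; the function name and the ratings are hypothetical.

```python
from collections import Counter

def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who rated the same subjects."""
    n = len(ratings_a)
    # Observed proportion of subjects on which the raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    return (observed - expected) / (1 - expected)

# Made-up diagnoses (1 = present, 0 = absent) from two hypothetical raters.
rater_1 = [1, 1, 0, 0, 1, 0, 1, 0]
rater_2 = [1, 0, 0, 0, 1, 0, 1, 1]
print(f"kappa = {cohen_kappa(rater_1, rater_2):.2f}")
```

Kappa equals 1 for perfect agreement and 0 when agreement is no better than expected by chance.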
Program packages
The main statistical analyses were carried out using the MIXED, GENMOD and PHREG procedures of the SAS/STAT software and the IML procedure of the SAS/IML software (SAS Institute Inc. 2004). The Bayesian inference was conducted using the WinBUGS (Lunn et al. 2000) and OpenBUGS (Thomas et al. 2006) software packages.
The quality of the interview data was continuously controlled and evaluated in several separate designs (Knekt and Lindfors 2004). The two primary focuses of the quality-control designs were the evaluation of consistency of the assessments and methodological research, i.e. the evaluation of applicability, comparability, reliability and validity of the methods used and of the new measures developed in HPS. Agreement between raters and long-term stability of the ratings were evaluated in a sample of 39 video-recorded interviews, rated independently by 5 psychologists and 2 psychiatrists at two time points (baseline and 3-year follow-up). Methodological quality-control research comprised several sub-studies and focused on determining agreement between self-reported and interview-assessed psychiatric symptoms, comparing diagnoses based on semi-structured diagnostic interviews (Knekt and Lindfors 2004) and Structured Clinical Interviews for DSM-IV axis I and axis II disorders (SCID) (First et al. 1995, 1997), comparing different methods for computing overall indices of symptoms and functional capacity, assessing quality of proxy outcome assessments (PSQ, Table 1), evaluating reliability between self-rated and register-based information for the use of psychotropic medication, and assessing symptomatic improvement during waiting-time for therapy (Holi et al. 2003).
[2011, Nov 24th.]