
Description and quality of baseline data:
History of coronary heart disease, stroke and diabetes

Kari Kuulasmaa1, Matti Niemelä1 and Sangita Kulathinal1,2 for the MORGAM Project3

1 Department of Health Promotion and Chronic Disease Prevention, National Public Health Institute, Helsinki, Finland
2 Since January 2007 at Indic Society for Education and Development (INSEED), Nashik, India
3 See Annex for the sites and key personnel of contributing MORGAM Centres


© National Institute for Health and Welfare and the MORGAM Project investigators
Last updated: 4 July 2007
For more information, please contact Kari Kuulasmaa (firstname.lastname@thl.fi)


1. Data items considered

MORGAM collected data on the history of coronary heart disease, stroke and diabetes at the baseline examination. These data were collected by the MPCs using one or several of three procedures: baseline questionnaires, baseline examinations and linkage to disease or hospital discharge registers. The data were transferred to the MDC using the following data items of the Data transfer format - additional baseline data (Form 21):

For data analyses, five derived variables have been defined by the time of preparation of this document:

These data were not collected in the WHO MONICA Project, and they have been post-standardized for MORGAM from all MPCs. The procedures used for collecting these data in each Cohort are described in the Full descriptions of MORGAM Cohorts.

2. Approach to the description and quality assessment

First, the individual data items are assessed by:

Second, data from the different sources are compared to gain insight into the validity of the data on:

Finally, the above-mentioned derived variables are assessed and their usability is discussed.

2.1 Quality scores used for questionnaire items

For the data items based on questionnaire, the Insufficient Data Score (IDS), Questionnaire Comparability Score (QCS), and Questionnaire Summary Score (QSS) defined in the Introduction are used.

2.2 Quality scores used for documented history items

Documented history of disease

For the data items on documented history of disease, the above-mentioned Insufficient Data Score (IDS) and the following scores are used:

a) Cohort Coverage Score (CCS) summarizing the part of the cohort for which documented event history was sought is defined as:

CCS = 2 if documented history was sought for the entire study cohort,
1 if documented history was sought only for those with positive self-reported event history,
0 if documented history was sought only for an unjustifiable part of the study cohort or not sought at all.

b) Geographic Coverage Score (GCS) for sources of notification for documented event history is defined as:

GCS = 2 if sources of notification of documented history cover the whole country,
1 if sources of notification of documented history cover the study area and possibly surrounding areas but not the whole country,
0 if sources of notification of documented history cover an area smaller than the study area.

c) Period Coverage Score (PCS) for sources of notification for documented event history is defined as:

PCS = 2 if sources of notification for documented history cover 10 years or more prior to the baseline examination,
1 if sources of notification for documented history cover 6 years or more but less than 10 years prior to the baseline examination,
0 if sources of notification for documented history cover 5 years or less prior to the baseline examination.

d) Documented history Coverage Score (DCS) is now defined as the minimum of CCS, GCS and PCS.

e) Documented history Summary Score (DSS) is defined as:

DSS = 2 if both DCS and IDS are "2",
1 if either DCS or IDS is "1" but neither of them is "0",
0 if DCS or IDS is "0",
dnp if IDS is "dnp", i.e. if the MPC has not provided data on the item for any member of the cohort.
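
The rules for DCS and DSS above can be sketched in a few lines of Python. This is only an illustration of the definitions in this section: the function names `dcs` and `dss` and the representation of "dnp" as a string are assumptions made for the example, not part of the MORGAM data transfer format.

```python
from typing import Union

# A score is 0, 1 or 2, or the string "dnp" (data not provided).
Score = Union[int, str]


def dcs(ccs: int, gcs: int, pcs: int) -> int:
    """Documented history Coverage Score: the minimum of CCS, GCS and PCS."""
    return min(ccs, gcs, pcs)


def dss(dcs_value: int, ids: Score) -> Score:
    """Documented history Summary Score from DCS and IDS."""
    if ids == "dnp":
        # No data were provided for the item for any cohort member.
        return "dnp"
    if dcs_value == 0 or ids == 0:
        return 0
    if dcs_value == 2 and ids == 2:
        return 2
    # At least one of the scores is 1 and neither is 0.
    return 1


# Hypothetical cohort: whole cohort searched, regional register, long coverage.
coverage = dcs(ccs=2, gcs=1, pcs=2)
summary = dss(coverage, ids=2)
```

When both scores are numeric, the three rules reduce to taking the minimum of DCS and IDS; the explicit branches are kept here so that the code mirrors the wording of the definition.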

ECG changes indicating history of myocardial infarction

For the data item HISMI3, ECG changes indicating myocardial infarction, the ECG Comparability Score (ECS) was defined as:

ECS = 2 if the ECG criteria of MORGAM were used,
1 if different criteria were used, but the MORGAM codes can be reasonably derived,
0 if the MORGAM codes cannot be extracted from the locally used ECG criteria,
dnp if ECG was not recorded in the baseline examination.

Documented history Summary Score (DSS) is defined for the baseline ECG data in the same way as the Questionnaire Summary Score for questionnaire items using the ECG Comparability Score and Insufficient Data Score.

3. Assessment of the individual data items

3.1 Distributions of the data items

Hyperlinks to the distributions of the data items are under the respective data item names:

The tables show clearly which Cohorts have provided data for each of these items. Documented history of MI and stroke (HISMI1 and HISSTR1) has been provided only for FIN-ATB, FIN-EAS/WES, FRA-LIL/STR/TOU, ITA-FRI/SHE, LTU-KAU and SWE-NSW. Self-reported history of MI (HISMI2) is available for all Cohorts except Cohort 03 of ITA-BRI, Cohort 01 of ITA-ROM and all Cohorts of POL-WAR. Self-reported stroke (HISSTR2) is missing for Cohort 01 of ITA-ROM. In Cohorts 01, 02 and 21 of RUS-NOV only subsamples were interviewed, and history of stroke is coded missing for all except those who had reported having a stroke. History of unspecified type of coronary heart disease (HISUC) is available for Cohort 03 of ITA-BRI and all cohorts of POL-WAR, reflecting the fact that the question used in these cohorts does not distinguish between MI and stable angina (see below). In addition, data for this item have been reported for FRA-LIL/STR/TOU and for Cohorts 21, 22, 23 and 24 of ITA-ROM, although it is apparent that the item is irrelevant for these cohorts (see below). Baseline history of MI based on ECG (HISMI3) has been reported by about half of the cohorts, and MI based on the Rose questionnaire (HISMI4) by most of the cohorts. Documented history of cardiac revascularization (HISREV1) is available for only a few cohorts, but if self-reported data (HISREV2) are also considered, there is information on revascularization for about half of the cohorts.

Documented history of angina pectoris (HISAP1) is available for only a few cohorts, but self-reported history (HISAP3) is available for most of them. History of angina based on the Rose questionnaire (HISAP2) is available for most cohorts.

All cohorts except Cohorts 01 and 21 of RUS-NOV have reported history of diabetes (HISDIAB). The treatment of diabetes (TREDIAB) is also available for nearly all cohorts, surprisingly including the cohorts of RUS-NOV for which data on the history of diabetes are not available.

The derived variables can be calculated for nearly all cohorts. The exceptions are obvious from the availability of the original items above.

3.2 Assessment of questionnaire-based data items

The questions used locally by the MPCs are described in the Description of MORGAM Cohorts, and the details of the conversion rules from the local question(s) to the MORGAM data items are given in Appendix 2. Table 1 shows the quality assessment scores IDS, QCS and QSS for the data items. For the IDS of item TREDIAB, the denominator is the number of those who reported having diabetes in item HISDIAB (History of diabetes). Below we list the data items and cohorts for which the summary score (QSS) is less than two or for which there are other specific comments:

In summary, data on the history of MI, stroke and diabetes are available for nearly all cohorts, and the questions used correspond to the MORGAM specification. This does not, however, guarantee that the self-reported data are good, as can be seen in section Comparisons between data items below. For angina pectoris, self-reported data or data derived from the Rose questionnaire are available for nearly all cohorts, and the questionnaires nearly always correspond to the MORGAM specification. However, there are major problems concerning the validity of the data, as can also be seen in section Comparisons between data items below.

3.3 Assessment of the data on documented history items

The procedures used for obtaining these data are described in the Description of MORGAM Cohorts. Table 2 shows the quality assessment scores IDS, CCS, GCS, PCS and DSS for the data items HISMI1, HISREV1, HISAP1 and HISSTR1. Table 2 also shows the quality assessment scores IDS, ECS and DSS for the item HISMI3. Below we list the data items and cohorts for which the summary score (DSS) is less than two or for which there are other specific comments.

The assessment above is based on availability and the coverage of the search for documentation, but there are also differences in the diagnostic criteria used for the diseases. The sources of diagnosis included the hospital discharge diagnosis, the MONICA diagnosis and the PRIME diagnosis. Even within these sources, different diagnostic categories were included for different cohorts, but all of them represent sensible, although not unified, definitions of the diseases. The diagnostic criteria used are described in the cohort descriptions.

In summary, the data on the documented history of diseases are reasonable when available, but they are available for only a few cohorts.

4. Comparisons between the data items

4.1 History of MI

In general, we can consider documented history of MI (HISMI1) to be more reliable than self-reported history (HISMI2). The comparison between documented and self-reported history of myocardial infarction tells how well the items measure the same thing, and also gives an idea of how reliable the self-reported history might be in the cohorts where documented history is not available. The percentages for these variables, separately and together, are shown in Table 3. The table also shows the percentage of positive answers to the Rose questionnaire item on possible MI (HISMI4): "Have you ever had a severe pain across the front of your chest lasting for half an hour or more?" An ECG taken at baseline is a good indicator of the so-called Q-wave infarction, but it does not reveal milder MIs. Table 3 shows the percentage of MIs revealed through ECG (HISMI3) which were not revealed through the items on documented or self-reported MI. The main findings in Table 3 are:

In summary, it is reasonable to base a general definition of history of MI on documented (HISMI1) and self-reported (HISMI2) MI, although the agreement between these is not consistent across the cohorts where both data items are available. Further exploration of the differences in Finland, Sweden and Kaunas would be beneficial for the scientific use of these data.

4.2 History of revascularization

Both documented and self-reported history of revascularization (items HISREV1 and HISREV2) are available from ITA-FRI and ITA-FSE only. The comparison of these is shown in Table 4.

4.3 History of angina pectoris

The three data sources for history of stable angina pectoris are documented history (HISAP1), the Rose questionnaire (HISAP2) and self-report (HISAP3). Data on these are shown in Table 5. Data for all three items are available only from FRA-LIL/STR/TOU, where the percentage with documented history is much lower than the percentage with self-reported history, but the percentages for self-reported history and history based on the Rose questionnaire are reasonably similar. However, at the individual level there is very little overlap between these two. Documented history is also available from FIN-EAS/WES, where documented angina was only a small fraction of the self-reported history. Both Rose questionnaire and self-reported data are also available from GER-AUG, ITA-BRI, LTU-KAU, UNK-BEL and Scotland. In all of these (except LTU-KAU and UNK-BEL), the overlap between self-reported angina and angina derived from the Rose questionnaire was very low.

Overall, MORGAM has very little convincing information about stable angina.

4.4 History of stroke

As for the history of MI, we can consider documented history of stroke (HISSTR1) to be more reliable than self-reported history (HISSTR2). The percentages for these variables, separately and together, are shown in Table 6. Data for documented history of stroke are available for the cohorts from Finland and France (except FRA-TOU), Friuli, Kaunas, and Northern Sweden. For these cohorts, the main findings are:

In summary, there is remarkable inconsistency in the comparison between documented (HISSTR1) and self-reported (HISSTR2) stroke. Further exploration of the differences in Finland and Sweden would be beneficial for the scientific use of these data.

5. Derived variables

BASEMI1, BASESTR1 and BASECVD1 have been defined as widely available and reasonably reliable indicators of MI and stroke at baseline. They are intended for excluding baseline cases from analyses of incident events during follow-up, and also for identifying baseline cases when these are used as a study end-point. Self-reported history was included in these variables because documented history is available from only a few cohorts, and self-reported data were thought to be reasonably reliable. However, the Comparisons between data items above indicate that the agreement between documented and self-reported history is inconsistent across cohorts. This inconsistency is moderate for MI and more serious for stroke. Therefore, there are reservations about the use of these derived variables in data analysis.

Most of those with documented MI or stroke also had a self-reported MI or stroke. Therefore, the assessments of HISMI2 and HISSTR2 above also reflect well the assessment of BASEMI1, BASESTR1 and BASECVD1.

History of revascularization is also a strong indicator of CHD, and its inclusion should be considered at least when variables are needed for excluding baseline cases from analysis of the data. The usefulness of revascularization is, however, diminished by the fact that it is available from only about half of the cohorts. History of angina pectoris would also be of interest, but documented history is widely available only from FRA-LIL/STR/TOU and Cohorts 03 and 24 of FIN-EAS/WES, and partly available from Cohort 02 of FIN-EAS/WES. Self-reported angina is widely available, but it has been considered far too unreliable.

BASEMI2 and BASECVD2 are refinements of BASEMI1 and BASECVD1, which differ only for Cohort 03 of ITA-BRI and all cohorts of POL-WAR. For these cohorts, BASEMI2 and BASECVD2 include unspecified CHD (i.e. MI or angina pectoris), and make the data from these cohorts useful for many analyses where BASEMI1 and BASECVD1 are not applicable to them. The choice between these two sets of variables will have to be made separately for each analysis, depending on what is desirable from these cohorts in the particular analysis.

To summarise, BASEMI1, BASEMI2, BASESTR1, BASECVD1 and BASECVD2 are perhaps the best general variables that can be derived for the previous history of MI, stroke and CVD. However, the information in these variables is not precise, and therefore they should not be used blindly, but with the requirements of each analysis in mind.
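
The logic of these derived variables can be illustrated with a short Python sketch. It assumes that each variable is positive when any of its source items is positive, which is consistent with the description above but is not the formal MORGAM definition. The function names, the use of `True`/`False`/`None` in place of the MORGAM codes, and the handling of missing values are all assumptions made for the example.

```python
from typing import Optional

# True = history present, False = no history, None = item missing.
Item = Optional[bool]


def any_positive(*items: Item) -> Item:
    """Assumed rule: positive if any item is positive, negative if none is
    positive and at least one is negative, missing if all are missing."""
    if any(item is True for item in items):
        return True
    if any(item is False for item in items):
        return False
    return None


def base_mi1(hismi1: Item, hismi2: Item) -> Item:
    """Documented or self-reported MI."""
    return any_positive(hismi1, hismi2)


def base_mi2(hismi1: Item, hismi2: Item, hisuc: Item) -> Item:
    """As base_mi1, but also counting unspecified CHD (HISUC)."""
    return any_positive(hismi1, hismi2, hisuc)


def base_str1(hisstr1: Item, hisstr2: Item) -> Item:
    """Documented or self-reported stroke."""
    return any_positive(hisstr1, hisstr2)


# Hypothetical subject from a cohort that recorded only unspecified CHD.
mi1 = base_mi1(None, None)
mi2 = base_mi2(None, None, True)
```

In this hypothetical case `mi1` is missing while `mi2` is positive, which is the situation in which the second set of variables makes such a cohort usable.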

6. Discussion

Unlike many baseline data items, the data on history of MI, stroke and diabetes were not collected in the WHO MONICA Project, and were post-standardized in MORGAM using a common data transfer format. Nevertheless, most of the local baseline questionnaires had comparable questions on the history of these diseases. MORGAM also collected data on documented MI and stroke, but such data were available from only a few MPCs. In these MPCs, nearly all of the subjects with a documented disease also had self-reported disease, but the converse was not true. This suggests that those with a diagnosed MI or stroke event were aware of it, but that many of those who had not had such an event may have misunderstood the question and therefore answered wrongly. Another possible explanation is that these people had had a milder form of cardiovascular disease. It might be possible to assess the discrepancy using the MORGAM follow-up data. Those with cardiovascular disease are known to have a high risk of future MI or stroke. Comparison of the observed risk among those with no self-reported MI or stroke, those with self-reported but not documented MI or stroke, and those with documented MI or stroke might give insight into the background of those with self-reported but not documented MI or stroke.

A recent population-based study in the USA found a 90% sensitivity and 73% positive predictive value (PPV) for the agreement between self-reported and documented MI. For stroke the values were 78% and 67% respectively [1]. These findings are in line with the MORGAM observations, although the PPVs were generally much lower in MORGAM. The MONICA/KORA Augsburg Project, representing the same Cohorts as GER-AUG in MORGAM, examined the agreement between self-reported and documented MI during the follow-up contact with the cohort members. Sensitivity and PPV were 98% and 72% respectively. Among the false positive self-reports, the primary diagnosis was CHD in 42%, cardiac procedure in 14%, disease that might cause chest discomfort (heart failure, arrhythmia) in 29% and diseases not concerning the heart in 15% [2]. In a Finnish study, conducted in a similar way to that in Augsburg, a sensitivity of 88% and a PPV of 74% were found for MI [3]. The values for stroke (including transient ischaemic attack) were 60% and 67%.
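
For readers less familiar with these measures, the following Python sketch shows how sensitivity and PPV are obtained when documented history is taken as the reference and self-report as the test. The counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from MORGAM or from the studies cited, and the function name is an assumption of the example.

```python
from typing import Tuple


def sensitivity_and_ppv(
    both: int, documented_only: int, self_reported_only: int
) -> Tuple[float, float]:
    """Return (sensitivity, PPV) of self-report against documented history.

    both               -- documented and self-reported (true positives)
    documented_only    -- documented but not self-reported (false negatives)
    self_reported_only -- self-reported but not documented (false positives)
    """
    documented_total = both + documented_only
    self_reported_total = both + self_reported_only
    if documented_total == 0 or self_reported_total == 0:
        raise ValueError("sensitivity or PPV is undefined for these counts")
    sensitivity = both / documented_total
    ppv = both / self_reported_total
    return sensitivity, ppv


# Hypothetical counts: 90 subjects with both, 10 documented only,
# 30 self-reported only.
sens, ppv = sensitivity_and_ppv(both=90, documented_only=10, self_reported_only=30)
print(f"sensitivity = {sens:.0%}, PPV = {ppv:.0%}")
```

A high sensitivity combined with a low PPV corresponds to the pattern described above: nearly all subjects with documented disease also report it, but many self-reports have no documented counterpart.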

A typical use of these data is for excluding the subjects who have CVD at baseline from follow-up analysis. Documented and self-reported MI and stroke together (i.e. derived variable BASECVD2) should be a good exclusion criterion for such a purpose. Another use of the data is to classify those with the disease at baseline as cases. BASEMI1 and BASESTR1 should be good variables for this, but a stricter criterion includes only documented MI and documented stroke.

Angina pectoris was known to be a more difficult data item, but the observed inconsistency between self-reported angina and angina pectoris derived from the Rose questionnaire was perhaps bigger than one would have expected. Consequently, it is not easy to find serious use for the data on stable angina pectoris. However, the Finnish validation study mentioned above found a surprisingly high sensitivity of 82% and PPV of 75% for angina pectoris [3].

MORGAM does not provide alternative data sources for the information on history of diabetes. The American study quoted above reported a sensitivity of 66% and a positive predictive value of 94% for self-reported diabetes [1]. Therefore, in this study, self-reports indicated a lower prevalence of diabetes than the medical records. Most of the false negative responses reported borderline diabetes, whereas their medical records indicated diabetes. In a literature review, Newell et al. found three studies comparing self-reported and documented diabetes [4]. The sensitivity of self-reports varied between 68% and 80%, and the positive predictive value varied between 44% and 76%. In the Finnish validation study, the sensitivity and PPV for diabetes or high blood glucose were 80% and 75% respectively [3]. This experience from the literature suggests that the accuracy of the self-reported data on diabetes is similar to that on MI and stroke.

7. Comments on individual RUAs

The following list summarizes specific findings or exceptional background information relevant for the use of the data:

AUS-NEW:

DEN-GLO:

FIN-ATB:

FIN-EAS/WES:

FRA-LIL/STR/TOU:

GER-AUG:

ITA-BRI/PAM:

ITA-FRI/SHE:

ITA-ROM:

LTU-KAU:

POL-TAR:

POL-WAR:

RUS-NOV:

SWE-NSW:

UNK-BEL:

UNK-CAE:

UNK-EDI/GLA/SHH:

References

  1. Okura Y, Urban LH, Mahoney DW, Jacobsen SJ, Rodeheffer RJ. Agreement between self-report questionnaires and medical record data was substantial for diabetes, hypertension, myocardial infarction and stroke but not for heart failure. J Clin Epidemiol. 2004;57(10):1096-103.
  2. Meisinger C, Schuler A, Lowel H, for the MONICA/KORA Group. Postal questionnaires identified hospitalizations for self-reported acute myocardial infarction. J Clin Epidemiol. 2004;57(9):989-92.
  3. Haapanen N, Miilunpalo S, Pasanen M, Oja P, Vuori I. Agreement between questionnaire data and medical records of chronic diseases in middle-aged and elderly Finnish men and women. Am J Epidemiol. 1997;145(8):762-9.
  4. Newell SA, Girgis A, Sanson-Fisher RW, Savolainen NJ. The accuracy of self-reported health behaviors and risk factors relating to cancer and cardiovascular disease in the general population: a critical review. Am J Prev Med. 1999;17(3):211-29.

Updates to this document

Date Update
2007-07-04 First published version.