WWW-publications from the WHO MONICA Project

Quality Assessment of Data on Marital Status and Educational Achievement in the WHO MONICA Project

December 1998

Anu Molarius1, Kari Kuulasmaa1, Vladislav Moltchanov1 and Marco Ferrario2 for the WHO MONICA Project3

1 MONICA Data Centre, National Public Health Institute, Helsinki, Finland
2 Research Centre for Chronic Degenerative Diseases, Institute of Biomedical Sciences San Gerardo, University of Milan, Milan, Italy
3 Annex: Sites and key personnel of the WHO MONICA Project


© Copyright World Health Organization (WHO) and the WHO MONICA Project investigators 1999. All rights reserved.

This document includes the main findings of unpublished reports:


Acknowledgements

Thanks are due to Alun Evans who commented on the text.

The MONICA Centres are funded predominantly by regional and national governments, research councils, and research charities. Coordination is the responsibility of the World Health Organization (WHO), assisted by local fund raising for congresses and workshops. WHO also supports the MONICA Data Centre (MDC) in Helsinki. Not covered by this general description is the ongoing generous support of the MDC by the National Public Health Institute of Finland, and a contribution to WHO from the National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland, USA for support of the MDC. The completion of the MONICA Project is generously assisted through a Concerted Action Grant from the European Community. Likewise appreciated are grants from ASTRA Hässle AB, Sweden, Hoechst AG, Germany, Hoffmann-La Roche AG, Switzerland, the Institut de Recherches Internationales Servier (IRIS), France, and Merck & Co. Inc., New Jersey, USA, to support data analysis and preparation of publications.


MONICA items considered in this document

MARIT    Marital status
EDLEVEL  Level of education
SCHOOL   Years of schooling

1. Introduction

Education and marital status are the only social variables included in the core data of the WHO MONICA Project population surveys. MONICA collects questionnaire data on the subject's marital status, highest educational level completed and number of years of schooling. These items are used mainly as stratifying variables for the other survey variables when describing socio-economic differences in risk factor levels, their trends and the associations between risk factors, and as covariates in the MONICA data analyses.

The initial survey quality assessment of data on marital status and educational achievement was distributed in January 1994 (4) and the middle survey quality assessment in May 1996 (5). The present report gives an assessment of the data quality in all three MONICA surveys by repeating the main findings of the earlier quality assessment reports and assessing the quality of these data in the final MONICA survey. The report presents the current knowledge of the questions asked about marital status and educational achievement, the procedures used by the MONICA Collaborating Centres (MCCs) to fulfill the standardization criteria and the quality of the data available in the MONICA Data Centre (MDC). Several MCCs responded to the initial and middle survey quality assessment reports, giving their comments and clarifications. Such comments have been taken into account in preparing this report.

The quality assessment of the initial survey revealed problems of comparability of educational achievement between populations and between different age groups within populations. These problems arise because of apparent differences in the meanings of the educational categories between the populations, and changes in the educational systems within the populations. In addition, the middle survey quality assessment report observed that several populations had changed the questionnaire items on educational level between the initial and middle surveys. At the time of the initial survey quality assessment report, there were no good solutions available to these problems. Since then a reasonably satisfactory approach, based on tertiles of years of schooling, has been developed. This approach was described in detail in the middle survey quality assessment report (5) and is repeated in this report.

2. Material and methods

2.1 Populations

The results of this document are reported by Reporting Unit Aggregates (RUA), which are potential units of analysis of the MONICA data. The RUAs, their abbreviations and Reporting Units are listed in Table 1. Some of the RUAs have several versions because different combinations of Reporting Units (RU) may be used for cross-sectional and trend analyses if not all RUs of the population were included in all of the surveys. Therefore, in AUS-PER, GER-BRE, GER-EGE, GER-KMS, GER-RDM, RUS-MOI and RUS-NOC there is an overlap of RUs included in the RUAs in some surveys. The RUAs are identified by the abbreviation and a version letter. For UNK-GLA, which carried out four surveys, the first (initial), third (middle) and fourth (final) surveys are considered.

In contrast to other survey quality assessment reports, GER-EGE, GER-KMS and GER-RDM have been split into smaller RUAs in the current document because of the use of different questionnaires in the different RUs. Altogether 57 RUAs are considered for the initial MONICA survey, 45 for the middle survey, and 42 for the final survey.

2.2 Sources of information

The quality assessment is based on the following information:

2.3 Age and sex

For the quality analyses all observations within the age group 25-64 were used, except in FRA-LILa in the final survey and in AUS-NEWa, BEL-LUXa, FRA-STRa, FRA-TOUa, LTU-KAUa, POL-TARa, POL-WARa, RUS-MOCa, RUS-MOIa, RUS-MOIb and SWI-TICa, where the age range studied was 35-64. Age was defined as age in full years on the date of examination (see DEF1 in reference 1). In Tables 6-7 the age range was exceptionally 35-64 years. Tables 7-10 present the results separately for men and women. No weighting was applied to the data.
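The age definition and age range above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the exact rule in DEF1 of reference 1 is not reproduced here, so "age in full years" is assumed to mean completed years on the examination date.

```python
from datetime import date


def age_in_full_years(birth: date, exam: date) -> int:
    """Completed years of age on the examination date (assumed reading of DEF1)."""
    years = exam.year - birth.year
    # Subtract one year if the birthday has not yet occurred in the exam year.
    if (exam.month, exam.day) < (birth.month, birth.day):
        years -= 1
    return years


def in_analysis_range(age: int, lower: int = 25, upper: int = 64) -> bool:
    """True if the age falls in the analysed range (25-64 by default, 35-64 in some RUAs)."""
    return lower <= age <= upper
```

For a RUA where only ages 35-64 were studied, the same check would be called with lower=35.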

3. Assessment of the survey questionnaires

The standard questions and response categories for marital status and education are recommended in the MONICA Manual (Dec 1986, Nov 1990 and March 1992)(2). Marital status is referred to in item MARIT of the data transfer format (Form 04, versions 3,6,7), educational level in item EDLEVEL and years of schooling in item SCHOOL:

MARIT Marital status
1 = single
2 = married or cohabitating
3 = separated or divorced
4 = widowed
5 = other
9 = insufficient data

Indicate marital status as follows:

1 = Single, for persons who have never been married.
2 = Married or cohabiting, for persons currently married or cohabitating.
3 and 4 are for currently single persons who were previously married.
If code 5 is used the MCC must prepare a manual list of the options specified.
Code 9 for insufficient data.

EDLEVEL "What is the highest level of education you have completed?"
1 = university or college or equivalent
2 = intermediate between secondary level and university (e.g. technical training)
3 = secondary school
4 = primary school only (or less)
9 = insufficient data

Codes 1, 2 and 3 are self-explanatory. The lowest code should take precedence.
Code 4 if primary school only or if less than primary school.
Code 9 for insufficient data.

SCHOOL "How many years have you spent at school or in full time study?"
99 = insufficient data

Code the number of years. For the years 1 - 9 use codes 01 - 09.
Code 99 for insufficient data.

There has been no change in the recommended questions or the instructions for coding of these items during the MONICA monitoring period. The MONICA Manual includes no specific instructions as to whether these questions should be asked in interviews or by self-administered questionnaires. In the instructions for the transfer of the item MARIT it is added that "single" (code 1) should be used for persons who have never been married, "married or cohabitating" (code 2) for persons "currently married or cohabitating", and code 3 and 4 for "currently single persons who were previously married". No additional instructions are given for the item EDLEVEL (with the exception that the lowest code should have priority) and for the item SCHOOL.

3.1 Availability of data

Table 2 gives an overview on whether and how the data on items concerning marital status and educational achievement were gathered. The table is based on the original questionnaires used in the MCCs or on their English translations which were available in the MDC and on the Survey Procedures Questionnaire (Form VI).

In the initial survey, data on MARIT were not collected in nine RUAs (CAN-HAL, GER-COT, GER-EGEd, GER-ERF, GER-KMSc, GER-RDMe, HUN-BUD, HUN-PEC and SWE-GOT). Data on EDLEVEL were not collected in nine RUAs (GER-COT, GER-EGEd, GER-ERF, GER-KMSc, GER-RDMe, HUN-BUD, HUN-PEC, NEZ-AUC and SWE-GOT). Data on SCHOOL were not collected in 13 RUAs (AUS-NEW, AUS-PERa, CAN-HAL, GER-COT, GER-EGEd, GER-ERF, GER-KMSc, GER-RDMe, HUN-BUD, HUN-PEC, LTU-KAU, NEZ-AUC and SWE-GOT). In USA-STA, the level of education was derived from years of schooling. DEN-GLO derived years of schooling from two questions on educational level. GER-RHN has reported that data on items MARIT and SCHOOL have been collected since 1987 (for about one third of the respondents), but the data have not been received in the MDC.

The availability of data had improved considerably by the middle survey. Only one RUA (HUN-PEC) did not measure marital status, educational level or years of schooling in the middle survey. For three RUAs (AUS-NEW, AUS-PERa, AUS-PERb) no data on years of schooling are available: AUS-PERa and AUS-PERb did not ask about years of schooling, and AUS-NEW had a question about years of schooling, but the response was coded into five categories so that data for the item SCHOOL could not be derived from the answers.

In the final survey, all RUAs measured marital status and level of education. In NEZ-AUC the level of education was inferred from several questions and in USA-STA it was derived from years of schooling, as in the initial and middle surveys. Four RUAs (AUS-PERa, AUS-PERb, GER-ERF, NEZ-AUC) did not ask about years of schooling.

The RUAs did not change their ways of administering the survey questionnaire. The questionnaire was self-administered in 15 RUAs and administered by an interviewer in 41 RUAs. One RUA used a combination of an interview-mediated and self-administered questionnaire.

3.2 Correspondence of the questions and answer categories

3.2.1 Marital status

Table 3 shows whether the item on marital status in the data transferred to the MDC in the initial survey corresponded exactly to the question in the original questionnaire or whether the answers were recoded, if necessary, for the data transfer. The table also indicates whether there was a change in the question between the initial and the middle surveys, and between the middle and final surveys.

In the initial survey, there were discrepancies in the question on marital status in more than half of the RUAs when compared to the data transfer format. The categories most frequently omitted were "cohabitating" (AUS-NEW, AUS-PERa, GER-BREa, ISR-TEL, LTU-KAU, MLT-MLT, RUS-MOC, RUS-MOIa, RUS-MOIb, RUS-NOCa, RUS-NOCb, RUS-NOI, UNK-GLA, USA-STA), "separated" (CHN-BEI, ICE-ICE, ISR-TEL, LTU-KAU, RUS-NOCa, RUS-NOCb, RUS-NOI) and "other" (AUS-NEW, AUS-PERa, FIN-KUO, FIN-NKA, FIN-TUL, FRA-STR, GER-AUR, GER-AUU, GER-BREa, GER-RHN, ISR-TEL, MLT-MLT, NEZ-AUC, RUS-MOC, RUS-MOIa, RUS-MOIb, RUS-NOCa, RUS-NOCb, RUS-NOI, SWI-TIC, SWI-VAF, UNK-GLA, USA-STA). POL-TAR and POL-WAR coded "cohabitating" as "married", and "divorced" and "separated" were coded as "other" in the MCC but recoded for transfer. In CZE-CZE "separated" was unusual and was coded as "divorced". In UNK-BEL the categories were not specified on the questionnaire but taken from a separate coding list. DEN-GLO combined marital status from answers to two questions. Several RUAs used more answer categories than required (BEL-CHA, BEL-GHE, GER-AUR, GER-AUU, NEZ-AUC, POL-TAR, POL-WAR, RUS-NOC, RUS-NOI, SWE-NSW, UNK-GLA) and combined them afterwards for data transfer. The way in which the categories were combined for the data transfer is unknown for GER-AUR, GER-AUU, NEZ-AUC and SWE-NSW.

Some RUAs changed their question on marital status between the initial and middle survey, mainly by adding the option "cohabitating" to the category for married (AUS-NEW, RUS-MOC, RUS-MOIa, UNK-GLA). ICE-ICE added "separated" to the category for divorced, and RUS-NOCa and RUS-NOI included the option "other". RUS-MOC and RUS-MOIa excluded "separated" from their questionnaire. SWE-NSW combined the extra categories used in the initial survey to correspond to the MONICA standard, and YUG-NOS omitted the option "other". UNK-BEL specified the answer categories in the questionnaire. Six RUAs (GER-COT, GER-EGEd, GER-ERF, GER-KMSc, HUN-BUD, SWE-GOT) which did not ask MARIT in the initial survey, added the question in the standard format to their questionnaire.

Only a few RUAs changed their question on marital status for the final survey. AUS-PERa, AUS-PERb, GER-BREa and GER-BREb added "cohabitating" to the category for married, CHN-BEI added "separated" and FRA-STR added the option "other". GER-ERF excluded "cohabitating", "separated" and "other", and AUS-NEW omitted "cohabitating". CAN-HAL, which did not ask about marital status in the initial survey, added the question in the final survey.

3.2.2 Educational level

Many RUAs had more answer categories for educational level than required by the MONICA transfer format (Table 4). In the initial survey, information about the specific recoding procedures used for combining those categories for data transfer is missing for several RUAs (DEN-GLO, FRA-STR, FRA-TOU, GER-AUR, GER-AUU, GER-BREa, GER-RHN, ISR-TEL, ITA-LAT, RUS-MOC, RUS-MOIa, RUS-MOIb, RUS-NOCa, RUS-NOCb, RUS-NOI, SPA-CAT). As observed in (4) and (5), the categories for educational level evidently did not have the same meaning in every MCC. CHN-BEI used the term "professional" instead of "intermediate" and several RUAs replaced "intermediate" by "technical college" (GER-BER, GER-EGEc, GER-HAC, GER-KMSd, GER-RDMf). FIN-KUO, FIN-NKA and FIN-TUL used the term "vocational" instead of "secondary school", and also in GER-BER, GER-EGEc, GER-HAC, GER-KMSd and GER-RDMf the definition of secondary school deviated from the standard. DEN-GLO combined two questions to derive educational level and USA-STA derived the item directly from the question for years of schooling. SWI-TIC and SWI-VAF used an open question for educational level. In three RUAs (AUS-NEW, AUS-PERa, POL-WAR), the intermediate category was omitted from the questionnaire.

In several RUAs there was a change in the question concerning educational level between the initial and middle survey. GER-AUU, GER-AUR and YUG-NOS added the option "no education" and AUS-NEW added an option for "technical college". Likewise, some RUAs reduced the number of options. RUS-NOCa and RUS-NOI excluded one extra option, and RUS-MOC and RUS-MOIa combined additional categories into the MONICA standard. The RUAs of MONICA East Germany which asked educational level in the initial survey (GER-BER, GER-EGEc, GER-HAC, GER-KMSd) converted the option for secondary school to correspond to the MONICA standard. UNK-BEL introduced a category with combined elementary and secondary school levels (coded as "secondary school") and excluded the option "primary school only".

Several RUAs also changed their question in the final survey. CAN-HAL, FRA-LIL, FRA-STR, FRA-TOU and GER-ERF used more answer categories. CZE-CZE added two options and LTU-KAU added "not completed university". UNK-BEL added a category for primary school only, and RUS-MOC and RUS-MOIa dropped the category for primary school. In NEZ-AUC, level of education was combined from several questions.

3.2.3 Years of schooling

The question regarding years of schooling appears less complicated than the question regarding level of education (Table 5). The only major discrepancy was that a number of RUAs did not explicitly use the term "full time study" in their questionnaire (BEL-CHA, BEL-GHE, CZE-CZE, ISR-TEL, POL-TAR, POL-WAR, RUS-MOC, RUS-MOIa, RUS-MOIb, RUS-NOCa, RUS-NOCb, RUS-NOI, YUG-NOS). Several of them did, however, ask for years spent at school, high school and university which usually implies full time study (RUS-MOC, RUS-MOIa, RUS-MOIb). Also, we do not always know whether "full time" study included business schools or professional training. USA-STA asked for "the highest year of formal education". GER-BREa calculated years of schooling from educational level and DEN-GLO derived years of schooling from the two categorical variables which were used to define educational level. FRA-STR asked about the duration of full time studies since the age of 14 years, but recoded the data for transfer. In CHN-BEI the exact wording of the question is not clear.

Apart from the RUAs which added the question to their middle survey questionnaire (GER-COT, GER-EGEd, GER-ERF, GER-KMSc, HUN-BUD, LTU-KAU, SWE-GOT) there were no significant changes in the questionnaires regarding years of schooling between the initial and middle surveys. Only RUS-NOCa and RUS-NOI included "full time study" in their questionnaire.

In the final survey, AUS-NEW and CAN-HAL added the question on years of schooling to their questionnaire, and GER-ERF dropped it. In FRA-STR and CHN-BEI the question corresponded to the MONICA standard.

4. Data-based assessment of marital status and education

4.1 Routine data checking

The MDC checks all population survey core data received from the MCCs at the time they are included in the MONICA database. The following constraints concern marital status and education:

MARIT_LIMITS_4
Accepted values for MARIT are 1,2,3,4,5 and 9.
EDLEVEL_LIMITS_4
Accepted values for EDLEVEL are 1,2,3,4 and 9.
EDLEVEL_SCHOOL_4
If 0 < SCHOOL < 4 then EDLEVEL = 4 or 9.
If EDLEVEL = 1 then SCHOOL=99 or 10 < SCHOOL < 30.
SCHOOL_LIMITS_4
If SCHOOL < 99 then 0 < SCHOOL < 30.

All violations of these constraints were reported to the MCC for their correction or confirmation. Data values outside the constraint limits were acceptable, but the MCC had to check that the values were not due to data errors. The MCCs were asked to correct values only if they knew that they were incorrect. The currently unresolved constraint violations concerning data on marital status and education are listed by survey in Appendix 1. There are only a few unresolved constraint violations.
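The four constraints listed above can be read as simple record-level checks. The following Python sketch is a hypothetical illustration of that reading, not the MDC's actual checking software; the function name and the integer coding of the arguments are assumptions.

```python
def check_constraints(marit: int, edlevel: int, school: int) -> list:
    """Return the names of the constraints violated by one survey record."""
    violations = []

    # MARIT_LIMITS_4: accepted values are 1, 2, 3, 4, 5 and 9.
    if marit not in {1, 2, 3, 4, 5, 9}:
        violations.append("MARIT_LIMITS_4")

    # EDLEVEL_LIMITS_4: accepted values are 1, 2, 3, 4 and 9.
    if edlevel not in {1, 2, 3, 4, 9}:
        violations.append("EDLEVEL_LIMITS_4")

    # SCHOOL_LIMITS_4: if SCHOOL < 99 then 0 < SCHOOL < 30.
    if school < 99 and not (0 < school < 30):
        violations.append("SCHOOL_LIMITS_4")

    # EDLEVEL_SCHOOL_4, first rule: if 0 < SCHOOL < 4 then EDLEVEL = 4 or 9.
    rule_a = not (0 < school < 4) or edlevel in {4, 9}
    # EDLEVEL_SCHOOL_4, second rule: if EDLEVEL = 1 then SCHOOL = 99 or 10 < SCHOOL < 30.
    rule_b = edlevel != 1 or school == 99 or 10 < school < 30
    if not (rule_a and rule_b):
        violations.append("EDLEVEL_SCHOOL_4")

    return violations
```

A non-empty result only flags a record for confirmation by the MCC; as described above, a flagged value is not necessarily an error.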

4.2 Missing data

The information on missing data on marital status and education (Table 6) is based on the data currently available in the MDC. The table shows that missing data were not a major problem in these MONICA surveys.

Marital status: ITA-LAT had 15% missing data in the initial survey and SWE-GOT 10% in the final survey. With the exception of those RUAs which did not measure this item, all other RUAs had 2% or less of values missing.

Educational level: ITA-LAT had 15% of data missing in the initial survey and SWE-GOT 13% in the middle survey. Apart from the RUAs which did not ask this question, all other RUAs had less than 10% of data missing for this item.

Years of schooling: High proportions of missing data were observed in ITA-LAT (66%) and RUS-MOIb (29%) in the initial survey. The high proportion of missing data in RUS-MOIb is due to the fact that SCHOOL was missing for more than half of the respondents in RU 3. SWE-GOT had 12% and 16% of missing values in the middle and final survey, respectively. Apart from the RUAs which did not ask this question, all other RUAs had less than 10% of data missing.

Large changes in the proportion of missing data between surveys could cause quality problems for trend analyses. Excluding those RUAs which added the questions on marital status and/or education in their middle or final survey, there was a more than 5% change in missing data between surveys in three RUAs only. In FIN-TUL the proportion of missing data on years of schooling decreased from 6% in the middle survey to 0% in the final survey, in FRA-LIL the proportion of missing data on educational level decreased from 7% in the initial survey to 0% in the final survey, and in SWE-GOT the proportion of missing data on marital status increased from 1% in the middle survey to 10% in the final survey.
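A minimal sketch of how the proportions of missing data and their changes between surveys might be computed is given below. It assumes that "insufficient data" codes (9 for MARIT and EDLEVEL, 99 for SCHOOL) count as missing and that the 5% threshold refers to a difference in percentage points; both the function names and these readings are assumptions.

```python
def percent_missing(values, missing_code):
    """Percentage of records carrying the 'insufficient data' code."""
    if not values:
        raise ValueError("no records")
    return 100.0 * sum(v == missing_code for v in values) / len(values)


def large_change(earlier_pct, later_pct, threshold=5.0):
    """True if missing data changed by more than `threshold` percentage points."""
    return abs(later_pct - earlier_pct) > threshold
```

With the figures quoted above, large_change(1.0, 10.0) flags SWE-GOT for marital status between the middle and final surveys.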

4.3 Data-based assessment of educational level

To gain insight into the cross-cultural comparability of the different educational levels and changes within RUAs between the surveys, the association between individual years of schooling and educational level was analyzed in each RUA. The analysis also intended to detect peculiarities or contradictions in the distributions of high and low educational levels in the RUAs and changes between the surveys.

Table 7 shows the age-standardized proportions of educational levels and mean years of schooling (and standard deviation) by educational level for ages 35-64 in the three surveys for men and women by RUA. There was considerable variation in the years of schooling by level of education across RUAs in all surveys.

In the initial survey, the mean years of schooling ranged from 4.2 years (CHN-BEI women) to 11.5 years (GER-BREa men) at the primary school level. At the secondary school level, the mean years of schooling ranged from 7.6 years (ITA-LAT women) to 14.6 years (GER-BER men), at the intermediate level from 10.2 years (ISR-TEL women) to 16.1 years (BEL-GHE men), and at the university level from 9.5 years (ITA-LAT women) to 21.8 years (SPA-CAT women). The mean years of schooling in women in ITA-LAT at the university level is strikingly low suggesting a coding error. Extremely low proportions of persons with primary school level only were found in UNK-BEL and UNK-GLA (9% and 0% of men), whereas more than 70% of men in FIN-KUO, FIN-NKA, GER-AUR, GER-AUU, GER-BREa, GER-RDMf and SPA-CAT had only primary school education.

In the middle survey, the mean years of schooling ranged from 3.7 years (CHN-BEI women) to 11.3 years (GER-AUR and GER-AUU men) at the primary school level. At the secondary school level, the mean years of schooling ranged from 8.1 years (ITA-BRI women) to 14.1 years (CZE-CZE men), at intermediate level from 11.3 years (GER-KMSc women) to 16.8 years (GER-AUR men), and at university level from 14.4 years (GER-EGEd men and GER-ERF women) to 20.1 years (SPA-CAT women).

More than a 10% change in the distribution of proportions of educational level between initial and middle survey was observed in a large number of RUAs (AUS-NEW, AUS-PERa, CHN-BEI, DEN-GLO, GER-BER, GER-EGEc, GER-HAC, GER-KMSd, ICE-ICE, LTU-KAU, POL-TAR, RUS-MOC, RUS-MOIa, RUS-NOCa, RUS-NOI, SWE-NSW for men and AUS-PERa, CHN-BEI, DEN-GLO, GER-BER, GER-BREa, GER-EGEc, GER-HAC, GER-KMSd, LTU-KAU, RUS-MOC, RUS-MOIa, RUS-NOCa, RUS-NOI, SWE-NSW and SWI-TIC for women). For some RUAs this was due to a change in the questionnaire or in the coding practices. AUS-NEW added the option for intermediate level to the middle survey questionnaire. It seems that what LTU-KAU coded as intermediate in the initial survey was coded as secondary school in the middle survey. GER-BER, GER-EGEc, GER-HAC and GER-KMSd changed the definition of secondary school level. For the rest of the RUAs the reason for the change in the distribution is not known.

In the final survey, the mean years of schooling ranged from 3.2 years (AUS-NEW women) to 14.6 years (UNK-BEL women) at the primary school level. At the secondary school level, the mean years of schooling ranged from 8.2 years (ITA-BRI men and women) to 13.9 years (CZE-CZE men), at intermediate level from 11.4 years (SWE-NSW women) to 16.6 years (GER-AUR men), and at university level from 15.1 years (CHN-BEI and SWE-NSW women) to 19.0 years (ICE-ICE men and ITA-FRI men). The mean years of schooling in women in UNK-BEL at the primary school level is extremely high suggesting a coding error.

More than a 10% change in the distribution of proportions of educational level between middle and final survey was observed in 16 RUAs (BEL-CHA, BEL-GHE, CAN-HAL (between initial [Ini] and final [Fin]), CHN-BEI, DEN-GLO, FRA-LIL (between Ini and Fin), FRA-STR (between Ini and Fin), FRA-TOU, GER-EGEd, ICE-ICE, ITA-FRI, RUS-MOC, RUS-MOIa, RUS-NOCa, RUS-NOI, SWE-GOT) among men and in 21 RUAs (BEL-CHA, BEL-GHE, CHN-BEI, DEN-GLO, FIN-KUO, FIN-NKA, FRA-LIL (between Ini and Fin), FRA-STR (between Ini and Fin), FRA-TOU (between Ini and Fin), GER-EGEc, GER-EGEd, GER-ERF, ITA-FRI, RUS-MOC, RUS-MOIa, RUS-NOCa, RUS-NOI, SWE-GOT, SWE-NSW, SWI-VAF, YUG-NOS) among women. In CAN-HAL, FRA-LIL, FRA-STR, FRA-TOU, GER-ERF, RUS-MOC and RUS-MOIa there was a change in the questionnaire. For the rest of the RUAs the reason for the change in the distribution is not known.

The mean years of schooling by educational level was more stable than the distribution of educational level itself. More than a one year change in the mean years of schooling for at least one educational level was observed in 13 RUAs (BEL-CHA, BEL-GHE, CZE-CZE, GER-AUR, GER-BER, GER-EGEc, GER-HAC, GER-KMSd, POL-TAR, RUS-MOC, RUS-MOIa, SPA-CAT, SWE-NSW) in men and in 17 RUAs (BEL-CHA, FIN-KUO, GER-AUR, GER-AUU, GER-BER, GER-EGEc, GER-HAC, GER-KMSd, RUS-MOC, RUS-MOIa, RUS-NOCa, RUS-NOI, SPA-CAT, SWE-NSW, SWI-TIC, SWI-VAF, UNK-BEL) in women between the initial and middle survey. More than one year change was observed between the middle and final survey in 12 RUAs (BEL-GHE, CZE-CZE, GER-EGEc, GER-EGEd, ICE-ICE, RUS-MOIa, RUS-NOCa, RUS-NOI, SPA-CAT, SWE-GOT, UNK-BEL and UNK-GLA) in men and in 17 RUAs (BEL-CHA, FIN-KUO, FRA-LIL (between Ini and Fin), FRA-STR (between Ini and Fin), FRA-TOU (between Ini and Fin), GER-EGEc, GER-EGEd, ICE-ICE, LTU-KAU, RUS-MOC, RUS-MOIa, RUS-NOCa, SPA-CAT, SWE-GOT, SWI-TIC, UNK-BEL and UNK-GLA) in women. More than a one year change in the mean years of schooling by educational level is usually statistically significant. Such a change can be due to a change in the questionnaire, a change in the coding procedures or a change in the educational system over time.

4.4 Within cohort trends in years of schooling

As reported above, changes in age-standardized mean years of schooling by educational level occurred in several RUAs between surveys. This can sometimes be explained by changes in educational systems and therefore by differences in education between birth cohorts. The mean years of schooling should, however, be relatively stable within a birth cohort since most of the subjects have passed the age of receiving formal education. If the questionnaires and the data collection procedures did not change between the surveys, there should be no essential changes in the distribution of years of schooling within birth cohorts.

We investigated the stability of years of schooling by calculating the differences in mean years of schooling in 10-year birth cohorts (i.e. people born during the same 10-year period) between the surveys. Ten-year birth cohorts were defined by the years of birth corresponding closest to the 10-year age groups 25-34, 35-44 and 45-54 in the middle of the initial survey in each RUA. Table 8 gives the differences in mean years of schooling between the three surveys by these birth cohorts. DEN-GLO was excluded from this analysis because it surveyed men and women of ages 30, 40, 50 and 60 years and did not therefore examine the same birth cohorts in the three surveys. Also, AUS-PERb and GER-BREb, which did not do the initial survey, and RUAs which carried out or asked years of schooling in one survey only were excluded from this analysis.
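The within-cohort comparison can be illustrated with a short Python sketch. The report does not state how the standard errors were obtained, so the formula below (two independent samples, unequal variances) is an assumption, as are the function names and the use of a single calendar year to represent the middle of the initial survey.

```python
from math import sqrt
from statistics import mean, variance


def cohort_birth_years(initial_survey_year, age_low, age_high):
    """Birth years corresponding roughly to an age group at the initial survey."""
    return range(initial_survey_year - age_high, initial_survey_year - age_low + 1)


def mean_difference(earlier, later):
    """Change in mean years of schooling (later minus earlier) and its standard error.

    Assumes the two survey samples are independent; each needs at least two records.
    """
    diff = mean(later) - mean(earlier)
    se = sqrt(variance(earlier) / len(earlier) + variance(later) / len(later))
    return diff, se
```

For example, with an initial survey centred on 1985, the cohort matching ages 35-44 would consist of people born in 1941-1950, and the same birth years would be selected from the middle and final survey data.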

To identify the RUAs where there is possibly a bias in the cohort trend, either through a measurement bias or a change in the population which the sample represents, a Cohort Trend Score (CTS) was defined. The score is based on the estimated changes and their standard errors for men and women in the common age groups 35-44 and 45-54, in three steps:

  1. The average change (A) was calculated for each sex/birth cohort. These are shown at the end of Table 8. This average change was used as the reference value around which the random variation was expected to occur.
  2. Upper and lower limits were set for a change as A ± 2.5 SE, where SE is the standard error of the estimated change. If a change is normally distributed with mean A and variance SE², then the probability that all four changes (two birth cohorts and two sexes) are within the limits is approximately 95%, provided the four changes are independent. The observed changes which are outside the limits have been marked with an asterisk (*) in Table 8.
  3. The Cohort Trend Score (CTS) is defined as:
CTS = 2 if all four changes are within limits;
1 if one of the four changes is out of limits;
0 if at least two of the four changes are out of limits.

If the score is 2, there is no evidence of bias between the surveys. If the score is 1, there may be a bias, at least concerning the representativeness of the sample in some sex/birth cohort. A score 0 is a signal of a more general bias.
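The three steps can be sketched in Python as follows. The function name and argument layout are hypothetical; the second function checks the 95% figure under the additional assumption that the four changes are independent and normally distributed.

```python
from statistics import NormalDist


def cohort_trend_score(changes, std_errors, averages, k=2.5):
    """CTS for one RUA from its four sex/birth cohort changes.

    changes[i]    : observed change in mean years of schooling
    std_errors[i] : standard error of that change
    averages[i]   : average change A for the same sex/birth cohort
    """
    out_of_limits = sum(
        abs(change - avg) > k * se
        for change, se, avg in zip(changes, std_errors, averages)
    )
    if out_of_limits == 0:
        return 2
    if out_of_limits == 1:
        return 1
    return 0


def prob_all_within(k=2.5, n_changes=4):
    """Probability that all changes fall within A +/- k*SE, assuming independence."""
    z = NormalDist()
    return (z.cdf(k) - z.cdf(-k)) ** n_changes
```

Under these assumptions prob_all_within() is about 0.95, which is consistent with the choice of 2.5 standard errors for four simultaneous comparisons.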

The score was 0 in three RUAs (CHN-BEI, RUS-NOI and SPA-CAT) between the initial and middle survey, in four RUAs (RUS-NOCa, RUS-NOI, SPA-CAT and UNK-BEL) between the middle and final survey, and in seven RUAs (FRA-LIL, FRA-STR, GER-EGEc, RUS-MOIa, RUS-NOCa, RUS-NOI and SPA-CAT) between the initial and final survey. In addition, in two RUAs (BEL-CHA and ICE-ICE) there were significant changes in the youngest birth cohorts which were not taken into account in the score. The MCCs in question were asked to try to find out whether the observed changes were due to measurement bias or due to changes in the representativeness of the respondents.

5. How to use data on education in data analysis?

The only social variables in the MONICA population survey core data are, thus, education and marital status. While marital status can be used as a crude indicator of an individual's "social integration" or "social isolation", education can be viewed as a marker of the socio-economic structure of the population and as an indicator of an individual's social status. The usefulness of the data on marital status is very limited in MONICA, because in the age groups considered nearly everybody is married or cohabitating in most RUAs. In MONICA the data on education will be needed when describing socio-economic differences in the risk factor levels, their trends and associations between the risk factors, and as a covariate in analyses of the risk factors.

In the MONICA data there are two variables on education: level of education and years of schooling. The data on level of education is difficult to use for most purposes because of major differences in the distributions of the item between the RUAs. Such differences are mainly due to different educational systems in the countries and apparent differences in the interpretation of the different levels. For trend analyses there is a further complication caused by changes in the educational system in many countries. Similar problems concern the use of years of schooling as a continuous variable or, if categorized, using fixed cut points.

These problems were already discussed in the quality assessment of the initial (4) and middle survey (5). To overcome these problems a solution based on three categories of years of schooling was introduced in the middle survey quality assessment. The categories are defined so that they are approximately equal in size within each survey in each RUA. The following principles of grouping the years of schooling into three categories were recommended:

  1. The cut points are defined separately for each sex/10-year age group for each survey in each RUA. In this way the categories become adjusted for the differences in educational systems between the RUAs and for changes in the educational systems within the RUAs.
  2. Each of the three categories should have approximately one third of the respondents of the sex/10-year age group of the survey in the RUA. In this way we can guarantee that we have a reasonable number of observations in the categories in all RUAs.
  3. The cut points should define a unique categorization of the subjects. This can be done by defining the cut points as decimals of years (e.g. 9.5) rather than integers.

Applying these principles, the cut points were chosen between whole years of schooling so that the proportions in the two extreme categories were as close as possible to one third. Because the distributions of years of schooling are clumped at certain values, the cut points were adjusted, where necessary, to ensure that each of the two extreme categories has at least 15% of the subjects. The categories of years of schooling, separately for each of the three surveys, by 10-year age group in each RUA are presented in Table 9. In some RUAs the clumping of the distributions was so strong that only the two extreme categories could be identified, leaving the middle category empty (e.g. age group 55-64 in GER-EGEc, GER-HAC, GER-KMSd and ITA-BRI). In the two oldest age groups among women in GER-RDMf no cut points could be found for which both extreme categories contained at least 15% of the subjects.
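The selection of cut points described above can be illustrated with a minimal Python sketch for a single sex/10-year age group in one survey of one RUA. The function names, the tie-breaking rule (the lowest candidate wins) and the treatment of crossing cut points as "not determinable" are assumptions made for illustration; they are not taken from the MONICA procedures.

```python
from collections import Counter

def tertile_cut_points(years, min_share=0.15):
    """Return (lower, upper) half-year cut points, or None if they cannot be determined.

    `years` holds whole years of schooling for one sex/age group/survey/RUA.
    """
    n = len(years)
    counts = Counter(years)
    # Candidate cut points lie half a year above each observed value except the largest.
    candidates = [v + 0.5 for v in sorted(counts)[:-1]]

    def share_below(c):
        return sum(k for v, k in counts.items() if v < c) / n

    def share_above(c):
        return sum(k for v, k in counts.items() if v > c) / n

    def best(share):
        # Keep candidates that leave at least `min_share` in the extreme category,
        # then take the one whose share is closest to one third.
        ok = [c for c in candidates if share(c) >= min_share]
        return min(ok, key=lambda c: abs(share(c) - 1 / 3)) if ok else None

    lower, upper = best(share_below), best(share_above)
    # Assumption: crossing cut points are treated as "could not be determined".
    if lower is None or upper is None or lower > upper:
        return None
    return lower, upper

def schooling_category(value, cuts):
    """Map whole years of schooling to category 1 (lowest), 2 or 3 (highest)."""
    lower, upper = cuts
    if value < lower:
        return 1
    return 3 if value > upper else 2
```

When both cut points coincide, the middle category is empty, which corresponds to the strongly clumped distributions mentioned above.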

An alternative to looking at trends in risk factors within age groups is to estimate trends within groups of birth cohorts. This approach was first applied in the analysis of smoking trends (3), where the birth cohorts were defined by 5-year groups of years of birth. The years of birth were defined for each RUA in such a way that they corresponded approximately to the 5-year age groups in the initial survey. Estimation of trends in birth cohorts is particularly suitable when the analysis is stratified by categories of education, because the distribution of the years of schooling is expected to be essentially unchanged within the cohorts between the surveys. Therefore, in such analyses, the cut points can be chosen from pooled data from the three surveys.

Table 10 shows for each RUA the years of birth of the 5-year cohorts and the cut points for the years of schooling. Data from all available surveys were pooled by birth cohort to define the cut points for years of schooling. For CHN-BEI, FRA-LIL, GER-BER, RUS-MOC and RUS-MOIa the years of birth of the cohorts were not exactly the same for men and women due to differences in the mean years of examination by sex in the initial survey. In the oldest birth cohort among men in GER-RDMf the cut points could not be determined so that the two extreme categories would have contained at least 15% of the subjects.
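As a rough illustration of the cohort-based pooling, the following Python sketch derives 5-year birth cohorts from the mean examination year of the initial survey and pools years of schooling from all surveys within each cohort. The function names, the record layout and the default age range of 25-64 are assumptions for illustration only; the exact rule used to derive the years of birth in Table 10 is not given here.

```python
from collections import defaultdict

def cohort_bounds(initial_exam_year, youngest=25, oldest=64, width=5):
    """Birth-year ranges matching the 5-year age groups of the initial survey."""
    bounds = []
    for age_lo in range(youngest, oldest + 1, width):
        age_hi = age_lo + width - 1
        # Someone aged age_lo..age_hi at examination was born in this range (approximately).
        bounds.append((initial_exam_year - age_hi, initial_exam_year - age_lo))
    return bounds

def pool_by_cohort(records, bounds):
    """Pool years of schooling over all surveys within each birth cohort.

    `records` is an iterable of (survey, birth_year, years_of_schooling) tuples.
    Subjects born outside every cohort are left out.
    """
    pooled = defaultdict(list)
    for _survey, birth_year, school_years in records:
        for first, last in bounds:
            if first <= birth_year <= last:
                pooled[(first, last)].append(school_years)
                break
    return dict(pooled)
```

The pooled list for each cohort (separately for men and women) would then serve as the input for choosing the cut points for years of schooling.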

6. Summary and recommendations

In the initial survey, all but 9 RUAs (CAN-HAL, GER-COT, GER-EGEd, GER-ERF, GER-KMSc, GER-RDMe, HUN-BUD, HUN-PEC and SWE-GOT) collected data on marital status. Data on educational level was collected in all but 9 RUAs (GER-COT, GER-EGEd, GER-ERF, GER-KMSc, GER-RDMe, HUN-BUD, HUN-PEC, NEZ-AUC and SWE-GOT) and data on years of schooling in all except 13 RUAs (AUS-NEW, AUS-PERa, CAN-HAL, GER-COT, GER-EGEd, GER-ERF, GER-KMSc, GER-RDMe, HUN-BUD, HUN-PEC, LTU-KAU, NEZ-AUC and SWE-GOT). In the middle survey, all RUAs except one (HUN-PECa) collected data on marital status and level of education, and all except four (AUS-NEW, AUS-PERa, AUS-PERb, HUN-PEC) on years of schooling. In the final survey, all RUAs collected data on marital status and educational level and all but four (AUS-PERa, AUS-PERb, GER-ERF, NEZ-AUC) on years of schooling. In addition, data on marital status and years of schooling are not available for one RUA (GER-RHN) in the initial survey.

Data concerning marital status and education were very complete. Only one RUA (ITA-LAT 16%) had a high proportion of missing data on marital status and educational level and two RUAs (ITA-LATa 66% and RUS-MOIb 29%) on years of schooling in the initial survey. Large changes in the proportion of missing data between the surveys were also uncommon.

There were no major problems concerning the comparability of data on marital status between the RUAs and between the surveys within the RUAs. The usefulness of the data on marital status is, however, limited in MONICA because in the age groups considered nearly everybody is married or cohabitating in most RUAs. The data on education are more useful but also more problematic. The question on the highest completed level of education has major problems of comparability, both between RUAs within surveys and between surveys within RUAs. Problems of comparability between RUAs are mostly due to differences in educational systems, in interpretation of the different levels and/or differences in coding procedures. Problems of comparability between the surveys are induced by changes in the questionnaires and coding procedures. Furthermore, changes in educational systems induce changes in the distributions of level of education between birth cohorts.

The question on years of schooling has fewer problems of comparability between RUAs and between the surveys within RUAs than the question on educational level, because there were no major differences in the question between RUAs and no major changes in the question and coding procedures between the surveys within RUAs. However, even this item is not completely problem free. Not all RUAs explicitly used the term "full time" study in their questionnaire, and there may be differences between RUAs in what is considered full time study, for example whether professional training was included or not. In addition, even though the distribution of years of schooling is expected to be unchanged within birth cohorts, changes were observed in several RUAs. The MCCs concerned were asked to carefully review the possible reasons for the changes in birth cohorts.

The most satisfactory approach developed to deal with education in data analysis in MONICA is to group years of schooling into three approximately equal categories within each RUA. The categories of years of schooling can be defined in two ways: either based on each sex/10-year age group or on each sex/birth cohort. The age group based categories can be used both in cross-sectional and trend analyses and are calculated separately for each survey. The birth cohort based categories are suitable for trend analysis, especially when the trends are calculated within birth cohorts and stratified by education, since pooled data from all surveys can be used to define the cut points for the categories. With this relative grouping of years of schooling, whether based on age groups or on birth cohorts, most of the problems of comparability can be overcome because the categories become adjusted for the differences in the educational systems between RUAs and for the changes in the educational systems within the RUAs.

The relative grouping of years of schooling should be used in collaborative MONICA analyses involving several RUAs. In analyses involving individual RUAs or countries the use of EDLEVEL may also be appropriate. Analyses applying the relative grouping of years of schooling cannot include the RUAs which did not collect data on item SCHOOL (AUS-NEW, AUS-PERa, CAN-HAL, GER-COT, GER-EGEd, GER-ERF, GER-KMSc, GER-RDMe, HUN-BUD, HUN-PEC, LTU-KAU, NEZ-AUC and SWE-GOT in the initial survey; AUS-NEW, AUS-PERa, AUS-PERb and HUN-PEC in the middle survey; and AUS-PERa, AUS-PERb, GER-ERF and NEZ-AUC in the final survey). Excluding the RUAs which had more than 20% of missing data for the item (ITA-LATa and RUS-MOIb in the initial survey) is strongly recommended.

7. Comments on individual RUAs

The following list includes only the RUAs with specific findings or exceptional background information relevant to the use of data.

AUS-NEW

AUS-PERa

AUS-PERb

BEL-CHA

BEL-GHE

BEL-LUX

CAN-HAL

CHN-BEI

CZE-CZE

DEN-GLO

FIN-KUO

FIN-NKA

FIN-TUL

FRA-LIL

FRA-STR

FRA-TOU

GER-AUR

GER-AUU

GER-BER

GER-BREa

GER-BREb

GER-COT

GER-EGEc

GER-EGEd

GER-ERF

GER-HAC

GER-KMSc

GER-KMSd

GER-RDMe

GER-RDMf

GER-RHN

HUN-BUD

HUN-PEC

ICE-ICE

ISR-TEL

ITA-BRI

ITA-FRI

ITA-LAT

LTU-KAU

NEZ-AUC

POL-TAR

POL-WAR

RUS-MOC

RUS-MOIa

RUS-MOIb

RUS-NOCa

RUS-NOCb

RUS-NOI

SPA-CAT

SWE-GOT

SWE-NSW

SWI-TIC

SWI-VAF

UNK-BEL

UNK-GLA

USA-STA

YUG-NOS

8. References to publications

  1. Kuulasmaa K, Tolonen H, Ferrario M, Ruokokoski E for the WHO MONICA Project. Age, date of examination and survey periods in the MONICA surveys. (May 1998). Available from: URL: http://www.ktl.fi/publications/monica/age/ageqa.htm, URN:NBN:fi-fe19991075.
  2. WHO MONICA Project. MONICA Manual. Part III: Population Survey. Section 1: Population survey data component. (December 1997). Available from: URL: http://www.ktl.fi/publications/monica/manual/part3/iii-1.htm, URN:NBN:fi-fe19981151.
  3. Dobson A, Kuulasmaa K, Moltchanov V, Evans A, Fortmann SP, Jamrozik K, Sans S, Tuomilehto J for the WHO MONICA Project. Changes in cigarette smoking among adults in 35 populations in the mid 1980s. Tobacco Control 1998;7:14-21.

9. References to internal MONICA documents

  4. Härtel U, Kuulasmaa K, Schneider A, Koivisto AM for the WHO MONICA Project. Quality assessment of data on marital status and educational achievement in the first survey of the WHO MONICA Project. MONICA Memo 253A, January 1994.
  5. Molarius A, Kuulasmaa K, Moltchanov V for the WHO MONICA Project. Quality assessment of data on marital status and educational achievement in the second survey of the WHO MONICA Project. MONICA Memo 301A, May 1996.