Description and quality of baseline data:
1 Department of Health Promotion and Chronic Disease Prevention, National Public Health Institute, Helsinki, Finland
2 Since January 2007 at Indic Society for Education and Development (INSEED), Nashik, India
3 See Annex for the sites and key personnel of contributing MORGAM Centres
© National Institute for Health and Welfare and the MORGAM Project investigators

Last updated: 21 December 2007

For more information, please contact Kari Kuulasmaa (firstname.lastname@thl.fi)
MORGAM collected data on smoking behaviour in the baseline examination. They were transferred to the MDC using the Data transfer format - MONICA survey data (Form 20):
For data analyses, four derived variables have been defined by the time of preparation of this document:
The data items considered in the quality assessment are those related to the self-reported status and quantity of current smoking: CIGS, NUMCIGS, CIGARSM, CIGAR, PIPESM and PIPE, and the derived items DSMOKER and SMOKER. Assessment of the other items will be added later.
The individual data items are assessed by:
This approach is based on the principles adopted from the comprehensive approach described in Quality Assessment of Data on Smoking Behaviour in the WHO MONICA Project [1].
The Insufficient Data Score (IDS) and Questionnaire Comparability Score (QCS) defined in the Introduction are used. In addition, the following scores are defined specifically for the smoking items:
a) Data Extraction Score (DES), describing the procedure used to extract the MORGAM items from the local data:

| DES | Meaning |
|---|---|
| 2 | the local item matched the MORGAM item directly, or the local data was converted to match the MORGAM definition using all relevant information |
| 1 | the local data was converted without making use of all relevant information |
| 0 | there was an indication of a clear error in the conversion from the local data |

b) Item Summary Score (ISS), defined as ISS = min(QCS, DES, IDS).
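As a minimal sketch of the definition above, the ISS is simply the lowest of the three component scores, so a single weak component limits the summary score. The function name is hypothetical, and the sketch assumes that all three scores are integers on a scale where a higher value means better quality, as is the case for DES.

```python
def item_summary_score(qcs: int, des: int, ids: int) -> int:
    """Return ISS = min(QCS, DES, IDS).

    Assumption: each score is an integer where higher means better quality.
    """
    return min(qcs, des, ids)


# A cohort with full questionnaire comparability and sufficient data,
# but a conversion that ignored some relevant information (DES = 1).
iss_example = item_summary_score(qcs=2, des=1, ids=2)
```

Here `iss_example` equals 1, because the data extraction score is the limiting component.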
Hyperlinks to the distributions of the data items are under the respective data item names:
The availability of data for the smoking items in each cohort is summarized in Table SM.1. Distributions with percentages for most of the data items and derived variables are given in the respective assessment sections below. Data on current and past cigarette smoking (CIGS, NUMCIGS, EVERCIG) are available for all Cohorts reflecting the MORGAM inclusion criterion for cohorts. Data on the year of cessation of cigarette smoking (STOP) are available for all Cohorts, except FIN-ATB, which represents the placebo cohort of the ATBC study where smoking was an inclusion criterion. The supplementary data for the item STOP (IFLYEAR) are not collected in Cohorts 01, 02 and 21 of DEN-GLO and in Cohort 21 of RUS-NOV and in UNK-CAE. The data on the age when regular cigarette smoking first started (CIGAGE) are missing in Cohorts 01 and 02 of AUS-NEW, DEN-GLO, FIN-EAS/WES and ITA-BRI, in Cohort 01 of POL-TAR, POL-WAR, RUS-NOV and SWE-NSW and in UNK-CAE.
Most of the centres have collected data on cigar/cigarillo smoking (CIGAR, CIGARSM) and pipe smoking (PIPE, PIPESM). Exceptions are ITA-FRI, ITA-FSE, ITA-PAM, ITA-ROM and LTU-KAU (Cohorts 01 and 02).
The data on serum thiocyanate and cotinine (SCN, COTIN) or expired air carbon monoxide (CARBMON), indicators of current smoking, are available only in some cohorts.
The derived variables can be calculated for nearly all cohorts. The exceptions are obvious from the availability of the original items above.
The data items reflecting the person's current smoking status are CIGS, NUMCIGS, CIGARSM, CIGAR, PIPESM and PIPE. For the time being, two additional variables, on daily cigarette smoking (DSMOKER) and all current smoking (SMOKER), have been derived from these. The questions used locally by the MPCs and the details of the conversion rules from the local question(s) to the MORGAM data items are given in Appendix 2. The comparability of the local items with the MORGAM definition is assessed for deriving the QCS. The distributions in Tables SM.3, SM.4 and SM.5 together with the possible additional information on the conversion procedure were used for deriving the DES. In cases where the value of QCS or DES was lower than the maximum value, the explanation is given in the centre specific comments below. The quality scores are presented in Table SM.2.
Data on CIGS and NUMCIGS are available for all Cohorts (Table SM.3). All local questionnaires include an item on the daily quantity of smoking that is broadly similar to NUMCIGS. The local questionnaires vary more with respect to CIGS. In some local questionnaires the current smoking status item does not distinguish between cigarette smoking and other tobacco smoking. Some Centres have reported that the smoking question did not specify cigarettes because smoking of other tobacco products is very rare in the country. Another variant is that the local question does not distinguish daily smokers from occasional smokers. In such cases it is not clear whether occasional smokers have reported being smokers or non-smokers.
For some cohorts, where the type of tobacco product is not specified in the local question for current smoking status, CIGS has been successfully derived from the question of smoking any tobacco product and the question on the number of cigarettes smoked per day. Potentially problematic for the conversion are the cases where NUMCIGS is missing, leaving the cigarette smoking status unresolved. In this case the logical option would be to code CIGS as missing. However, in some cases where CIGS = 1 and NUMCIGS = 999, the persons in question can be identified as cigar and pipe smokers who have left the daily number of cigarettes consumed unspecified instead of defining it as zero.
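The conversion described above can be sketched as follows. This is an illustration only, not the rule used by any particular centre: the function name and the local yes/no question are hypothetical, and the CIGS codes 2 (non-smoker) and 9 (insufficient data) are assumptions. Only CIGS = 1 for a current cigarette smoker and NUMCIGS = 999 for a missing quantity are taken from the text.

```python
from typing import Optional

NUMCIGS_MISSING = 999   # missing-quantity code mentioned in the text
CIGS_SMOKER = 1         # current cigarette smoker (as in the text)
CIGS_NON_SMOKER = 2     # assumed code for "does not smoke cigarettes"
CIGS_MISSING = 9        # assumed code for insufficient data


def derive_cigs(smokes_any_tobacco: Optional[bool], numcigs: int) -> int:
    """Derive CIGS from a local 'any tobacco' question plus NUMCIGS."""
    if smokes_any_tobacco is None:
        return CIGS_MISSING
    if not smokes_any_tobacco:
        return CIGS_NON_SMOKER
    if numcigs == NUMCIGS_MISSING:
        # Tobacco smoker with unknown cigarette quantity: the cigarette
        # smoking status is unresolved, so code it as missing.
        return CIGS_MISSING
    # Tobacco smoker who reports zero cigarettes smokes only other products.
    return CIGS_SMOKER if numcigs > 0 else CIGS_NON_SMOKER
```

The third branch implements the "logical option" named in the text. A centre that can identify such persons as cigar or pipe smokers from other items could instead code them as non-smokers of cigarettes.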
There is much more missing data in NUMCIGS than in CIGS. This is mostly due to a high percentage of missing data among occasional smokers, or among pipe or cigar smokers in the cohorts where CIGS included smokers of all tobacco products (Table SM.3). The proportion of missing data in NUMCIGS among daily cigarette smokers is usually relatively low, and this is the group of persons for whom NUMCIGS is most relevant.
Table SM.2 shows the quality scores for CIGS and NUMCIGS. Explanations for lower QCS and DES scores are given in the Cohort specific comments below.
Known issues by Cohort:
The item CIGARSM was not included in the initial versions of the MONICA data transfer format. Therefore, in many of the initial MONICA cohorts non-cigar smokers are coded as CIGARSM = 9 and CIGAR = 0, whereas in the later cohorts they should be coded as CIGARSM = 2 and CIGAR = 888. If a cigar smoking status item similar to CIGARSM was not included in the later questionnaires, the item may have been derived from the item on the weekly quantity of cigars consumed. It is possible that, in the conversion, cigarette smokers who left the quantity of cigars smoked unspecified, instead of specifying it as zero, were also coded as non-cigar smokers. A large proportion of missing data in CIGARSM can indicate that this has not been done or that there is some other problem in the conversion. The distributions of CIGARSM and CIGAR can be found in Table SM.4. Because CIGARSM covers past as well as current cigar smoking, as many as four different questionnaire items may have been used to derive it: current tobacco smoking (of any kind), past tobacco smoking, and the current and past weekly quantities of cigars consumed. As various methods may have been applied in the conversion, data on CIGARSM may not be directly comparable for centres that did not have a questionnaire item similar to CIGARSM.
CIGAR is a more straightforward item than CIGARSM, as it involves only the current quantity of smoking. Its interpretation should also be the same regardless of whether the local questionnaire included a separate item on cigar smoking status. However, the interpretation is different for cohorts that used a data transfer format without the CIGARSM item. Some centres asked about the daily quantity of smoking instead of the weekly quantity. This is not a serious concern, but it should be noted that in such cases the distribution of CIGAR has values at intervals of seven cigars.
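The intervals of seven follow directly from the unit conversion. As a small sketch, assuming a centre recorded whole cigars per day and converted to the weekly quantity by multiplying by seven (the function name is hypothetical):

```python
def weekly_cigars_from_daily(cigars_per_day: int) -> int:
    """Convert a daily cigar count to the weekly quantity used by CIGAR."""
    return 7 * cigars_per_day


# Daily counts of 0..4 map onto weekly values that are multiples of seven.
weekly_values = [weekly_cigars_from_daily(d) for d in range(5)]
# weekly_values == [0, 7, 14, 21, 28]
```

Intermediate weekly values such as 3 or 10 cannot occur in such cohorts, which explains the gaps in the distribution.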
Known issues by Cohort:
Everything said above about CIGARSM and CIGAR also applies to PIPESM and PIPE unless otherwise specified below. In addition, many centres asked about the number of pipes smoked rather than grams of pipe tobacco. This can be approximately converted into grams. The distributions of PIPESM and PIPE can be found in Table SM.5.
Known issues by Cohort (where different from cigar smoking items):
Two items describing current smoking are derived for analysis purposes. Their distributions can be seen in Table SM.6. DSMOKER is derived from the item CIGS simply by classifying occasional smokers as non-smokers. For the assessment of DSMOKER we therefore refer to the assessment of CIGS above. In cohorts where the local questionnaire did not separate cigarette smoking from other tobacco smoking, some people may have been incorrectly coded as cigarette smokers. If the separation of cigarette smokers and other tobacco smokers is not important, this is not a serious problem.
A wider definition of current smoking is used in the item SMOKER, which also includes occasional cigarette smokers and cigar and pipe smokers. SMOKER is less sensitive to potential problems in the derivation of the original data items, because regular and occasional smoking as well as the different kinds of tobacco smoking are pooled together, and the item is coded missing only when all smoking status items (CIGS, CIGARSM and PIPESM) are missing. Overall, the proportion of missing data in DSMOKER and SMOKER is very low. However, in the cases where daily and occasional smoking were not separated in the local questionnaire, it remains uncertain whether occasional smokers were regarded as smokers or non-smokers in the interview.
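The two derivations can be sketched as below. This is an illustration under stated assumptions rather than the MORGAM derivation code: the function names are hypothetical, and the category codes (CIGS: 1 = regular, 2 = no, 3 = occasional, 9 = missing; CIGARSM and PIPESM: 1 = current, 9 = missing) are assumed for the example. The results are returned as `True`, `False` or `None` (missing) instead of numeric codes.

```python
from typing import Optional

# Assumed category codes, used only for this illustration.
CIGS_REGULAR, CIGS_OCCASIONAL = 1, 3
STATUS_CURRENT = 1   # assumed "current smoker" code for CIGARSM and PIPESM
MISSING = 9          # assumed "insufficient data" code for all three items


def derive_dsmoker(cigs: int) -> Optional[bool]:
    """Daily cigarette smoker; occasional smokers count as non-smokers."""
    if cigs == MISSING:
        return None
    return cigs == CIGS_REGULAR


def derive_smoker(cigs: int, cigarsm: int, pipesm: int) -> Optional[bool]:
    """Any current smoking: cigarettes (regular or occasional), cigars, pipe."""
    if (cigs in (CIGS_REGULAR, CIGS_OCCASIONAL)
            or cigarsm == STATUS_CURRENT
            or pipesm == STATUS_CURRENT):
        return True
    # Missing only when all three smoking status items are missing.
    if cigs == MISSING and cigarsm == MISSING and pipesm == MISSING:
        return None
    return False
```

For example, an occasional cigarette smoker who smokes neither cigars nor a pipe is a non-smoker under `derive_dsmoker` but a smoker under `derive_smoker`, which is the difference between the two definitions.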
In most cohorts the local questionnaire matched the MORGAM definition exactly. Most of the observed quality problems in the current cigarette smoking items concern cohorts in which the local questionnaire did not separate cigarette smoking status from other types of tobacco smoking. In these cases the cigarette smoking status has been derived using information on the daily number of cigarettes smoked, with varying success. The possible problems in this conversion are reflected in the cohort specific data extraction score. It should be noted that DES mostly reflects observed problems in the logic of the conversion. Whether the resulting problems in the data are significant from the analysis point of view depends on the number of individuals affected and should be evaluated separately for each analysis. Because the number of pipe or cigar smokers is in most cases small, the number of possibly incorrectly coded cigarette smokers is also expected to be small and therefore does not pose a significant problem for most analyses. In some cohorts occasional smoking status was not separated from daily smoking. In these cases, too, the number of misclassified subjects is likely to be small. However, the results of the quality assessment of the current smoking items raise the question of whether it is necessary to try to exclude cigar and pipe smokers and occasional cigarette smokers from the smoking status item used in the analysis. The two currently derived variables indicating current smoking status reflect two possible extremes: DSMOKER includes only daily cigarette smokers (and is therefore more susceptible to possible misclassification in the smoking items), while SMOKER includes all smokers. It is recommended to try both of these in the analysis to see how much the results differ.
| Date | Update |
|---|---|
| 2007-12-21 | First published version. |