Data transfer format: Case and subcohort selection data

  • Form: 65
  • Version: 2
  • Date: 2007-05-21

© National Institute for Health and Welfare and the MORGAM Project investigators
Last updated: 18 November 2008
For more information, please contact Zygimantas Cepaitis (firstname.lastname@thl.fi) or Kari Kuulasmaa (firstname.lastname@thl.fi)

The purpose of this form is to provide a format for transferring data from the selection of cases and subcohort to the MORGAM database. Data in the format specified here are provided for every member of each MORGAM cohort for which the selection of cases and the subcohort has been done. The MORGAM case-cohort design is described in sections "Selection of cases and cohort subsample" and "Subsample selection after extension of follow-up" of the Manual.

This format should not be used for transferring data from the MORGAM Participating Centres to the MORGAM Data Centre (MDC), because these data are generated by the MDC.

Format specification

ITEM NAME SPECIFICATION AND CODES FORMAT OR VALUE

Form identification:

FORM Form identification I2 |6|5|
VERSN Form version I1 |1|

General information on selection - these items have the same value for all records in this transfer set:

SELECTION Case and subcohort selection id number C3 |_|_|_|
PHASE Phase of selection of cases and subsample in the cohort
01 = first
02 = second
etc.
C2 |_|_|
DATE Date when the selection of cases and cohort subsample was done (date ANSI) C8 |_|_|_|_||_|_||_|_|

Key 1:

CENTRE MORGAM Participating Centre C2 |_|_|
RUNIT MORGAM Reporting Unit C2 |_|_|
COHORT Cohort identification within the RUNIT
01 = MONICA baseline survey
02 = MONICA middle survey
03 = MONICA final survey
21, 22, ... other cohorts
C2 |_|_|
SERIAL Serial number C6 |_|_|_|_|_|_|

Case and subcohort selection data:

ELIGSC Eligibility of the person to the subcohort
1 = eligible
2 = ineligible
I1 |_|
PROB Selection probability of the person to the cohort subsample
(Sampling weights are derived from Item PROB and the case status, which depends on the definition of the end-point for each analysis.)
8 if ELIGSC = 2
R  
SUBCOH Was the person selected to the subcohort?
1 = yes
2 = no
8 if ELIGSC = 2
I1 |_|
CASEDTH Was the person selected to the case-cohort set because of death?
1 = yes
2 = no, because the general selection criteria are not met
3 = no, because of a separate decision, although the selection criteria were met
I1 |_|
CASECHD Was the person selected to the case-cohort set because of CHD event during follow-up?
1 = yes
2 = no, because the general selection criteria are not met
3 = no, because of a separate decision, although the selection criteria were met
I1 |_|
CASESTR Was the person selected to the case-cohort set because of stroke during follow-up?
1 = yes
2 = no, because the general selection criteria are not met
3 = no, because of a separate decision, although the selection criteria were met
I1 |_|
CASETED Was the person selected to the case-cohort set because of a thrombo-embolic event during follow-up?
1 = yes
2 = no, because the general selection criteria are not met
3 = no, because of a separate decision, although the selection criteria were met
I1 |_|
CASEAP Was the person selected to the case-cohort set because of angina pectoris during follow-up?
1 = yes
2 = no, because the general selection criteria are not met
3 = no, because of a separate decision, although the selection criteria were met
I1 |_|
CASEBCHD Was the person selected to the case-cohort set because of CHD at baseline?
1 = yes
2 = no, because the general selection criteria are not met
3 = no, because of a separate decision, although the selection criteria were met
I1 |_|
CASEBSTR Was the person selected to the case-cohort set because of stroke at baseline?
1 = yes
2 = no, because the general selection criteria are not met
3 = no, because of a separate decision, although the selection criteria were met
I1 |_|
GENGROUP Genotyping group
1 = subcohort member or case during the follow-up who was healthy at baseline
2 = baseline case not in the subcohort
8 = not selected to the case-cohort set
9 = not used
I1 |_|

 


Columns of the format specification

ITEM NAME
name used for the item in the MDC.
SPECIFICATION AND CODES
specification and values of the variable. More details can be found in the section "Specific comments on each item" below.
FORMAT OR VALUE
specifies the format in which the value should be presented in the transfer data set. In the cases where the value of the variable is fixed, the value is also given in this column.
FORMAT  Type         Format  Example     Comments
C       Character    C7      SWE-NSWa    RUA abbreviation used in MORGAM.
                     C2      03          Cohort identification
F       Float        F5.2    13.1        Variable includes a decimal point (.).
R       Real number  R       0.24927345  Selection probability.
I       Integer      I5      221
                             10323
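The format codes in the table above can be checked mechanically. The following is a minimal sketch in Python, not part of the MORGAM specification: the function name `matches_format` is hypothetical, and it assumes that the number after C, I or F is a maximum field width (as the I5 examples 221 and 10323 suggest) and that the second number in an F code is the maximum number of decimals.

```python
import re


def matches_format(code: str, value: str) -> bool:
    """Return True if the text `value` fits the format code `code`.

    Assumption (not stated in the specification): the width in C, I and F
    codes is treated as a maximum length, and empty values are rejected.
    """
    if code == "R":
        # Real number with no stated width: accept anything float() can parse.
        try:
            float(value)
        except ValueError:
            return False
        return True

    m = re.fullmatch(r"([CIF])(\d+)(?:\.(\d+))?", code)
    if m is None:
        raise ValueError(f"unrecognised format code: {code!r}")

    kind, width = m.group(1), int(m.group(2))
    if value == "" or len(value) > width:
        return False
    if kind == "C":
        return True
    if kind == "I":
        return re.fullmatch(r"-?\d+", value) is not None
    # kind == "F": digits, a decimal point, then at most `decimals` digits.
    decimals = int(m.group(3) or 0)
    return re.fullmatch(rf"-?\d+\.\d{{0,{decimals}}}", value) is not None
```

For example, `matches_format("I1", "12")` is false because the value is wider than one character, while `matches_format("F5.2", "13.1")` is true.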
 

Data transfer procedures

The data files shall be prepared as delimited ASCII text, using a semicolon (;) as the delimiter, with the names of the variables in the first row.
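As an illustration of this layout, the sketch below writes one record with Python's standard `csv` module. The function name `write_transfer_file`, the column order (taken from the order of items in the format specification) and all values in the example record are assumptions made for illustration only.

```python
import csv
import io

# Item names in the order they appear in the format specification.
FIELDS = [
    "FORM", "VERSN", "SELECTION", "PHASE", "DATE",
    "CENTRE", "RUNIT", "COHORT", "SERIAL",
    "ELIGSC", "PROB", "SUBCOH",
    "CASEDTH", "CASECHD", "CASESTR", "CASETED", "CASEAP",
    "CASEBCHD", "CASEBSTR", "GENGROUP",
]


def write_transfer_file(records, stream):
    """Write records (dicts keyed by item name) as semicolon-delimited text."""
    writer = csv.DictWriter(
        stream, fieldnames=FIELDS, delimiter=";", lineterminator="\n"
    )
    writer.writeheader()        # variable names in the first row
    writer.writerows(records)


# Made-up record, for illustration only.
example = {
    "FORM": "65", "VERSN": "1", "SELECTION": "001", "PHASE": "01",
    "DATE": "20070521", "CENTRE": "01", "RUNIT": "01", "COHORT": "01",
    "SERIAL": "000001", "ELIGSC": "1", "PROB": "0.24927345", "SUBCOH": "1",
    "CASEDTH": "2", "CASECHD": "2", "CASESTR": "2", "CASETED": "2",
    "CASEAP": "2", "CASEBCHD": "2", "CASEBSTR": "2", "GENGROUP": "1",
}

buffer = io.StringIO()
write_transfer_file([example], buffer)
```

Passing an open text file instead of `io.StringIO()` would write the same content to disk. Values are kept as strings so that leading zeros in items such as SELECTION and COHORT survive.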

Specific comments on each item

Follow these instructions carefully when creating the computer file for data transfer. Any exceptions to these coding rules must be documented in the Case and subcohort selection report on the internal MORGAM web site.

FORM Form identification I2 |6|5|

Number 65 indicates the "Data transfer format: Case and subcohort selection data".

VERSN Version of this form I1 |1|

This indicates the version number of this data transfer format entitled "Data transfer format: Case and subcohort selection data".

SELECTION Case and subcohort selection id number C3 |_|_|_|

This is a unique sequence number which identifies the selection of cases and subcohorts of this transfer set. The first ever selection has value 001, the second 002, etc. Gaps are not allowed in the numbering. For each selection, a report with the same SELECTION number has to be prepared on the internal MORGAM web site.
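The numbering rule can be sketched as follows. The helper names `selection_id` and `missing_selection_ids` are hypothetical and only illustrate the three-character, zero-padded, gap-free sequence described above.

```python
def selection_id(n: int) -> str:
    """Format the n-th selection as a three-character id, e.g. 1 -> '001'."""
    if not 1 <= n <= 999:
        raise ValueError("SELECTION must fit in three digits (001-999)")
    return f"{n:03d}"


def missing_selection_ids(ids):
    """Return the ids that are skipped in a collection of SELECTION values."""
    numbers = {int(s) for s in ids}
    if not numbers:
        return []
    return [
        selection_id(n)
        for n in range(1, max(numbers) + 1)
        if n not in numbers
    ]
```

An empty result from `missing_selection_ids` means the sequence used so far has no gaps.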

Item SELECTION must have the same value for every record in this data transfer set.

PHASE Phase of selection of cases and subsample in the cohort
01 = first
02 = second
etc.
C2 |_|_|

The selection of cases and subcohort can be supplemented later, for example due to extension of the follow-up period of a MORGAM cohort.

Each time there are changes to the case-cohort selection for a RUNIT/COHORT combination, the PHASE value is increased to the next unused value within the cohorts being considered. Item PHASE must have the same value for every record in this data transfer set. (If, for any reason, the highest earlier PHASE value varied between the cohorts considered, there will be gaps in the sequence of PHASE values for some cohorts. This does not matter.)
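One way to read "the next unused value within the cohorts being considered" is as one more than the highest PHASE used so far in any of those cohorts. The sketch below follows that reading; the function name `next_phase` and the dictionary layout keyed by (RUNIT, COHORT) are assumptions for illustration.

```python
def next_phase(highest_phase_by_cohort: dict) -> str:
    """Return the PHASE value for a new selection as a two-character code.

    `highest_phase_by_cohort` maps (RUNIT, COHORT) to the highest PHASE
    already used for that cohort, e.g. {("01", "01"): "02"}.
    Cohorts with no earlier selection can simply be left out.
    """
    highest = max((int(p) for p in highest_phase_by_cohort.values()), default=0)
    phase = highest + 1
    if phase > 99:
        raise ValueError("PHASE must fit in two digits")
    return f"{phase:02d}"
```

With earlier highest values of 02 for one cohort and 01 for another, the new PHASE is 03 for both, so the second cohort skips 02. This is the kind of gap the text describes as harmless.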

DATE Date when the selection of cases and cohort subsample was done (date ANSI) C8 |_|_|_|_||_|_||_|_|

The first four numbers indicate the year, the next two the month and the last two the day of month.
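A small sketch of how an eight-character DATE value could be checked and converted in Python. The function name `parse_transfer_date` is hypothetical.

```python
from datetime import date, datetime


def parse_transfer_date(text: str) -> date:
    """Parse a DATE item given as YYYYMMDD, e.g. '20070521'."""
    if len(text) != 8 or not (text.isascii() and text.isdigit()):
        raise ValueError(f"DATE must be eight digits (YYYYMMDD): {text!r}")
    # strptime raises ValueError for impossible dates such as month 13.
    return datetime.strptime(text, "%Y%m%d").date()
```

Here `parse_transfer_date("20070521")` gives 21 May 2007, and a value such as `"20071321"` is rejected because month 13 does not exist.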

Item DATE must have the same value for every record in this data transfer set.
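Since SELECTION, PHASE and DATE must each have a single value across the transfer set, a simple check before transfer might look like the sketch below. The function name `varying_constant_items` and the representation of records as dictionaries keyed by item name are assumptions.

```python
def varying_constant_items(records, items=("SELECTION", "PHASE", "DATE")):
    """Return {item: sorted distinct values} for items that are not constant.

    An empty dict means every listed item has the same value in all records.
    """
    problems = {}
    for item in items:
        values = {rec[item] for rec in records}
        if len(values) > 1:
            problems[item] = sorted(values)
    return problems
```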

CENTRE MORGAM Participating Centre C2 |_|_|
RUNIT MORGAM Reporting Unit C2 |_|_|
COHORT Cohort identification within the RUNIT
01 = MONICA baseline survey
02 = MONICA middle survey
03 = MONICA final survey
21, 22, ... other cohorts
C2 |_|_|
SERIAL Serial number I6 |_|_|_|_|_|_|

These are key items used for identifying the record and merging it with other records of the same individual.

CENTRE is the official MORGAM Participating Centre code number, RUNIT the official MORGAM Reporting Unit code number and COHORT the official MORGAM Cohort code number as they appear in section "MORGAM Participating Centres and cohorts" of the MORGAM Manual.

SERIAL is the identification for the individual. It is unique within the combination of CENTRE, RUNIT and COHORT.
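Because SERIAL is unique only within a CENTRE, RUNIT and COHORT combination, records are identified by all four items together. The following sketch builds that composite key and lists any key that occurs more than once. The names `record_key` and `duplicate_keys` are hypothetical.

```python
from collections import Counter

KEY_ITEMS = ("CENTRE", "RUNIT", "COHORT", "SERIAL")


def record_key(rec):
    """Composite key identifying one individual's record."""
    return tuple(rec[item] for item in KEY_ITEMS)


def duplicate_keys(records):
    """Return the composite keys that appear in more than one record."""
    counts = Counter(record_key(rec) for rec in records)
    return [key for key, n in counts.items() if n > 1]
```

The same tuple can serve as the join key when merging these records with other records of the same individual.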

ELIGSC Eligibility of the person to the subcohort
1 = eligible
2 = ineligible
I1 |_|

The eligibility of the person to the subcohort requires that

PROB Selection probability of the person to the cohort subsample
8 if ELIGSC = 2
R  

This is the selection probability for the individual to the subcohort, i.e. the marginal probability that the person is in the cohort subsample. The sampling weights for data analysis in the case-cohort design are derived from item PROB and the case status, which depends on the definition of the end-point for each analysis.

SUBCOH Was the person selected to the subcohort?
1 = yes
2 = no
8 if ELIGSC = 2
I1 |_|

This item indicates whether the individual belongs to the subcohort or not.
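The coding rules for ELIGSC, PROB and SUBCOH depend on each other: both PROB and SUBCOH take the value 8 when ELIGSC = 2. A minimal consistency check is sketched below. The function name `eligibility_problems` is hypothetical, and the requirement that PROB lies between 0 and 1 for eligible persons is an added assumption based only on PROB being a probability.

```python
def eligibility_problems(rec):
    """Return a list of messages describing inconsistent ELIGSC/PROB/SUBCOH."""
    problems = []
    eligsc = int(rec["ELIGSC"])
    prob = float(rec["PROB"])
    subcoh = int(rec["SUBCOH"])

    if eligsc == 2:
        # Ineligible persons carry the code 8 in both items.
        if prob != 8:
            problems.append("PROB must be 8 when ELIGSC = 2")
        if subcoh != 8:
            problems.append("SUBCOH must be 8 when ELIGSC = 2")
    elif eligsc == 1:
        # Assumption: a selection probability lies in the range 0 to 1.
        if not 0 <= prob <= 1:
            problems.append("PROB must be a probability when ELIGSC = 1")
        if subcoh not in (1, 2):
            problems.append("SUBCOH must be 1 or 2 when ELIGSC = 1")
    else:
        problems.append("ELIGSC must be 1 or 2")
    return problems
```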

CASEDTH Was the person selected to the case-cohort set because of death?
1 = yes
2 = no, because the general selection criteria are not met
3 = no, because of a separate decision, although the selection criteria were met
I1 |_|

The general selection criteria are:

Code 1 if the general selection criteria were met and the person was selected because of these.

Code 2 if the general selection criteria were not met.

Code 3 if the general selection criteria were met but the person was not selected because of these. The decision for not selecting the person must be described in the Case and subcohort selection report.

CASECHD Was the person selected to the case-cohort set because of CHD event during follow-up?
1 = yes
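2 = no, because the general selection criteria are not met
3 = no, because of a separate decision, although the selection criteria were met
I1 |_|

The general selection criteria are: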

Code 1 if the general selection criteria were met and the person was selected because of these.

Code 2 if the general selection criteria were not met.

Code 3 if the general selection criteria were met but the person was not selected because of these. The decision for not selecting the person must be described in the Case and subcohort selection report.

CASESTR Was the person selected to the case-cohort set because of stroke during follow-up?
1 = yes
2 = no, because the general selection criteria are not met
3 = no, because of a separate decision, although the selection criteria were met
I1 |_|

The general selection criteria are:

Code 1 if the general selection criteria were met and the person was selected because of these.

Code 2 if the general selection criteria were not met.

Code 3 if the general selection criteria were met but the person was not selected because of these. The decision for not selecting the person must be described in the Case and subcohort selection report.

CASETED Was the person selected to the case-cohort set because of a thrombo-embolic event during follow-up?
1 = yes
2 = no, because the general selection criteria are not met
3 = no, because of a separate decision, although the selection criteria were met
I1 |_|

The general selection criteria are:

Code 1 if the general selection criteria were met and the person was selected because of these.

Code 2 if the general selection criteria were not met.

Code 3 if the general selection criteria were met but the person was not selected because of these. The decision for not selecting the person must be described in the Case and subcohort selection report.

CASEAP Was the person selected to the case-cohort set because of angina pectoris during follow-up?
1 = yes
2 = no, because the general selection criteria are not met
3 = no, because of a separate decision, although the selection criteria were met
I1 |_|

The general selection criteria are:

Code 1 if the general selection criteria were met and the person was selected because of these.

Code 2 if the general selection criteria were not met.

Code 3 if the general selection criteria were met but the person was not selected because of these. The decision for not selecting the person must be described in the Case and subcohort selection report.

CASEBCHD Was the person selected to the case-cohort set because of CHD at baseline?
1 = yes
2 = no, because the general selection criteria are not met
3 = no, because of a separate decision, although the selection criteria were met
I1 |_|

The general selection criteria are:

Code 1 if the general selection criteria were met and the person was selected because of these.

Code 2 if the general selection criteria were not met.

Code 3 if the general selection criteria were met but the person was not selected because of these. The decision for not selecting the person must be described in the Case and subcohort selection report.

CASEBSTR Was the person selected to the case-cohort set because of stroke at baseline?
1 = yes
2 = no, because the general selection criteria are not met
3 = no, because of a separate decision, although the selection criteria were met
I1 |_|

The general selection criteria are:

Code 1 if the general selection criteria were met and the person was selected because of these.

Code 2 if the general selection criteria were not met.

Code 3 if the general selection criteria were met but the person was not selected because of these. The decision for not selecting the person must be described in the Case and subcohort selection report.
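The seven case-selection items share the same coding (1, 2 or 3 in the format specification), so they can be checked together. The sketch below flags values outside that range and lists the items coded 3, since each of those decisions must be described in the Case and subcohort selection report. The function names are hypothetical.

```python
CASE_ITEMS = (
    "CASEDTH", "CASECHD", "CASESTR", "CASETED",
    "CASEAP", "CASEBCHD", "CASEBSTR",
)
VALID_CASE_CODES = {"1", "2", "3"}


def invalid_case_items(rec):
    """Return the case items whose value is missing or not 1, 2 or 3."""
    return [
        item for item in CASE_ITEMS
        if str(rec.get(item, "")).strip() not in VALID_CASE_CODES
    ]


def items_needing_report_entry(rec):
    """Return the case items coded 3 (criteria met, but not selected)."""
    return [
        item for item in CASE_ITEMS
        if str(rec.get(item, "")).strip() == "3"
    ]
```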

 

GENGROUP Genotyping group
1 = subcohort member or case during the follow-up who was healthy at baseline
2 = baseline case who is not in the subcohort
8 = not selected to the case-cohort set
9 = not used
I1 |_|

Updates

Date Update
21 May 2007 Item GENGROUP was added