CDISC data standards

Under construction

CDISC stands for Clinical Data Interchange Standards Consortium. CDISC provides several interrelated standards, each addressing a specific stage of the data lifecycle. Each standard has an Implementation Guide (IG) that shows how to apply it in practice, e.g., SDTMIG for SDTM. Every implementation guide is versioned, so it is important to keep track of which version a study uses.

CDISC also provides TAUGs (Therapeutic Area User Guides). A TAUG is a document providing guidance on applying CDISC standards to a specific therapeutic area (oncology, diabetes, CNS, etc.).

Figure 1

CDASH

CDASH (Clinical Data Acquisition Standards Harmonization) standardizes how data are entered into case report forms (CRFs) at clinical sites.

SDTM

SDTM (Study Data Tabulation Model) organizes collected data (for example, data captured according to CDASH) into submission-ready datasets. It defines domains/datasets (e.g., exposure and laboratory values) and relies on controlled terminology (standardized lists of allowed values for many variables).

General Observation Classes

General Observation Classes are high-level classifications within SDTM that group domains based on the type of data they contain.

There are three general observation classes: Interventions, Events, and Findings. A fourth group, Special Purpose, holds domains that do not fit the general classes (e.g., Demographics).

Each observation class contains several domains, and each domain is organized around a particular topic. A domain is usually a single dataset; for example, EX is a dataset containing exposure (dosing) data. However, a domain can also be split into multiple datasets. A typical example is the Laboratory Test Results (LB) domain, which is often split because of its size.
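As a rough illustration of splitting one domain into several datasets, the sketch below groups LB records by their category variable. It assumes records are plain dictionaries and that the split is made on LBCAT; the helper name `split_domain` and all sample values are hypothetical. Note that the DOMAIN value stays LB in every piece.

```python
from collections import defaultdict

def split_domain(records, by):
    """Group the records of one domain into separate datasets keyed by `by`."""
    datasets = defaultdict(list)
    for rec in records:
        datasets[rec[by]].append(rec)
    return dict(datasets)

# Illustrative LB records (values are made up for the example).
lb = [
    {"DOMAIN": "LB", "USUBJID": "S1-001", "LBSEQ": 1, "LBCAT": "CHEMISTRY", "LBTESTCD": "ALT"},
    {"DOMAIN": "LB", "USUBJID": "S1-001", "LBSEQ": 2, "LBCAT": "HEMATOLOGY", "LBTESTCD": "HGB"},
    {"DOMAIN": "LB", "USUBJID": "S1-001", "LBSEQ": 3, "LBCAT": "CHEMISTRY", "LBTESTCD": "AST"},
]

parts = split_domain(lb, by="LBCAT")
for category, rows in parts.items():
    # Each piece would be stored as its own dataset; DOMAIN is unchanged.
    print(category, len(rows))
```

Splitting on a category variable is only one possible choice; the point is that the pieces together still form a single domain.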

Table 1: General observation classes in SDTM.
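| Observation Class | Domain Name | Domain Abbreviation |
|---|---|---|
| Interventions | Procedure agents | AG |
| Interventions | Concomitant/prior medications | CM |
| Interventions | Exposure | EX |
| Interventions | Exposure as collected | EC |
| Interventions | Meal data | ML |
| Interventions | Procedures | PR |
| Interventions | Substance use | SU |
| Events | Adverse events | AE |
| Events | Biospecimen events | BE |
| Events | Clinical events | CE |
| Events | Disposition | DS |
| Events | Healthcare encounters | HO |
| Events | Medical history | MH |
| Events | Protocol deviations | DV |
| Findings | Product/drug accountability | DA |
| Findings | Death details | DD |
| Findings | ECG test results | EG |
| Findings | Inclusion/Exclusion criteria not met | IE |
| Findings | Biospecimen findings | BS |
| Findings | Cell phenotype findings | CP |
| Findings | Genomics findings | GF |
| Findings | Immunogenicity specimen assessments | IS |
| Findings | Laboratory test results | LB |
| Findings | Microbiology specimen | MB |
| Findings | Microbiology susceptibility | MS |
| Findings | Microscopic findings | MI |
| Findings | PK concentrations | PC |
| Findings | PK parameters | PP |
| Findings | Morphology | MO |
| Findings | Cardiovascular system findings | CV |
| Findings | Musculoskeletal system findings | MK |
| Findings | Nervous system findings | NV |
| Findings | Ophthalmic examinations | OE |
| Findings | Reproductive system findings | RP |
| Findings | Respiratory system findings | RE |
| Findings | Urinary system findings | UR |
| Findings | Physical examination | PE |
| Findings | Functional tests | FT |
| Findings | Questionnaires | QS |
| Findings | Disease response and clinical classification | RS |
| Findings | Subject characteristics | SC |
| Findings | Subject status | SS |
| Findings | Tumor/lesion identification | TU |
| Findings | Tumor/lesion results | TR |
| Findings | Vital signs | VS |
| Findings | Findings about events or interventions | FA |
| Findings | Skin response | SR |
| Special Purpose | Comments | CO |
| Special Purpose | Demographics | DM |
| Special Purpose | Subject elements | SE |
| Special Purpose | Subject disease milestones | SM |
| Special Purpose | Subject visits | SV |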

SDTM variables

Within each domain, every variable is classified as Required, Expected, or Permissible. Each variable also has a role: Identifier, Topic, Qualifier, or Timing.
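A minimal sketch of how these three designations could be checked, assuming the usual reading: a Required variable must be present and non-null, an Expected variable must be present but may be null, and a Permissible variable is optional. The function name `check_core`, the short codes "Req"/"Exp"/"Perm", and the small specification below are illustrative assumptions, not a complete domain specification.

```python
def check_core(records, core):
    """Return a list of problems found when checking records against core designations.

    `core` maps variable name -> "Req", "Exp", or "Perm".
    """
    problems = []
    for i, rec in enumerate(records):
        for var, designation in core.items():
            if designation in ("Req", "Exp") and var not in rec:
                problems.append(f"record {i}: {var} ({designation}) is missing")
            elif designation == "Req" and rec[var] in (None, ""):
                problems.append(f"record {i}: {var} (Req) is null")
    return problems

# Hypothetical, heavily abbreviated specification for a vital signs dataset.
core = {"STUDYID": "Req", "USUBJID": "Req", "VSORRES": "Exp", "VSPOS": "Perm"}

records = [
    {"STUDYID": "S1", "USUBJID": "S1-001", "VSORRES": "120"},  # fine
    {"STUDYID": "S1", "USUBJID": "", "VSORRES": None},         # Req variable is null
    {"STUDYID": "S1", "USUBJID": "S1-003"},                    # Exp variable is missing
]

for problem in check_core(records, core):
    print(problem)
```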

Identifier

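Identifier variables identify the study, the domain, the subject, and the sequence number of the record.

  • STUDYID
  • DOMAIN
  • USUBJID
  • --SEQ (“--” stands for the two-character domain code, e.g., AESEQ)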

Topic

Topic variables contain the focus of the observation.

  • --TRT (Interventions)
  • --TERM (Events)
  • --TESTCD (Findings)
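The leading placeholder in names such as --TESTCD stands for the two-character domain code. A small sketch of that substitution follows; the helper name `expand` is hypothetical.

```python
def expand(fragment, domain):
    """Replace the leading "--" placeholder with a two-character domain code."""
    if len(domain) != 2:
        raise ValueError("domain code must be two characters")
    if not fragment.startswith("--"):
        return fragment  # already a full name, e.g. USUBJID
    return domain + fragment[2:]

print(expand("--TESTCD", "LB"))  # LBTESTCD
print(expand("--TERM", "AE"))    # AETERM
print(expand("--TRT", "EX"))     # EXTRT
print(expand("--SEQ", "VS"))     # VSSEQ
```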

Qualifier

Qualifiers provide additional information about the topic variables.

  • Grouping
  • Result
  • Synonym
  • Record
  • Variable

Timing

Timing variables describe the timing of the observation.

  • Date/times
  • Study days
  • Durations
  • Intervals
  • Visits
  • Time points
  • Relative times
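To make the four roles concrete, the sketch below groups the variables of a single vital signs record by role. The record values are invented, the helper `by_role` is hypothetical, and the role assignments are given for illustration under the assumption that they match the SDTMIG version in use.

```python
# Role of each variable in the illustrative record below.
ROLES = {
    "STUDYID": "Identifier",
    "DOMAIN": "Identifier",
    "USUBJID": "Identifier",
    "VSSEQ": "Identifier",
    "VSTESTCD": "Topic",
    "VSORRES": "Qualifier",   # result qualifier
    "VSORRESU": "Qualifier",  # variable qualifier (unit of the result)
    "VISITNUM": "Timing",
    "VSDTC": "Timing",
}

# One made-up vital signs observation.
record = {
    "STUDYID": "S1",
    "DOMAIN": "VS",
    "USUBJID": "S1-001",
    "VSSEQ": 1,
    "VSTESTCD": "SYSBP",
    "VSORRES": "128",
    "VSORRESU": "mmHg",
    "VISITNUM": 1,
    "VSDTC": "2024-03-10",
}

def by_role(rec, roles):
    """Group a record's variables by their role."""
    grouped = {}
    for var, value in rec.items():
        grouped.setdefault(roles[var], {})[var] = value
    return grouped

for role, variables in by_role(record, ROLES).items():
    print(role, variables)
```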

ADaM

ADaM (Analysis Data Model) takes SDTM data and transforms it into datasets suitable for statistical analysis. It contains the derived datasets used to produce tables, listings, and figures. Like SDTM, it is organized into datasets, each following a defined structure.

https://www.cdisc.org/system/files/members/standard/foundational/ADaMIG_v1.3.pdf

ADaM Standard data structures

  • ADSL: Subject-level analysis dataset
    • One record per subject
  • BDS: Basic data structure
    • One or more records per subject, per analysis parameter, per analysis timepoint
  • OCCDS: Occurrence data structure
    • Designed for counting occurrences
  • ADaM Other: any analysis dataset that does not follow one of the structures above
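The difference between the ADSL and BDS record structures can be sketched with a simple key check. The datasets below are invented, and `duplicate_keys` is a hypothetical helper; PARAMCD, AVISIT, and AVAL are used here as typical BDS variables.

```python
from collections import Counter

def duplicate_keys(records, keys):
    """Return key combinations that occur more than once in `records`."""
    counts = Counter(tuple(rec[k] for k in keys) for rec in records)
    return [combo for combo, n in counts.items() if n > 1]

adsl = [
    {"USUBJID": "S1-001", "AGE": 54},
    {"USUBJID": "S1-002", "AGE": 61},
]
advs = [
    {"USUBJID": "S1-001", "PARAMCD": "SYSBP", "AVISIT": "Baseline", "AVAL": 128},
    {"USUBJID": "S1-001", "PARAMCD": "SYSBP", "AVISIT": "Week 4", "AVAL": 122},
    {"USUBJID": "S1-001", "PARAMCD": "DIABP", "AVISIT": "Baseline", "AVAL": 82},
]

# ADSL: one record per subject, so no USUBJID may repeat.
print(duplicate_keys(adsl, ["USUBJID"]))  # []

# BDS: the same subject appears on several records ...
print(duplicate_keys(advs, ["USUBJID"]))  # [('S1-001',)]

# ... which are told apart by parameter and timepoint in this example.
print(duplicate_keys(advs, ["USUBJID", "PARAMCD", "AVISIT"]))  # []
```

Because BDS allows more than one record per subject/parameter/timepoint, the last check is informative rather than a rule every BDS dataset must pass.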

ADaM variable conventions

General variable conventions

All ADaM variable names must be no more than 8 characters in length, start with a letter (not underscore), and be composed only of letters (A–Z), underscore ( _ ), and numerals (0–9).
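The naming rule above translates directly into a regular expression. This is a minimal sketch; the function name is hypothetical, and treating only uppercase letters as valid is an assumption made here for simplicity.

```python
import re

# A letter first, then up to 7 more letters, digits, or underscores
# (8 characters in total). Uppercase only is an assumption of this sketch.
ADAM_NAME = re.compile(r"[A-Z][A-Z0-9_]{0,7}")

def is_valid_adam_name(name):
    """Check a variable name against the length and character rules."""
    return ADAM_NAME.fullmatch(name) is not None

for name in ["TRTP", "AVAL", "AGEGR1", "_TEMP", "1STDOSE", "BASELINE1", "AGE-GR1"]:
    print(name, is_valid_adam_name(name))
```

The first three names pass; the others fail because of a leading underscore, a leading digit, a length of nine characters, and a hyphen, respectively.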

ADaM adheres to a principle of harmonization known as “same name, same meaning, same values.”

In general, if an SDTM character variable is converted to a numeric variable in an ADaM dataset, the numeric variable should keep the SDTM name with an “N” suffix added.

In a pair of corresponding variables (e.g., TRTP and TRTPN), the primary or most commonly used variable does not have the suffix or extension (i.e., N for numeric or C for character). The relevant suffix is used only on the name of the secondary member of the variable pair.
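A short sketch of the “N” suffix convention, combined with the 8-character limit stated earlier. The helper `numeric_counterpart` and the numeric coding of SEX are hypothetical and only serve as an example.

```python
def numeric_counterpart(name):
    """Name of the numeric version of a character variable: add an "N" suffix."""
    candidate = name + "N"
    if len(candidate) > 8:
        raise ValueError(f"{candidate} exceeds the 8-character limit")
    return candidate

# Hypothetical coding of a character variable into its numeric partner.
SEX_CODES = {"F": 1, "M": 2}

row = {"USUBJID": "S1-001", "SEX": "F"}
row[numeric_counterpart("SEX")] = SEX_CODES[row["SEX"]]
print(row)  # {'USUBJID': 'S1-001', 'SEX': 'F', 'SEXN': 1}
```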

Variables whose names end in FL are character flag (or indicator) variables.

Variables whose names end in GRy, Gy, or CATy are grouping variables, where “y” refers to the grouping scheme or algorithm (not the category within the grouping).

It is recommended that producer-defined grouping or categorization variables begin with the name of the variable being grouped and end in GRy (e.g., variable ABCGRy is a character description of a grouping or categorization of the values from the ABC variable for analysis purposes).
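The sketch below derives one grouping variable and one flag variable for a subject-level record. The age cut-offs, the "Y"/"N" coding, and the DOSED input field are assumptions for illustration only; AGEGR1 and SAFFL are used as typical examples of the GRy and FL conventions.

```python
def age_group(age):
    """Hypothetical grouping scheme 1 for AGE (cut-offs are illustrative)."""
    if age < 18:
        return "<18"
    if age < 65:
        return "18-64"
    return ">=65"

def flag(condition):
    """Character flag: "Y" when the condition holds, otherwise "N"."""
    return "Y" if condition else "N"

# DOSED is a made-up input field, not a standard variable.
subjects = [
    {"USUBJID": "S1-001", "AGE": 54, "DOSED": True},
    {"USUBJID": "S1-002", "AGE": 71, "DOSED": False},
]

for subj in subjects:
    subj["AGEGR1"] = age_group(subj["AGE"])  # grouping variable, scheme 1
    subj["SAFFL"] = flag(subj["DOSED"])      # flag variable ends in FL
    print(subj["USUBJID"], subj["AGEGR1"], subj["SAFFL"])
```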

Timing variable conventions

Variables whose names end in DT are numeric dates.

Variables whose names end in DTM are numeric datetimes.

Variables whose names end in TM are numeric times.

Names of timing start variables end with an S followed by the characters indicating the type of timing (i.e., SDT, STM, SDTM).

Names of timing end variables end with an E followed by the characters indicating the type of timing (i.e., EDT, ETM, EDTM).

Variables whose names end in DY are relative day variables.
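A minimal sketch of a numeric date and a relative day derivation. It assumes numeric dates are counted in days from 1 January 1960 (the SAS convention) and that relative days have no day 0, so the reference date itself is day 1. The function names are hypothetical.

```python
from datetime import date

SAS_EPOCH = date(1960, 1, 1)  # assumption: numeric dates count days from this date

def numeric_date(iso):
    """Convert an ISO 8601 date string (YYYY-MM-DD) to days since the epoch."""
    return (date.fromisoformat(iso) - SAS_EPOCH).days

def relative_day(dt, ref):
    """Relative day with no day 0: the reference date itself is day 1."""
    diff = dt - ref
    return diff + 1 if diff >= 0 else diff

trtsdt = numeric_date("2024-03-10")  # reference (treatment start) date
print(relative_day(numeric_date("2024-03-10"), trtsdt))  # 1
print(relative_day(numeric_date("2024-03-12"), trtsdt))  # 3
print(relative_day(numeric_date("2024-03-09"), trtsdt))  # -1
```

The day before the reference date is day -1, so the sequence runs ..., -2, -1, 1, 2, ... without a zero.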

Variable naming fragments

| Fragment | Notes |
|---|---|
| GRy | Grouping variables |
| Gy | Abbreviation of GRy |
| FL | Flag/indicator |
| DT | Numeric date |
| TM | Numeric time |
| DTM | Numeric datetime |
| DTF | Date imputation flag |
| TMF | Time imputation flag |
| DY | Relative days (does not include day 0) |

| Fragment | Notes |
|---|---|
| BL | Baseline |
| CHG | Change |
| FU | Follow-up |
| OT | On treatment |
| RU | Run-in |
| SC | Screening |
| TA | Taper |
| TI | Titration |
| U | Units |
| WA | Washout |
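The fixed suffix fragments can be used to guess what kind of variable a name refers to. The sketch below is a heuristic only: it assumes the name follows the conventions, it ignores fragments that contain an index such as GRy, and the names `SUFFIXES` and `describe` are hypothetical.

```python
# Suffix fragments, longest first so that DTM is tried before TM.
SUFFIXES = [
    ("DTM", "numeric datetime"),
    ("DTF", "date imputation flag"),
    ("TMF", "time imputation flag"),
    ("DT", "numeric date"),
    ("TM", "numeric time"),
    ("DY", "relative day"),
    ("FL", "flag/indicator"),
]

def describe(name):
    """Return the meaning of the suffix fragment of `name`, or None if no match."""
    for suffix, meaning in SUFFIXES:
        # Require at least one character before the suffix.
        if name.endswith(suffix) and len(name) > len(suffix):
            return meaning
    return None

for name in ["ASTDT", "ASTDTM", "ASTTM", "ASTDTF", "ADY", "SAFFL", "AVAL"]:
    print(name, describe(name))
```

Ordering the list from longest to shortest suffix matters: otherwise ASTDTM would be reported as a numeric time rather than a numeric datetime.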

ADSL variables


BDS variables

Analysis-enabling variables

SEND

SEND (Standard for Exchange of Nonclinical Data) is the counterpart of SDTM for nonclinical (animal) studies: it describes how nonclinical study data are organized into datasets.