CDISC data standards
CDISC (Clinical Data Interchange Standards Consortium) defines the data standards used for regulatory submissions to agencies such as the FDA and EMA. As a pharmacometrician, you will routinely work with SDTM and ADaM datasets, so understanding their structure is essential for efficient data preparation and analysis.
The data pipeline
Data flows from the clinic through a series of transformations before reaching your analysis (Figure 1).
Each standard has an Implementation Guide (IG), such as the SDTMIG for SDTM, whose version number must be tracked and declared in submissions. CDISC also publishes Therapeutic Area User Guides (TAUGs) that give guidance specific to a therapeutic area (e.g., oncology, diabetes, CNS).
CDASH
CDASH (Clinical Data Acquisition Standards Harmonization) standardizes data entry on case report forms (CRFs) at clinical sites. It defines the fields that should be collected, ensuring consistency before data is ever entered into a database. As a pharmacometrician, you rarely work directly with CDASH — but it determines the quality and completeness of the raw data you eventually receive.
SDTM
SDTM (Study Data Tabulation Model) organizes raw clinical data into submission-ready datasets. It defines domains (separate datasets on specific topics, such as lab values, vital signs, or adverse events), along with standardized variable names and controlled terminology for variable values.
Domains most relevant to pharmacometricians
| Domain | Name | What it contains |
|---|---|---|
| PC | PK concentrations | Drug and metabolite concentrations over time; your primary dataset for popPK |
| PP | PK parameters | NCA-derived parameters (AUC, Cmax, t½) per subject per period |
| EX | Exposure | Dosing records: dose amount, route, start/end date and time |
| LB | Laboratory results | Lab values used as covariates (renal/hepatic function, biomarkers) |
| DM | Demographics | Age, sex, race, ethnicity; the covariate backbone |
| VS | Vital signs | Height, weight, blood pressure; often used as time-varying covariates |
| AE | Adverse events | Safety data; relevant for exposure-response (E-R) safety analyses |
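As an illustration of how EX and PC fit together, the sketch below computes the time since first dose for each concentration record. The records are invented, the derived name `TAFD_H` is hypothetical, and complete ISO 8601 date/times are assumed; real data often contain partial dates that need separate handling.

```python
from datetime import datetime

# Hypothetical, heavily simplified EX and PC records.
ex = [
    {"USUBJID": "STUDY1-001", "EXSTDTC": "2024-01-15T08:00", "EXDOSE": 100},
    {"USUBJID": "STUDY1-001", "EXSTDTC": "2024-01-16T08:00", "EXDOSE": 100},
]
pc = [
    {"USUBJID": "STUDY1-001", "PCDTC": "2024-01-15T09:30", "PCSTRESN": 12.4},
    {"USUBJID": "STUDY1-001", "PCDTC": "2024-01-16T07:45", "PCSTRESN": 3.1},
]


def first_dose_times(ex_records):
    """Return the earliest dosing start datetime per subject."""
    first = {}
    for rec in ex_records:
        t = datetime.fromisoformat(rec["EXSTDTC"])
        subj = rec["USUBJID"]
        if subj not in first or t < first[subj]:
            first[subj] = t
    return first


def add_time_after_first_dose(pc_records, ex_records):
    """Attach TAFD_H (hours since first dose; hypothetical name) to each PC record."""
    first = first_dose_times(ex_records)
    out = []
    for rec in pc_records:
        t = datetime.fromisoformat(rec["PCDTC"])
        hours = (t - first[rec["USUBJID"]]).total_seconds() / 3600
        out.append({**rec, "TAFD_H": hours})
    return out


merged = add_time_after_first_dose(pc, ex)
for rec in merged:
    print(rec["USUBJID"], rec["PCDTC"], rec["TAFD_H"])
```

In practice this derivation usually lives in the ADaM layer rather than being recomputed by each analyst.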
General Observation Classes
All SDTM domains are grouped into observation classes:
| Observation Class | Domain Name | Domain Abbreviation |
|---|---|---|
| Interventions | Procedure agents | AG |
| Interventions | Concomitant/prior medications | CM |
| Interventions | Exposure | EX |
| Interventions | Exposure as collected | EC |
| Interventions | Meal data | ML |
| Interventions | Procedures | PR |
| Interventions | Substance use | SU |
| Events | Adverse events | AE |
| Events | Biospecimen events | BE |
| Events | Clinical events | CE |
| Events | Disposition | DS |
| Events | Healthcare encounters | HO |
| Events | Medical history | MH |
| Events | Protocol deviations | DV |
| Findings | Product/drug accountability | DA |
| Findings | Death details | DD |
| Findings | ECG test results | EG |
| Findings | Inclusion/Exclusion criteria not met | IE |
| Findings | Biospecimen findings | BS |
| Findings | Cell phenotype findings | CP |
| Findings | Genomics findings | GF |
| Findings | Immunogenicity specimen assessments | IS |
| Findings | Laboratory test results | LB |
| Findings | Microbiology specimen | MB |
| Findings | Microbiology susceptibility | MS |
| Findings | Microscopic findings | MI |
| Findings | PK concentrations | PC |
| Findings | PK parameters | PP |
| Findings | Morphology | MO |
| Findings | Cardiovascular system findings | CV |
| Findings | Musculoskeletal system findings | MK |
| Findings | Nervous system findings | NV |
| Findings | Ophthalmic examinations | OE |
| Findings | Reproductive system findings | RP |
| Findings | Respiratory system findings | RE |
| Findings | Urinary system findings | UR |
| Findings | Physical examination | PE |
| Findings | Functional tests | FT |
| Findings | Questionnaires | QS |
| Findings | Disease response and clinical classification | RS |
| Findings | Subject characteristics | SC |
| Findings | Subject status | SS |
| Findings | Tumor/lesion identification | TU |
| Findings | Tumor/lesion results | TR |
| Findings | Vital signs | VS |
| Findings about | Findings about events or interventions | FA |
| Findings about | Skin response | SR |
| Special Purpose | Comments | CO |
| Special Purpose | Demographics | DM |
| Special Purpose | Subject elements | SE |
| Special Purpose | Subject disease milestones | SM |
| Special Purpose | Subject visits | SV |
| Trial design | Trial arms | TA |
| Trial design | Trial disease assessments | TD |
| Trial design | Trial elements | TE |
| Trial design | Trial inclusion/exclusion criteria | TI |
| Trial design | Trial disease milestones | TM |
| Trial design | Trial summary | TS |
| Trial design | Trial visits | TV |
| Relationship | Related records | RELREC |
| Relationship | Related specimens | RELSPEC |
| Relationship | Related subjects | RELSUB |
| Relationship | Supplemental qualifiers for [domain name] | SUPP-- |
| Study reference | Non-host organism identifiers | OI |
PC domain — PK concentrations
Together with EX, the PC domain is the most important SDTM domain for pharmacometricians. Each record holds one concentration measurement for one analyte in one subject at one time point.
Key variables:
| Variable | Description |
|---|---|
| USUBJID | Unique subject identifier |
| PCTESTCD | Analyte code (e.g., DRUG, METABOLITE) |
| PCORRES | Result as collected (character, e.g., <0.5) |
| PCSTRESC | Standardized result (character; BLQ if below LLOQ) |
| PCSTRESN | Standardized result (numeric; blank if BLQ) |
| PCSTRESU | Standardized units (e.g., ng/mL) |
| PCLLOQ | Lower limit of quantitation |
| PCDTC | Date/time of collection (ISO 8601) |
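The interplay between PCSTRESC, PCSTRESN, and PCLLOQ matters most when handling concentrations below the limit of quantitation. The following is a minimal sketch, assuming the convention in the table above (PCSTRESN missing and PCSTRESC equal to BLQ for such records). The records are invented and the `BLQ` flag name is hypothetical.

```python
# Hypothetical PC records; None stands in for a blank numeric result.
pc = [
    {"USUBJID": "STUDY1-001", "PCTESTCD": "DRUG", "PCORRES": "12.4",
     "PCSTRESC": "12.4", "PCSTRESN": 12.4, "PCSTRESU": "ng/mL", "PCLLOQ": 0.5},
    {"USUBJID": "STUDY1-001", "PCTESTCD": "DRUG", "PCORRES": "<0.5",
     "PCSTRESC": "BLQ", "PCSTRESN": None, "PCSTRESU": "ng/mL", "PCLLOQ": 0.5},
]


def flag_blq(rec):
    """Return a copy of the record with a boolean BLQ flag (hypothetical name)."""
    if rec["PCSTRESN"] is None:
        if rec["PCSTRESC"] != "BLQ":
            # A missing numeric result that is not BLQ needs manual review.
            raise ValueError(f"unexpected missing result: {rec!r}")
        return {**rec, "BLQ": True}
    return {**rec, "BLQ": False}


flagged = [flag_blq(r) for r in pc]
for rec in flagged:
    print(rec["PCORRES"], rec["PCSTRESN"], rec["BLQ"])
```

The sketch only flags BLQ records. How they are then treated (dropped, imputed, or modeled as censored) is an analysis decision and is deliberately left out.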
SDTM variable types
Variables in each domain are classified as required, expected, or permissible, and fall into four types:
| Type | Purpose | Examples |
|---|---|---|
| Identifier | Links records to study/subject | STUDYID, USUBJID, --SEQ |
| Topic | The focus of the observation | --TESTCD, --TRT, --TERM |
| Qualifier | Additional detail about the topic | Result, grouping, synonym, record, variable qualifiers |
| Timing | When the observation occurred | --DTC, --DY, VISIT, --ELTM |
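In the examples above, the `--` placeholder stands for the two-letter domain code, so `--TESTCD` becomes `PCTESTCD` in the PC domain. A small sketch of that expansion, with a hypothetical helper name:

```python
def expand_variable(template, domain):
    """Replace a leading "--" placeholder with the two-letter domain code."""
    if template.startswith("--"):
        return domain + template[2:]
    # Variables such as STUDYID, USUBJID, and VISIT are named the same in every domain.
    return template


templates = ["USUBJID", "--SEQ", "--TESTCD", "--DTC", "--ELTM"]
pc_vars = [expand_variable(t, "PC") for t in templates]
print(pc_vars)
```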
ADaM
ADaM (Analysis Data Model) derives analysis-ready datasets from SDTM. Like SDTM, it is organized into separate datasets, each with its own structure. Most ADaM datasets are derived primarily from one SDTM domain (e.g., ADPC from PC), with additional variables and transformations to facilitate analysis.
ADaM dataset structures
| Structure | Name | Description |
|---|---|---|
| ADSL | Subject-Level Analysis Dataset | One record per subject — all baseline covariates and flags |
| BDS | Basic Data Structure | One or more records per subject/parameter/timepoint — used for ADPC, ADPP |
| OCCDS | Occurrence Data Structure | Designed for counting occurrences (adverse events, medications) |
ADaM variable naming conventions
All ADaM variable names must be no more than 8 characters in length, start with a letter (not underscore), and be composed only of letters (A–Z), underscore (_), and numerals (0–9). ADaM adheres to a principle of harmonization known as “same name, same meaning, same values” across datasets.
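The naming rule above is easy to check mechanically. The sketch below is one way to do so with a regular expression; the function name is hypothetical, and restricting names to uppercase is an assumption based on the A–Z range given in the text.

```python
import re

# At most 8 characters: a letter first, then letters, digits, or underscores.
# Uppercase only is an assumption taken from the "A-Z" wording of the rule.
_ADAM_NAME = re.compile(r"[A-Z][A-Z0-9_]{0,7}")


def is_valid_adam_name(name):
    """Return True if the name satisfies the length and character rules."""
    return _ADAM_NAME.fullmatch(name) is not None


for candidate in ["TRTPN", "AGEGR1", "_FLAG", "1STDOSE", "BASELINEVAL"]:
    print(candidate, is_valid_adam_name(candidate))
```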
In a pair of corresponding variables (e.g., TRTP and TRTPN), the primary or most commonly used variable does not have the suffix or extension (i.e., N for numeric or C for character). The relevant suffix is used only on the name of the secondary member of the variable pair.
| Suffix | Meaning |
|---|---|
| N | Numeric version of a character variable (e.g., TRTPN) |
| C | Character version of a numeric variable |
| FL | Character flag variable (Y/N or Y/null); its numeric counterpart takes the suffix FN (1/0) |
| GRy, Gy, CATy | Grouping variable (y = grouping scheme) |
| DT | Numeric date |
| DTM | Numeric datetime |
| TM | Numeric time |
| DTF | Date imputation flag |
| TMF | Time imputation flag |
| DY | Relative study day (no day 0) |
| BL | Baseline |
| CHG | Change |
| FU | Follow-up |
| OT | On treatment |
| RU | Run-in |
| SC | Screening |
| TA | Taper |
| TI | Titration |
| U | Units |
| WA | Washout |
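The "no day 0" rule for DY variables is a frequent source of off-by-one errors. A minimal sketch of the calculation follows, assuming the reference date is whatever the study defines as day 1 (for example, the date of first dose); the function name is hypothetical.

```python
from datetime import date


def study_day(event_date, reference_date):
    """Relative study day with no day 0.

    The reference date itself is day 1; the day before it is day -1.
    """
    delta = (event_date - reference_date).days
    return delta + 1 if delta >= 0 else delta


ref = date(2024, 1, 15)  # hypothetical reference date
print(study_day(date(2024, 1, 15), ref))
print(study_day(date(2024, 1, 14), ref))
print(study_day(date(2024, 1, 20), ref))
```

Because the scale skips zero, the difference between two study days on opposite sides of the reference date is not the number of elapsed days. For elapsed-time calculations, work from the underlying dates.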
SEND
SEND (Standard for Exchange of Nonclinical Data) is the nonclinical counterpart to SDTM — it applies the same domain-based structure to pre-clinical studies such as toxicology and pharmacokinetic studies in vivo.
SEND data can be a source for allometric scaling, preclinical PK/PD model development, and inter-species translation.
