Reproducible Research

Published

September 14, 2025

Caution: Under construction

Reproducibility and Replicability in Research

Use version control (git or svn)

  • Do not track model development in git: the many exploratory runs are messy and clutter the git history.
    • Use rsync to back up or synchronize the model development folder if needed.
    • Track the Rmd file for the report.
    • The Rmd file records which runs are the key models; the model files themselves stay in the “messy” folder.
      • base_model <- run25.mod
      • covariate_model <- run63.mod
      • final_model <- run67.mod
      • simulation_model <- run68.mod
    • Keep a run record with, for each run:
      • runno
      • based on
      • OFV
      • dOFV
      • Condition number (CN)
  • Do not track produced PDFs in git
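
One way to keep generated PDFs and the model development folder out of the history is a .gitignore file. The sketch below is only an illustration under stated assumptions: the ignore rules, the folder names, and the file report.Rmd are hypothetical, and it runs in a temporary directory so that no real project is touched.

```shell
# Work in a throwaway directory so the sketch does not touch a real project.
demo_dir="$(mktemp -d)"
git init --quiet "$demo_dir"

# Hypothetical ignore rules: rendered PDFs and the model development folder.
cat > "$demo_dir/.gitignore" <<'EOF'
*.pdf
NONMEM/
EOF

# Hypothetical files: a rendered report, its source, and a model file.
mkdir -p "$demo_dir/NONMEM/model" "$demo_dir/output/report"
touch "$demo_dir/output/report/report.pdf" \
      "$demo_dir/output/report/report.Rmd" \
      "$demo_dir/NONMEM/model/run25.mod"

# check-ignore prints only the paths that match an ignore rule.
ignored="$(git -C "$demo_dir" check-ignore \
  output/report/report.pdf \
  output/report/report.Rmd \
  NONMEM/model/run25.mod)"
echo "$ignored"
```

The Rmd source is not listed by check-ignore, so it remains available for tracking, while the PDF and the model file are skipped by git.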

When and how to commit

Commit often and in small, contained chunks.

The seven rules of a great Git commit message [1]

  1. Separate subject from body with a blank line
  2. Limit the subject line to 50 characters
  3. Capitalize the subject line
  4. Do not end the subject line with a period
  5. Use the imperative mood in the subject line. Git itself uses the imperative whenever it creates a commit on your behalf. A properly formed Git commit subject line should always be able to complete the following sentence: “If applied, this commit will <your subject line here>.”
  6. Wrap the body at 72 characters
  7. Use the body to explain what and why vs. how
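
Rules 2 to 4 are mechanical enough to check automatically. The function below is a minimal sketch, not an established tool; the name check_subject and its messages are made up for illustration, and it could be called from a commit-msg hook if desired.

```shell
# Check a commit subject line against rules 2-4:
# at most 50 characters, capitalized, no trailing period.
# check_subject is a hypothetical helper, not part of git.
check_subject() {
  subject="$1"
  if [ "${#subject}" -gt 50 ]; then
    echo "too long"
    return 1
  fi
  case "$subject" in
    [[:upper:]]*) ;;                        # starts with a capital letter
    *) echo "not capitalized"; return 1 ;;
  esac
  case "$subject" in
    *.) echo "ends with a period"; return 1 ;;
  esac
  echo "ok"
}

check_subject "Add covariate model to run record"
```

The subject and body are still separated by a blank line in the commit itself; passing two -m options to git commit (one for the subject, one for the body) produces that layout.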

File/folder naming

  • Name files/folders using only A-Z, a-z, 0-9, -, _.
    • Start folder names with a number for sorting purposes.
  • In general, use kebab-case for naming (easier to read than snake_case).
    • If a name has multiple parts (e.g., a description, a date, and an author), separate the parts with underscores and use kebab-case within each part (e.g., descriptive-name_2025-01-08_viktor-rognas.ext).
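
The character rule above can be checked with a short pattern match. This is a sketch only: the function name is_valid_name is hypothetical, and allowing a single dot before the file extension is an assumption based on the example name above.

```shell
# Succeed if the name uses only A-Z, a-z, 0-9, '-' and '_',
# optionally followed by one dot and an alphanumeric extension.
# is_valid_name is a hypothetical helper; LC_ALL=C keeps the
# character ranges limited to ASCII.
is_valid_name() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[A-Za-z0-9_-]+(\.[A-Za-z0-9]+)?$'
}

for name in "descriptive-name_2025-01-08_viktor-rognas.ext" "my file (final).docx"; do
  if is_valid_name "$name"; then
    echo "valid:   $name"
  else
    echo "invalid: $name"
  fi
done
```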

Folder structure:

project/
  - README.md       # Project description
  - input/
    - data/         # All input data files
      - raw_data/   # Untouched original data files
        - raw_data.csv
      - dat1.csv
      - dat2.csv
  - R/              # R scripts
    - dat1.R
    - dat2.R
  - NONMEM/
    - model/        # Model files
      - pk/
        - run001.mod
      - pd/
        - run002.mod
  - output/         # Results
    - report/
      - 1a/
        - .tex
        - .pdf
      - 1b/
        - .tex
        - .pdf
      - 1/
        - .tex
        - .pdf
    - presentation/ # Communication
      - slides.pptx
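
The skeleton above can be created in one step. The sketch below builds it under a temporary directory; the variable project_root is an assumption for illustration and would normally point at the real project location.

```shell
# Create the folder skeleton under a temporary location.
# project_root is a placeholder; point it at the real project instead.
project_root="$(mktemp -d)/project"

mkdir -p \
  "$project_root/input/data/raw_data" \
  "$project_root/R" \
  "$project_root/NONMEM/model/pk" \
  "$project_root/NONMEM/model/pd" \
  "$project_root/output/report" \
  "$project_root/output/presentation"

# The README holds the project description.
touch "$project_root/README.md"

# List the created folders for a quick visual check.
find "$project_root" -type d | sort
```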

Coding: language specific

R

  • Script all plots.
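  • Write the report as a scripted Quarto document and record the software environment in it, e.g., with:
    • R.version
    • rstudioapi::versionInfo() (only available when run inside RStudio)
    • (.packages()) (the outer parentheses print the otherwise invisible result)
    • devtools::session_info(pkgs = "attached")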

NONMEM

When using Monte Carlo estimation methods (e.g., SAEM, IMP, or FOCE with MCETA), always specify the SEED option and RANMETHOD=P. It is also recommended to set the RANMETHOD option as follows:

  • For SAEM and IMP: RANMETHOD=3S2P
  • For MCETA: RANMETHOD=4P ($SIMULATION uses this method by default)
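
A simple scan of the control stream can flag estimation records that miss these options. The sketch below rests on several assumptions: the helper name find_unseeded_est is hypothetical, the two $ESTIMATION records are made-up examples rather than a recommended setup, and each record is assumed to fit on a single line (records continued over several lines are not handled).

```shell
# Print every $EST record that uses SAEM, IMP, or MCETA but does not
# set both SEED= and RANMETHOD=. Hypothetical helper; assumes that
# each record is written on a single line.
find_unseeded_est() {
  awk '/^\$EST/ && (/METHOD=(SAEM|IMP)/ || /MCETA=/) && !(/SEED=/ && /RANMETHOD=/)' "$1"
}

# Made-up control stream fragment: the first record sets both options,
# the second one sets neither.
ctl="$(mktemp)"
cat > "$ctl" <<'EOF'
$ESTIMATION METHOD=SAEM INTERACTION NBURN=2000 NITER=1000 SEED=20250914 RANMETHOD=3S2P
$ESTIMATION METHOD=IMP INTERACTION EONLY=1 NITER=5 ISAMPLE=3000
EOF

find_unseeded_est "$ctl"
```

Only the second record is reported, since it lacks both SEED and RANMETHOD.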

Footnotes

  1. https://cbea.ms/git-commit/