logo

Baptiste Couvy-Duchesne, Lachlan T. Strike, Futao Zhang, Yan Holtz, Zhili Zheng, Kathryn E. Kemper, Loic Yengo, Olivier Colliot, Margaret J. Wright, Naomi R. Wray, Jian Yang, Peter M. Visscher

The code is presented in a serie of Rmarkdown files with corresponding html knitted files that compose this website. It presents the code allowing the perform all analyses and generate plots presented in the manuscript.

1 Dependencies

qsubshcom
Sections of the code in bash may be run directly on a high performance cluster thanks to the qsubshcom piece of software. (https://github.com/zhilizheng/qsubshcom)
In short, qsubshcom is a job submission wrapper that adapts automatically to the cluster system, which makes our code highly portable: you only need to set up a few paths (see below) and run on you cluster.

Steps that do not require much time/memory were sometimes run in an interactive session (they come without the qsubshcom syntax)

FreeSurfer 6.0
Image processing requires FreeSurfer 6.0 (https://surfer.nmr.mgh.harvard.edu/).

ENIGMA-shape
Subcortical processing further requires the scripts developped by Boris Gutman for the ENIGMA consortium (http://enigma.ini.usc.edu/protocols/imaging-protocols/).

OSCA
Most of the mixed model analyses were performed using the OSCA software (https://cnsgenomics.com/software/osca/#Overview).

GCTA
For bivariate models, the option is yet to be implemented in OSCA. We used a trick to run those models in GCTA (https://cnsgenomics.com/software/gcta/#BivariateGREMLanalysis).

R
Plots, some data management and penalised regression (LASSO) were performed using R version 3.3 to 3.6. See manuscript (or code) for full list of packages used.

2 Atlas files

We created a bunch of files to facilitate the analysis. They may be used to prune out vertices outside of the cortex or to relate each vertex to a cortical/subcortical region. They are available in the atlas folder of this repository.

3 General organisation of the working directory ${wd}

    /FS 

Contains all outputs from MRI image processing, in particular:

    /FS                             
        /batch${batch}              
            /${ID}
                /mri
                /surf
                /ENIGMA_shape
                /...

Output of FreeSurfer (standard) and ENIGMA-shape processing

    /FS         
        ENIGMAshapeResults          
            /batch${batch}
                /${ID}

Reorganised copy of the ENIGMA_shape folders, to facilitate data extraction using ENIGMA-shape function

    /FS
        FSresults
            /batch${batch}
                /${ID}                  

Selected output FreeSurfer files (folders stats and part of the surf) which are necessary for the downstream analyses. Having all useful files in ENIGMAshapeResults and FSresults allows deleting all other FreeSurfer files to reduce memory and file usage on the cluster.

    /BodFiles                       

Binary files for OSCA analysis, incl. vertex files and brain relatedness matrices

    /Phenotypes_15K
        /UKB_phenotypes
        /UKB_phenotypes_reg_Bsln
        /UKB_phenotypes_reg_Bsln_bdsz

Phenotype files to facilitate loops on variables included in the analyses
File format (tsv or space delimited): eid eid pheno, no column names
The different folders correspond to different covariates that have been regressed out (Bsln=baseline model/covariates, Bsln_bdsz= baseline + body size covariates)

    /Phenotypes_15K
        /UKB_phenotypes_bivariate_reg_Bsln
        /UKB_phenotypes_bivariate_reg_Bsln_bdsz

Each file contains a pair of phenotypes for bivariate models (eid eid pheno1 pheno2)

    /BREML_all_QC_reg_Bsln
    /BREML_all_QC_reg_Bsln_bdsz

Results of morphometricity analyses (fsaverage no smoothing) with different sets of covariates

    /BREML_ROI_Bsln
        /lingual
        /Hippocampus
        /...
    /BREML_ROI_Bsln_bdsz
        /lingual
        /Hippocampus
        /...

Results of morphometricity analyses by Regions of Interest (ROIs).

    /BREML_bivariate_reg_Bsln_bdsz
    /BREML_bivariate_reg_Bsln

Results of grey-matter correlations (bivariate models)

    /BLUP_Bsln
    /BLUP_Bsln_bdsz

Files from BLUP calculation on discovery sample, incl. individual weights, blup weights and BLUP scores from cross-validation design.

    /LassoRidgePred

Outputs from Lasso prediction using R package bigstatsr

    /BREML_Bsln_fwhm${fwhm}
    /BREML_Bsln_${fsav}

Results of morphometricity analyses varying smoothing of the fsaverage cortical meshes (fwhm in 5 10 15 20 25) and the mesh complexity (fsav in fsaverage3 fsaverage4 fsaverage5 fsaverage6)
Baseline covariates

4 Aliases used in the scripts

${od} Output directory

${fsdir} FreeSurfer directory

    /bin
    /data
    /subjects
    /...

${bind} Binaries directory

    /osca
    /gcta
    /qsubshcom
    /ENIGMA_shape
        /MedialDemonsShared
        /...
    

${medici} Storage directory where MRI images are stored/downloaded

    /T1
        /raw
            /batch${batch}
                /T1_fetch_batch${batch}.lis
                /...    
        /unzip
            /batch${batch}
    
    /T2
        /raw
            /batch${batch}
                /T2_fetch_batch${batch}.lis
                /...
        /unzip
            /batch${batch}      
        
 




A work by by [Baptiste Couvy-Duchesne] - 23 April 2020