GWAS_meta_regression

This repository contains the script that was used to perform a genome-wide associaton meta-regression as was used in https://www.biorxiv.org/content/10.1101/2020.04.24.031138v2. Please make sure that there are enough cohorts in the study to make a regression feasible.

Prerequisites

Cohort specific summary statistics for the phenotype that are ready for meta-analysis (QCed and filtered).
A summary file (see below for an example) that contains the necessary information per cohort.

Step 1: align SNPs and store per cohort, per chromosome.

Since one probably also wants to run a standard meta-analysis on the data, we will make use of METAL¹ - a program for meta-analysis that performs alignment - using the VERBOSE option to write all the aligned SNPs to a temporary file. See https://genome.sph.umich.edu/wiki/METAL_Documentation for documentation. Separating the chromosomes is done in this step to limit the number of SNPs that need to be kept in memory in subsequent steps. Input for this script are tempfile that contains the verbose output of METAL, and a textfile containing a list of the cohorts that went into that analysis. If you rerun the analysis (e.g. adding more cohorts) you should also rerun this alignment and separation of the chromosomes.

Call:

./separate_chr.do TRAIT Cohortlist.txt METAL_verbose_tempfile

After running this step, directory structure should be like this:

cohort_input_meta_regression/aligned_TRAIT
cohort_input_meta_regression/aligned_TRAIT/Cohort1
cohort_input_meta_regression/aligned_TRAIT/Cohort2
...

within each of these directories, there should be files per chromosome: Cohort1_meta_regression_TRAIT_chr1.gz to Cohort1_meta_regression_TRAIT_chr22.gz.

Files should look like this:

# 1:100000012	t	  g	    -72.943	148.537	0.6234	0.248	Cohort1_TRAIT_filtered_input.txt
# 1:100000827	t	  c	    -132.176	140.684	0.3475	0.295	Cohort1_TRAIT_filtered_input.txt
# 1:100001201	t	  g	    -193.362	225.048	0.3902	0.093	Cohort1_TRAIT_filtered_input.txt
# 1:100001671	ct        c	    -121.134	155.883	0.4371	0.263	Cohort1_TRAIT_filtered_input.txt
# 1:100002155	g	  gttagt    -192.873	225.047	0.3914	0.093	Cohort1_TRAIT_filtered_input.txt
# 1:100002713	t	  c	    -64.611	213.359	0.762	0.104	Cohort1_TRAIT_filtered_input.txt
...

Step 2: preprare the summary file

The summary file named "summary_measures_TRAIT.txt" contains the per-cohort information that is used in the meta-regression, age in this example. Make sure there are no missing values. All - and only - cohorts that are listed in this file are included in the meta-regression. For these cohorts, it is assumed that the aligned SNP data as described above is present. The (space separated) file should contain columns "cohort", "N" (indicating sample size) and "mean_age". Other columns are allowed but not used (unless the Meta_regression_GWAS script is adapted).

Example file:

cohort N mean_age
Cohort1 840 15.9
Cohort2 130 46.3
Cohort3 1095 87.0
...

Step 3: run the meta-regression

The meta-regression is run per chromosome, using the metafor² package in R³. Additional R packages needed are data.table, iterators and foreach. There are currently two types of meta-regression that can be run:

Effect_SNPC ~ b0 + b1*age + ε under the null hypothesis that b1=0 (1 degree of freedom), and
Effect_SNPC ~ b0 + b1*age + b2*(age)^2 + ε under the null hypothesis that (b1=b2=0, 2 degrees of freedom).

Call:

R --no-save < Meta_regression_GWAS.R --args TRAIT CHR TYPE

where TYPE is either "2_c1" or "3_c2". These numbers refer to the number of parameters in the model and the number of covariates that are hypothesised zero - "2_c1" refers to 1) above, testing for linear age effect versus no age effect, "3_c2" refers to 2) above, testing for quadratic age effects versus no age effect.

Notes:

Set your working directory in line 6
this script automatically filters out all the SNPs that are present in less than 50% of cohorts and in less than 50% of total number of subjects (lines 39-52). Adjust if necessary.
Depending on the number of cohorts and number of SNPs included, this script may take a couple of hours to complete. Ideally one would run each chromosome separately on a cluster.

Meta-regression output is stored in an .Rdata file per chromosome.

Step 4: combine the information per chromosome

Combine the information per chromosome for followup analyses. R packages needed are stringr and qqman. Call:

R --no-save < Combine_and_plot_meta_regression --args TRAIT TYPE

Notes:

Set your working directory in line 11
Set whether you want Manhattan and qq plots in line 5

Output of this script is similar to a GWAS summary stats file; but instead of effect sizes there are regression estimates and standard errors for the meta-regression (intercept and linear/quadratic components), a p-value for the age-effect (see step 3) and a p-value for residual heterogeneity. PLEASE NOTE: one cannot directly use these files in followup analysis tools that expect a (direction) of effect as these depend on age.

Example output:

markername chr bp allele1 allele2 P_age TotalSampleSize intercept se_intercept beta_age se_beta_age P_resid_heterogeneity
# 21:14601415 21 14601415 a g 0.90778 10626 -83.5286 100.9246 1.1524 1.9022 0.1651
# 21:14604190 21 14604190 t c 0.69777 10626 68.0641 110.7797 -0.8302 2.1392 0.8478
# 21:14608578 21 14608578 a t 0.8965 10626 66.2267 112.3745 -0.8694 2.1643 0.6244

References

¹ Willer, C. J., Li, Y. & Abecasis, G. R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26, 2190–2191 (2010). ² Viechtbauer, W. Journal of statistical software. J. Stat. Softw. 36, 1–48 (2010). ³ The R Core Team. R: A language and environment for statistical computing. (2018).

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
Combine_and_plot_meta_regress.R		Combine_and_plot_meta_regress.R
Meta_regression_GWAS.R		Meta_regression_GWAS.R
README.md		README.md
separate_chrs.do		separate_chrs.do

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

GWAS_meta_regression

Prerequisites

Step 1: align SNPs and store per cohort, per chromosome.

Step 2: preprare the summary file

Step 3: run the meta-regression

Step 4: combine the information per chromosome

References

About

Uh oh!

Releases

Packages

Languages

RMBrouwer/GWAS_meta_regression

Folders and files

Latest commit

History

Repository files navigation

GWAS_meta_regression

Prerequisites

Step 1: align SNPs and store per cohort, per chromosome.

Step 2: preprare the summary file

Step 3: run the meta-regression

Step 4: combine the information per chromosome

References

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages