Feature roadmap #36

@danielnachun


As our work on this package progresses, this issue can help us enumerate possible future features of the package, depending on the time and interests of contributors. Some features will be needed for the manuscript submission, and others will make more sense to consider for future releases.

TWAS

Individual

  • normal prior with SuSiE
  • mr.ash
  • elastic net with glmnet
  • LASSO with glmnet
  • Bayesian alphabet from qbayes
  • Rcpp wrapper for Dirichlet process regression (manuscript, GitHub)
  • MCP and SCAD from ncvreg
  • L0Learn from L0Learn
  • BayesB and Bayesian Lasso from BGLR

Summary

  • normal prior with SuSiE
  • mr.ash
  • Bayesian alphabet from qbayes
  • Rcpp reimplementation of existing summary-based PRS-cs (see links under "Longer term" below)
  • Rcpp wrapper for summary-based Dirichlet process regression (manuscript, GitHub)
  • lassosum for LASSO and elastic net (manuscript, GitHub)
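
For context on how weights from any of the summary-level methods above would be consumed, here is a minimal sketch of the usual summary-statistic TWAS test, z = w'z_gwas / sqrt(w'Rw), where w are the expression weights, z_gwas the GWAS z-scores and R the LD matrix. The function name and toy values are hypothetical and are not part of this package.

```python
import math


def twas_z(weights, gwas_z, ld):
    """Summary-level TWAS z-score: (w' z) / sqrt(w' R w).

    weights: per-variant expression weights (from any method listed above)
    gwas_z:  per-variant GWAS z-scores, same variant order as weights
    ld:      LD (correlation) matrix as a list of lists, same variant order
    """
    n = len(weights)
    if len(gwas_z) != n or len(ld) != n or any(len(row) != n for row in ld):
        raise ValueError("weights, gwas_z and ld must have matching dimensions")
    numerator = sum(w * z for w, z in zip(weights, gwas_z))
    variance = sum(
        weights[i] * ld[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    if variance <= 0:
        raise ValueError("w' R w must be positive (are all weights zero?)")
    return numerator / math.sqrt(variance)


# Toy example with three variants (made-up numbers).
toy_weights = [0.5, 0.0, -0.2]
toy_gwas_z = [3.0, 1.0, -2.0]
toy_ld = [
    [1.0, 0.3, 0.0],
    [0.3, 1.0, 0.1],
    [0.0, 0.1, 1.0],
]
toy_result = twas_z(toy_weights, toy_gwas_z, toy_ld)
```

Because the statistic only needs w, z_gwas and R, any of the weight methods in the lists above can be swapped in without changing this step.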

Longer term

  • Determine if the continuous shrinkage prior in PRS-cs (manuscript, GitHub) can be extended from summary statistics to individual-level data, and implement in Rcpp.
  • explore feasibility of ncvreg, L0Learn, and BGLR for summary data - might be a lot of work for little gain if mr.ash generalizes all of these
  • Extension to genome-wide TWAS (this will be a separate manuscript) - see discussion about genome-wide extension for MR and polygenic risk scores.
  • Extend mr.ash to work with other ebnm priors - deconvolveR is most interesting because it is a smooth approximation of NPMLE instead of a scale mixture of normals

Mendelian randomization

  • Egger regression as an additional horizontal pleiotropy test to complement heterogeneity tests - only useful with enough independent instruments
  • "Omnigenic model" that incorporates all variants as instruments (EDIT: verify that this is not already how we are doing MR). This will be a separate manuscript, possibly in combination with the trans-QTL extension. It is inspired by [OMR](https://academic.oup.com/bib/article/22/6/bbab322/6347949), and could exploit the fact that SuSiE gives us posterior effect sizes and standard errors, unlike most other fine mapping methods.
    • How will this method handle weak instrument bias without removing variants - consider debiasing estimators like dIVW and pIVW - does OMR have this issue too?
    • Should show that SuSiE does a comparable job of adjusting for LD to LD scores (used by OMR and MRAID) and to the variant selection methods used by MR.LDP, MR-Corr2 and MR-CUE.
    • Are the heterogeneity tests and Egger regression still valid for testing for horizontal pleiotropy in the presence of so many weak instruments?
  • Extension to genome-wide analysis with trans-QTLs (this will definitely be a separate manuscript):
    • Proper handling of correlated horizontal pleiotropy (CHP) is critical. The most conservative existing approach is to just remove pleiotropic variants - other solutions are provided by cause, MRAID, MR-Corr2, MR-CUE and MRcML. See also this review, which does not include some of the more recent methods but does discuss CHP.
    • Can we estimate CHP in trans-QTLs by looking at the effect of the same variant across all tested molecular traits?
    • Existing methods for handling CHP do not seem to use empirical Bayes methods - can we use SuSiE to help us do this?
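
To make the Egger item concrete, here is a rough sketch of the inverse-variance weighted (IVW) estimate next to an MR-Egger fit. Both are weighted regressions of outcome effects on exposure effects; Egger adds an intercept, and a non-zero intercept suggests directional horizontal pleiotropy. This only computes point estimates (no standard errors or tests), and all function names and numbers are hypothetical.

```python
def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """IVW causal estimate: weighted regression through the origin."""
    weights = [1.0 / s ** 2 for s in se_outcome]
    num = sum(w * x * y for w, x, y in zip(weights, beta_exposure, beta_outcome))
    den = sum(w * x * x for w, x in zip(weights, beta_exposure))
    return num / den


def egger_estimate(beta_exposure, beta_outcome, se_outcome):
    """MR-Egger point estimates: returns (intercept, slope).

    Variants are first oriented so that every exposure effect is positive,
    which is the usual convention for Egger regression.
    """
    x = [abs(b) for b in beta_exposure]
    y = [by if bx >= 0 else -by for bx, by in zip(beta_exposure, beta_outcome)]
    w = [1.0 / s ** 2 for s in se_outcome]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    if sxx == 0:
        raise ValueError("exposure effects must vary across instruments")
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Toy instruments: a true causal effect of 0.5 plus a constant pleiotropic
# shift of 0.02 on the outcome (made-up numbers).
toy_bx = [0.1, 0.2, 0.3, 0.4]
toy_by = [0.02 + 0.5 * b for b in toy_bx]
toy_se = [0.01, 0.02, 0.01, 0.02]
toy_ivw = ivw_estimate(toy_bx, toy_by, toy_se)
toy_intercept, toy_slope = egger_estimate(toy_bx, toy_by, toy_se)
```

In the toy data the Egger slope recovers 0.5 and the intercept absorbs the 0.02 shift, while IVW is biased upward because it forces the line through the origin. With only a handful of instruments the intercept is poorly determined, which is the caveat noted in the Egger item.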

Colocalization

  • Alternative model of colocalization (this will definitely be a separate manuscript) - can we treat gene-level colocalization as a Kullback-Leibler divergence between two multivariate normal distributions? Can we penalize this divergence for LD using the entropy of the distribution of the (top or all?) eigenvalues of the LD matrix? How does this compare to correlating PIPs?
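
For reference, the divergence between two k-dimensional multivariate normals N_0 = N(mu_0, Sigma_0) and N_1 = N(mu_1, Sigma_1) has a closed form, shown first below. The second expression is only one possible reading of "entropy of the eigenvalue distribution": normalize the LD eigenvalues lambda_i to sum to one and take their Shannon entropy. That reading, and how the two quantities would be combined into a penalty, are assumptions and open questions rather than a settled design.

```latex
D_{\mathrm{KL}}\left(\mathcal{N}_0 \,\|\, \mathcal{N}_1\right)
  = \frac{1}{2}\left[
      \operatorname{tr}\left(\Sigma_1^{-1}\Sigma_0\right)
      + (\mu_1 - \mu_0)^{\top}\Sigma_1^{-1}(\mu_1 - \mu_0)
      - k
      + \ln\frac{\det\Sigma_1}{\det\Sigma_0}
    \right]

H(\lambda) = -\sum_{i=1}^{k} p_i \ln p_i,
\qquad p_i = \frac{\lambda_i}{\sum_{j=1}^{k}\lambda_j}
```

Note that the divergence is not symmetric in its two arguments, so a gene-level measure would need either a fixed direction (for example QTL to GWAS) or a symmetrized version.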

Polygenic molecular risk scores (PMRS)

  • The SuSiE model is ready-made for predicting molecular traits - make it easy to do predictions from new genotype data
    • Genome-wide prediction will have the same concerns about CHP - this doesn't matter for predicting traits from PMRS, but does matter for model interpretation
    • Could also use SuSiE to predict traits from PMRS - similar idea to CTWAS, definitely a separate manuscript and probably a separate package, would want to extend to survival models.
  • mr.ash and other penalized regression methods can be used for prediction for genome-wide TWAS but not MR, because penalized regression doesn't produce valid standard errors
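
As a sketch of what "predictions from new genotype data" could look like, the snippet below scores individuals as a weighted sum of dosages, using per-variant posterior mean effects keyed by variant ID. Variants missing from the new data are skipped and reported so the caller can judge coverage. The function name, the dict-based input layout and the toy values are all hypothetical.

```python
def predict_scores(dosages, effects, intercept=0.0):
    """Score each individual as intercept + sum(dosage * effect).

    dosages: list of dicts mapping variant ID -> dosage, one dict per person
    effects: dict mapping variant ID -> posterior mean effect size
    Returns (scores, missing), where missing is the sorted list of variants
    with an effect estimate that are absent from at least one person.
    """
    scores = []
    missing = set()
    for person in dosages:
        score = intercept
        for variant_id, beta in effects.items():
            if variant_id in person:
                score += person[variant_id] * beta
            else:
                missing.add(variant_id)
        scores.append(score)
    return scores, sorted(missing)


# Toy example: two variants with effects, three people, one missing genotype.
toy_effects = {"rs1": 0.5, "rs2": -0.25}
toy_dosages = [
    {"rs1": 2, "rs2": 1},
    {"rs1": 0, "rs2": 2},
    {"rs1": 1},
]
toy_scores, toy_missing = predict_scores(toy_dosages, toy_effects)
```

Skipping a missing variant is equivalent to treating its dosage as zero; a real implementation would more likely impute it or center dosages on the training means.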

Interfaces with other packages

  • mvsusier/mvsusiF
    • Straightforward for integrating with TWAS and MR - we use just the posterior effect size estimates as we normally do
    • Challenging for colocalization - colocBoost is the current solution, hopefully we can figure something out here later
  • susiF
  • vignette for INTACT
  • vignette for CTWAS - currently challenging to run CTWAS

Other

  • Easy approach to adjust fine mapping to remove variants that were not tested in the GWAS but were tested in the QTL - this doesn't work for TWAS with penalized regression!
  • Vignette on imputing GWAS summary statistics (and QTL summary statistics if not using individual level QTL data) - this would ideally be tied to future efforts to improve this approach methodologically.
  • Data package for LD blocks for GWAS fine mapping
    • What about windows for QTL summary stats?
    • Could pre-computed LD windows be stored on a queryable server? Alternatively, could we download a 1000 Genomes population as a reference and compute the LD matrix for the user?
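
For the "compute the LD matrix for the user" option, the core computation is just the pairwise Pearson correlation of dosages across a reference panel. A minimal sketch follows, assuming genotypes have already been loaded as a list of per-individual dosage rows. The function name and toy panel are hypothetical, and real data would need a proper genotype reader and a faster linear algebra backend.

```python
import math


def ld_matrix(genotypes):
    """Pairwise Pearson correlation (r) between variant dosage columns.

    genotypes: list of rows, one per individual, each a list of dosages
    Returns a symmetric list-of-lists matrix; entries involving a
    monomorphic variant are NaN because its correlation is undefined.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    centered = []
    for j in range(m):
        column = [row[j] for row in genotypes]
        mean = sum(column) / n
        centered.append([value - mean for value in column])
    norms = [math.sqrt(sum(v * v for v in col)) for col in centered]
    ld = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i, m):
            if norms[i] == 0 or norms[j] == 0:
                r = float("nan")
            else:
                dot = sum(a * b for a, b in zip(centered[i], centered[j]))
                r = dot / (norms[i] * norms[j])
            ld[i][j] = r
            ld[j][i] = r
    return ld


# Toy panel: variant 2 duplicates variant 1, variant 3 is its mirror image.
toy_panel = [
    [0, 0, 2],
    [1, 1, 1],
    [2, 2, 0],
    [1, 1, 1],
]
toy_ld = ld_matrix(toy_panel)
```

The cost grows quadratically with the number of variants in a window, which is part of the trade-off against serving pre-computed blocks.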
