We made some standard points about not accepting the null hypothesis (in particular, not stressing comparisons between significant and non-significant results).
We were pretty uncertain about a good, practical way to compare the strength of effects on different response variables.
BB and JD met afterwards and looked at multivariate linear models as a possible solution (for the case without link functions). We found an article from Fox et al. that lays out the underlying theory nicely. It also shows good examples of practice, but not for what we want (i.e., not for comparing between different response variables). In fact, it's not even clear whether the theory laid out there supports this comparison.
Assuming we can deal appropriately with MANOVA-like correlation, we still have a variety of possible approaches to dealing with Max's question.
The question is: how to compare the strength of effects on different outcomes?
The first step is to normalize the outcomes before feeding them to the model. The appropriate approach in this case is probably normalizing to unit variance.
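As a minimal sketch of the normalization step, here is one way to scale each outcome to zero mean and unit variance in plain Python (not the R workflow discussed at the meeting). The outcome names and values are made up for illustration.

```python
from statistics import mean, stdev

def standardize(values):
    """Centre a list of numbers and scale it to unit (sample) variance."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    if s == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(v - m) / s for v in values]

# Hypothetical outcomes measured on very different scales.
outcomes = {
    "outcome_a": [12.1, 13.4, 11.8, 14.0, 12.9],
    "outcome_b": [0.31, 0.35, 0.30, 0.40, 0.33],
}

scaled = {name: standardize(vals) for name, vals in outcomes.items()}

for name, vals in scaled.items():
    print(name, [round(v, 2) for v in vals])
```

After this step a coefficient for any outcome is in units of that outcome's standard deviation, which is what makes coefficients for different outcomes roughly comparable.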
Beyond that, we have several choices:
- Do 4C2 = 6 pairwise comparisons and do some sort of multiple comparisons correction
- Aim for an overall P value that targets differences in the coefficients. This is probably best done by looking at the null hypothesis that all are the same as the mean coefficient.
- ???
JD: I'm kind of leaning towards the second, because I think there's a relatively easy and robust way to do it (by fitting to some sort of residuals)
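For the first option (all pairwise comparisons plus a correction), a rough sketch in plain Python might look like the following. The Holm step-down adjustment is just one possible choice of correction; the outcome labels and raw p-values are invented placeholders, not results from any analysis.

```python
from itertools import combinations

def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values, in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, ...
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

# Four hypothetical outcomes give 4C2 = 6 pairwise comparisons.
outcomes = ["y1", "y2", "y3", "y4"]
pairs = list(combinations(outcomes, 2))

# Placeholder raw p-values, one per pair (assumed, for illustration only).
raw_p = [0.001, 0.04, 0.03, 0.20, 0.012, 0.50]
adj_p = holm_adjust(raw_p)

for (a, b), p, q in zip(pairs, raw_p, adj_p):
    print(f"{a} vs {b}: raw p = {p:.3f}, Holm-adjusted p = {q:.3f}")
```

This only handles the correction step; getting valid pairwise p-values in the presence of correlated responses is the harder problem discussed above.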
Sept 11th, 2020 (MF presenting on wanting to start some multivariate analysis on a series of morphometric data)
- Suggested to MF to consider the dimensions (length vs. area vs. volume/mass) of variables and also scale before using these together.
- It is common to log transform variables (in common dimensions) before running the PCA. Check whether PC1 is mostly size (all loadings of the same sign and approximately equal in magnitude, and whether it strongly correlates with length or mass).
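The "is PC1 mostly size?" check can be sketched as follows, in plain Python with simulated data. Everything here is an assumption for illustration: the three trait names, the simulated measurements (all lengths, so they share a dimension), and the use of power iteration to get the first principal component instead of a full PCA routine.

```python
import math
import random

def covariance_matrix(centered):
    """Sample covariance matrix of already-centred rows."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
            for a in range(p)]

def first_pc(cov, iterations=200):
    """Dominant eigenvector of cov by power iteration (unit length)."""
    p = len(cov)
    v = [1.0] + [0.0] * (p - 1)
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Simulated individuals: a shared "size" factor times trait-specific noise.
rng = random.Random(42)
traits = ["body_length", "head_length", "fin_length"]  # hypothetical traits
raw = []
for _ in range(200):
    size = math.exp(rng.gauss(0.0, 0.3))
    raw.append([size * scale * math.exp(rng.gauss(0.0, 0.05))
                for scale in (50.0, 12.0, 8.0)])

# Log transform, centre each column, then extract PC1.
logged = [[math.log(v) for v in row] for row in raw]
means = [sum(row[k] for row in logged) / len(logged) for k in range(len(traits))]
centered = [[v - m for v, m in zip(row, means)] for row in logged]
loadings = first_pc(covariance_matrix(centered))

# PC1 scores and their correlation with log body length.
scores = [sum(l * v for l, v in zip(loadings, row)) for row in centered]
size_corr = pearson(scores, [row[0] for row in logged])

print("PC1 loadings:", dict(zip(traits, (round(l, 3) for l in loadings))))
print("correlation of PC1 with log body length:", round(size_corr, 3))
```

If PC1 is essentially size, the loadings should share a sign and be similar in magnitude, and the PC1 scores should correlate strongly with a size measure such as log length.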
Output from chat window
13:08:32 From ID : Some basics of fitting multivariate linear models and multivariate mixed models in this tutorial
13:08:34 From ID : https://mac-theobio.github.io/QMEE/Multivariate_responses.html
13:09:45 From BB : To (possibly) make life harder: can you say a little bit more about experimental design details, i.e. how are fish raised in batches? Presumably every egg doesn't have separate temperature control
13:12:08 From BB : so 100% pseudoreplicated ...
13:12:50 From RL(he/him) : Hmmm, I'm not thinking we should NOW adjust the covariate. I still get mixed up about direct and total effects; here we want to compare treatments after adjusting for hatch date, so that'd be using covariate in the standard way.
13:13:12 From BB : +1 to RL
13:14:27 From ID : https://github.com/DworkinLab/VirtualMorphoMeetup/blob/master/VirtualMorphMeet_April29_2020.md. this one is for using pairwise in RRPP (an extension of permmanova from adonis) that allows much more flexibility for multivariate linear models, including pulling out coefficients easily, R^2 etc..
13:15:24 From BB : Consider looking at other PCs, e.g. PC1 vs PC3? I know smaller PCs are more likely to be just noise, but it's also a little scary that 95% of the time people look only at PC1 vs PC2 ...
13:17:58 From BB: Josh Starmer, youtube, statsquest
13:18:43 From BB: if you're interested in linear algebra per se (i.e. what's an eigenvector), look for 3blue1brown (youtube???)
13:19:50 From ML : Cool. Excellent data-lunch to start off the term!
13:20:32 From ML : Have a good weekend!
13:20:38 From RL (he/him) : Enjoyable - thanks
13:20:39 From BB : bye!
- We discussed using (instead of sum-to-zero or treatment contrasts) the successive difference contrast coding, which is available in the MASS package as contr.sdif(). See here
- We discussed the difference between transformations of the response data (like a log transformation of raw responses) and the use of link functions. Here is a useful link discussing the difference.
- We learned that the plot.merMod() method takes a col= argument, which allows you to colour the plot by a column of your data.
- We learned that the ranef() method of lme4 model objects can return a data frame for use with ggplot2.
- We discussed how to get contrasts of various kinds out of a binomial model using emmeans. Here is a link to the relevant place in the documentation.
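To illustrate what the successive difference contrast coding mentioned above does, here is a small Python sketch that builds the same kind of contrast matrix by hand and checks its interpretation. It is a reimplementation for illustration, not the MASS code, and the level means are invented numbers.

```python
def contr_sdif(n):
    """Successive-difference contrast matrix: n rows (levels), n - 1 columns."""
    if n < 2:
        raise ValueError("need at least two levels")
    # Column j compares level j + 1 with level j (1-based indexing).
    return [[-(n - j) / n if i <= j else j / n for j in range(1, n)]
            for i in range(1, n + 1)]

# Hypothetical cell means for a four-level factor.
level_means = [10.0, 12.0, 15.0, 19.0]
n = len(level_means)
contrasts = contr_sdif(n)

# Under this coding the intercept is the mean of the level means and each
# coefficient is the difference between one level and the previous one.
intercept = sum(level_means) / n
coefs = [level_means[k + 1] - level_means[k] for k in range(n - 1)]

# Reconstruct the level means from intercept + contrasts * coefficients.
rebuilt = [intercept + sum(c * b for c, b in zip(row, coefs))
           for row in contrasts]

for row in contrasts:
    print(row)
print("intercept:", intercept, "coefficients:", coefs)
print("rebuilt level means:", rebuilt)
```

The point of the coding is that each fitted coefficient is directly readable as "level k+1 minus level k", which is often what you want for ordered factors.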
Packages mentioned:
- The MCMCglmm() function in the MCMCglmm package allows you to fit multivariate mixed models, which may be useful if you want to fit study as a random effect (if you cannot aggregate measures within study). The plotsubspace() function may be useful for visual inspection of whether the covariance structures across your treatments are similar. You can also model separate residual covariance structures in MCMCglmm.
- Someone mentioned the adonis() function in vegan. It allows distance-based 'MANOVA'-like models, assessing uncertainty with resampling.
- The geomorph package, while specifically designed for geometric morphometrics (shape), has a wide variety of functions for multivariate analysis. They have lots of information on their website. They also have a couple of blog posts that I thought would be useful to you:
- I have absolutely no idea if they are useful, but I do know that there are some packages designed for neuroimaging data.
- For some of the resampling approaches (now mostly included in geomorph) we have written our own custom functions. It may be easiest to speak directly so I can get you the right source files, but various iterations of these are in our source scripts and can be found on github or dryad:
Googling "R isosurface" gets the misc3d package (which is what I had in mind) and http://www.jstatsoft.org/v28/i01/paper, which refers to the contour3d function in the misc3d package. It even has a brain PET scan example.
B.B. also mentioned the paper by Paul Murtaugh, "Simplicity and complexity in ecological data analysis".
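To make the resampling idea behind distance-based 'MANOVA'-like tests (as in the adonis() entry in the package list) a bit more concrete, here is a rough Python sketch of a permutation test on a pseudo-F statistic computed from pairwise distances. It is a simplified illustration, not vegan's implementation: it assumes Euclidean distance, a single grouping factor, unrestricted permutation of labels, and simulated two-group data.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def pseudo_f(points, labels):
    """Pseudo-F from pairwise squared distances for one grouping factor."""
    n = len(points)
    groups = sorted(set(labels))
    ss_total = sum(sq_dist(points[i], points[j])
                   for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(sq_dist(points[i], points[j])
                         for k, i in enumerate(idx) for j in idx[k + 1:]) / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (len(groups) - 1)) / (ss_within / (n - len(groups)))

def permutation_test(points, labels, n_perm=999, seed=1):
    """Observed pseudo-F and a permutation p-value from shuffled labels."""
    rng = random.Random(seed)
    observed = pseudo_f(points, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(points, shuffled) >= observed:
            hits += 1
    # Count the observed labelling as one of the permutations.
    return observed, (hits + 1) / (n_perm + 1)

# Simulated bivariate responses for two well-separated treatment groups.
data_rng = random.Random(7)
points = ([[data_rng.gauss(0.0, 0.5), data_rng.gauss(0.0, 0.5)] for _ in range(10)]
          + [[data_rng.gauss(3.0, 0.5), data_rng.gauss(3.0, 0.5)] for _ in range(10)])
labels = ["A"] * 10 + ["B"] * 10

f_obs, p_value = permutation_test(points, labels)
print("pseudo-F:", round(f_obs, 2), "permutation p-value:", p_value)
```

Note that plain label shuffling like this ignores any batch or blocking structure; with the kind of pseudoreplication raised in the chat, the permutations would need to respect the experimental units.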