Errors I found in code and suggestions for further improvement #2

@qdread

Hi John,

I ran all the R code in the R_shell_code folder. I was able to reproduce almost all the output, and when I visually compared the figures I generated with the figures in the manuscript, they looked identical. I fixed a few small errors and inconsistencies myself and opened pull request #1 with those fixes. Below I list (1) some errors I was unable to fix but that should be quick fixes, (2) some stylistic recommendations for the code, and (3) some things I noticed about the stats. -Quentin

(1) Small errors that can be fixed easily

  • In script 5, qq_df does not exist when it is called on about line 507, so none of the later code involving the QQ plots runs.
  • In script 11, many of the calls to plot_temp_col() return an error because newdata is not found.
  • In script 11, many of the calls to plot_trial_init() return an error because new_scale_color is not found. (I just now remembered this is a function in ggnewscale so maybe that is what you are calling there and just didn't load the package?)
  • In script 11, when capture_data.RData is loaded, it contains the variable path which points to users/jgrady/something... and this overwrites the variable path which the user might have defined as pointing to the data path on their file system. So, code that later tries to import data from path will fail. You should probably re-save the .RData object without the path variable in it so that this doesn't happen.
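
For the last point, here is a minimal sketch of one way to avoid the overwrite and to re-save the file without path. It uses only base R. The file name capture_data_demo.RData, the placeholder data frame, and both path strings are made up for illustration; they are not the contents of the real capture_data.RData.

```r
# Hypothetical stand-in for capture_data.RData: a file that stores a stray
# `path` variable alongside the real data object.
rdata_file <- file.path(tempdir(), "capture_data_demo.RData")
local({
  path <- "users/someone/project"       # stray variable saved by accident (made up)
  capture_data <- data.frame(id = 1:3)  # placeholder data (made up)
  save(path, capture_data, file = rdata_file)
})

# The user's own data path, defined before the file is loaded.
path <- "my/local/data"

# Load into a separate environment so nothing in the workspace is overwritten.
loaded <- new.env()
load(rdata_file, envir = loaded)
capture_data <- loaded$capture_data

# Re-save without `path` so that a plain load() is safe in future.
rm(list = "path", envir = loaded)
save(list = ls(loaded), envir = loaded, file = rdata_file)
```

After this, the user's path is unchanged and the re-saved file contains only capture_data. Saving single objects with saveRDS() and reading them with readRDS() would sidestep the problem entirely, since that pair never restores variable names into the workspace.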

(2) Stylistic recommendations for the code

  • Scripts that define functions should not also call functions and manipulate data. So I think script 3 should be split into one script that defines the functions and another that sets up the spatial data objects.
  • Numbers should indicate the order in which scripts should be run by the user. If some scripts are not intended to be run by the user, don't give them numbers. So, scripts 3 and 7 that are sourced as part of the "pipeline" should be taken out of the consecutive numbering.

(3) Potential issues with the statistics

  • Many of the models fit in script 10 are "singular", meaning there is not enough information in the data to estimate the random effect variances. The recommended solution is to use a Bayesian model with priors on the variance components. Alternatively, you can use glmmTMB to fit a frequentist model with priors.
  • Model selection by fitting every possible combination of parameters is not really the best approach. I would just fit the full model, or a few judiciously chosen submodels, and make inference on the different coefficient estimates.
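
To see which fits are affected, a check along these lines may help. It is only a sketch and assumes the script 10 models are fit with lme4, which I am inferring from the "singular" message. The model below is a stand-in that uses lme4's bundled sleepstudy data, not one of the models from script 10.

```r
library(lme4)  # assumption: the script 10 models are lme4 fits

# Stand-in model on lme4's bundled sleepstudy data, not a model from script 10.
fit <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

# TRUE means at least one variance component is estimated on the boundary
# (a variance near zero or a correlation near +/-1).
isSingular(fit)

# Inspect the estimated random-effect standard deviations and correlations
# to see which term is responsible.
VarCorr(fit)
```

Running isSingular() over the list of fitted models would show how many of them need the prior-based approach.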
