
Support to make use of third party R-scripts #14

@wahani


In #13 and #9 we see the use case that authors use modules to safely load/source code from a third party (a script written by someone else). This third-party code needs to be integrated into a local R session without breaking anything. With base::library and an arbitrary number of scripts and authors this is not so straightforward due to naming conflicts. These conflicts have two sources:

  • local variables which are named the same in different scripts, e.g. data and file. This is easy to fix with source(..., local = new.env()) and/or modules.
  • naming conflicts from attaching packages, e.g. stats::lag and dplyr::lag. To solve this we have to find a hack (like in the issues above) or fix it manually by replacing calls to library with modules::import.
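
The first kind of conflict can be illustrated with base R alone. The following is a minimal sketch; the two temporary scripts and the names env_a and env_b are made up for the illustration and do not come from the issues referenced above.

```r
# Two throwaway scripts that both define `data` and `file`.
script_a <- tempfile(fileext = ".R")
script_b <- tempfile(fileext = ".R")
writeLines(c("file <- 'a.csv'", "data <- 1:3"), script_a)
writeLines(c("file <- 'b.csv'", "data <- c('x', 'y')"), script_b)

# Each script is evaluated in its own environment, so neither
# overwrites the variables of the other (or of the caller).
env_a <- new.env()
env_b <- new.env()
source(script_a, local = env_a)
source(script_b, local = env_b)

env_a$file
env_b$file
```

Note that this only isolates plain assignments. A call to library inside either script would still change the search path.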

In this scenario the authors of the third-party code are not necessarily aware of the modules package. In an ideal world, a solution would look as simple as:

m <- modules::use(thirdPartyScript, safeSource = TRUE)

This will source thirdPartyScript and prevent changes to the search path and maybe the workspace. Some thoughts and mostly problems with potential solutions:

  • It is possible to mask library with modules::import and hope that this works. library has more than one argument, so it is not as simple as that, but it may work for a lot of scenarios; until it doesn't. A different problem we would need to solve is that there are still some packages that depend on other packages, i.e. we load package A with library but packages B and C are also attached to the search path. This was the mechanism prior to namespaces. Masking library also does not cover require, which is often used instead of library.
  • Changes to the workspace can only be prevented when we compile the module in a background process. There are so many ways to mess up the workspace that anything else would be extremely tedious. E.g. a script may change the workspace by calling assign on .GlobalEnv, by sourcing some other script with source, by calling load, and so on. And still, code that relies on a reference like .GlobalEnv would break, because these values would only exist in the background process and would be lost.
  • Users will expect that packages like stats are simply available. This, at least, we can solve easily by providing a helper that uses stats as the top enclosing environment of a module.
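
To make the masking idea from the first point concrete, here is a rough sketch in base R. The helper make_masked_env is hypothetical and not part of the modules package. It uses requireNamespace instead of modules::import and ignores all other arguments of library and require, so it shares the limitations described above. Unlike the real library, the mask returns FALSE instead of raising an error when a package is missing.

```r
# Hypothetical helper: build an environment in which library() and
# require() load a namespace without attaching it to the search path.
make_masked_env <- function(parent = baseenv()) {
  imports <- new.env(parent = parent)  # holds the masks and imported objects
  env <- new.env(parent = imports)     # receives the script's own objects

  load_only <- function(package, ..., character.only = FALSE) {
    if (!character.only) package <- as.character(substitute(package))
    ok <- requireNamespace(package, quietly = TRUE)
    if (ok) {
      # Copy the exported objects next to the masks, not into `env`,
      # so they do not mix with what the script defines itself.
      for (nm in getNamespaceExports(package)) {
        assign(nm, getExportedValue(package, nm), envir = imports)
      }
    }
    invisible(ok)
  }

  imports$library <- load_only
  imports$require <- load_only
  env
}

# Usage: a script that calls library(tools) leaves search() untouched.
script <- tempfile(fileext = ".R")
writeLines(c("library(tools)", "ext <- file_ext('report.csv')"), script)
path_before <- search()
env <- make_masked_env()
source(script, local = env)
identical(search(), path_before)
```

Packages that rely on their dependencies being attached, as described in the first point, would still fail with this approach.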

Solving these issues reliably requires quite some development time. Still, an 80% solution is straightforward. For the moment we may simply document this 80% solution and leave it to the users to adapt it to their use cases. We can also reference this issue from the docs so that there is a point of reference for discussion and information.
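
The issue does not spell out what the 80% solution looks like. One possible reading, offered here purely as an assumption, is a helper that sources the script into a fresh environment and afterwards detaches whatever the script attached. The name source_isolated is hypothetical and not part of the modules package.

```r
# Hypothetical helper: source a script into its own environment and
# restore the search path afterwards.
source_isolated <- function(path) {
  before <- search()
  # Undo any library()/require()/attach() calls made by the script.
  on.exit({
    for (entry in setdiff(search(), before)) {
      detach(entry, character.only = TRUE)
    }
  }, add = TRUE)
  env <- new.env(parent = globalenv())
  source(path, local = env)
  as.list(env)
}

# Usage with a throwaway script that attaches the tools package.
script <- tempfile(fileext = ".R")
writeLines(c("library(tools)", "ext <- file_ext('report.csv')"), script)
path_before <- search()
res <- source_isolated(script)
res$ext
```

This covers values computed while the script runs. Functions defined in the script that look up objects from a package attached by the script will stop working once that package is detached, and writes to .GlobalEnv are not prevented at all.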
