
Adding a new model for a new task

Jethro Lee edited this page May 2, 2019 · 19 revisions

0. Outline

  1. Common-files ─ hBayesDM-models:
    • Provide example data for the new task
    • Write JSON file
    • Write Stan file
  2. R-related ─ hBayesDM:
    • Write R code for your new model
    • Write documentation
    • Add function to plotting.R
    • Run roxygenize()
  3. Python-related ─ hBayesDM-py:
    • Write preprocess_func for the new task
    • Auto-generate python file

1. Common-files ─ hBayesDM-models

In this section, you will write the common-files for your new model and commit them to hBayesDM-models.

Provide example data for the new task

You only have to do this once per new task, whereas JSON and Stan files need to be written for each new model.

Start off by isolating the data columns you need for the modeling. Remove all the other columns in the example file, and change the names of the columns you will use to something representative but short.

Of course, because this is a hierarchical Bayesian modeling package that fits model-parameters of multiple subjects, you will always need a reserved column to specify the subject's ID for each row of data. Make sure to name this data column subjID. (This is the convention we use in hBayesDM.)

Currently, hBayesDM requires that the example file (and user data) follow a tab-separated format.

Refer to the other example files in directory hBayesDM-models/extdata/ for help.
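
As a rough illustration of the steps above, the following Python sketch keeps only the needed columns, renames them (with the subject column becoming subjID), and produces tab-separated text. The raw column names, the short names, and the data values are all made up for this example; only subjID, the tab-separated format, and the <task-name>_exampleData.txt naming pattern used in this section come from this page.

```python
import csv
import io

# Hypothetical raw export: column names and values are invented for illustration.
RAW_TSV = (
    "participant\tstimulus_type\tkey_pressed\tfeedback\tsession_notes\n"
    "1\t1\t1\t1\tok\n"
    "1\t2\t0\t0\tok\n"
    "2\t1\t1\t-1\tlate start\n"
)

# Raw column name -> short name. Columns not listed here are dropped.
# The subject-ID column must end up as "subjID" (the hBayesDM convention).
RENAME = {
    "participant": "subjID",
    "stimulus_type": "cue",
    "key_pressed": "keyPressed",
    "feedback": "outcome",
}


def to_example_data(raw_text, rename):
    """Return tab-separated text containing only the renamed columns."""
    reader = csv.DictReader(io.StringIO(raw_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(rename.values())  # header row with the short names
    for row in reader:
        writer.writerow(row[old] for old in rename)
    return out.getvalue()


def example_file_name(task_name):
    """Build the file name following the <task-name>_exampleData.txt pattern."""
    return f"{task_name}_exampleData.txt"


cleaned = to_example_data(RAW_TSV, RENAME)
print(example_file_name("gng"))
print(cleaned)
```

To write the result to disk, open the file named by example_file_name(...) in text mode and write cleaned to it.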

Name the example data file for your new task using the following format:

<task-name>_exampleData.txt

Then commit it to the same directory: hBayesDM-models/extdata/

cd hBayesDM-models
cd extdata

git add <task-name>_exampleData.txt
git commit -m "Your commit message here"

For example:

git add gng_exampleData.txt
git commit -m "Commit example file for gng task"

Write JSON file

JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate. https://www.json.org/

The JSON file for your model is where you specify the meaningful names (and some meaningful values) you intend to use in the code to come: the name of the task, the name of the model, the names of the model's parameters, and their ranges. These identifiers (variable names) need to be used consistently throughout the Stan/R/Python code, both for the code to function correctly and for it to be readable. Think of the JSON file as a quick summary or specification of your model.

In fact, we're in the process of writing scripts that automatically generate R/Python code (which you previously would have had to write yourself) directly from the specified JSON file. This "automated code-writer" has already been implemented for the Python version of hBayesDM, and is under development for R.
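
To give a feel for what such a specification might hold, here is a minimal Python sketch that builds a model summary, checks it for basic consistency, and serializes it to JSON. The key names (task_name, model_name, data_columns, parameters, lower, upper) and the task, model, and parameter names are assumptions made for this illustration, not the actual hBayesDM-models schema; consult the existing JSON files in the repository for the real layout.

```python
import json

# Hypothetical specification: every key and value below is illustrative only.
spec = {
    "task_name": "mytask",
    "model_name": "mymodel",
    "data_columns": ["subjID", "choice", "outcome"],
    "parameters": {
        "alpha": {"lower": 0.0, "upper": 1.0},
        "beta": {"lower": 0.0, "upper": 10.0},
    },
}


def check_spec(spec):
    """Return a list of human-readable problems found in the specification."""
    problems = []
    if "subjID" not in spec["data_columns"]:
        problems.append("data_columns must include subjID")
    for name, bounds in spec["parameters"].items():
        if bounds["lower"] >= bounds["upper"]:
            problems.append(f"{name}: lower bound must be below upper bound")
    return problems


text = json.dumps(spec, indent=2)
print(text)
print(check_spec(spec))
```

Keeping the names in one place like this is what lets the same identifiers be reused, unchanged, in the Stan, R, and Python code.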

Write Stan file

Stan code is the workhorse of the sampling and model-fitting in hBayesDM.

For documentation and reference manuals to Stan, the probabilistic programming language, see:
https://mc-stan.org/users/documentation/

Your (newly written) Stan file should be committed to the directory: hBayesDM-models/stan_files/

cd hBayesDM-models
cd stan_files

git add <your-model-name>.stan
git commit -m "Your commit message here"

2. R-related ─ hBayesDM

In this section, you will create/update the necessary files for the R package hBayesDM.

Write R code for your new model

Write documentation

Add function to plotting.R

Run roxygen2::roxygenize()

roxygen2 helps developers with the petty details of managing an R package.
Once you've completed all the steps up to now, run roxygenize() using one of the following methods:

Using RStudio (recommended)

Open the hBayesDM project via the hBayesDM.Rproj file in the repo directory, and then run:

> roxygen2::roxygenize()

Using console-R

cd hBayesDM
R
# this opens console-R

and then run:

> roxygen2::roxygenize()

Directly from terminal

cd hBayesDM
R -e 'roxygen2::roxygenize()'
# opens & executes in one go

If roxygen2::roxygenize() gives an error like the following:

Error in getDLLRegisteredRoutines.DLLInfo(dll, addNames = FALSE) : 
  must specify DLL via a “DLLInfo” object. See getLoadedDLLs()

Run this command first, then try again:

pkgbuild::compile_dll()

After running roxygenize() make sure that:

  • roxygenize() has not returned any errors.
  • The DESCRIPTION file has been updated to include your new model.
  • The NAMESPACE file has been updated to include your new model.
  • A new file man/<your-model-name>.Rd has been created.

The output messages from running roxygenize() will indicate each of these items.
You can also check for the latter three using git operations on your terminal:

$ git status

Changes not staged for commit:

        modified:   DESCRIPTION
        modified:   NAMESPACE

Untracked files:

        man/<your-model-name>.Rd

$ git diff

# Shows modifications to edited files
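
If you prefer to script this check, the sketch below parses the output of git status --porcelain and reports whether DESCRIPTION, NAMESPACE, and man/<your-model-name>.Rd show up as changed or untracked. The function names and the model name gng_m1 used in the sample are placeholders chosen for this example; the sample status text is hand-written, not real output from the repository.

```python
import subprocess


def git_porcelain(repo_dir):
    """Return the output of `git status --porcelain` run inside repo_dir."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def check_roxygen_outputs(porcelain, model_name):
    """Map each expected path to True if git lists it as changed or untracked."""
    changed = set()
    for line in porcelain.splitlines():
        if len(line) < 4:
            continue
        # Each line is "XY <path>": two status characters, a space, the path.
        changed.add(line[3:])
    expected = ["DESCRIPTION", "NAMESPACE", f"man/{model_name}.Rd"]
    return {path: path in changed for path in expected}


# Hand-written sample mimicking the status shown above (placeholder model name).
sample = " M DESCRIPTION\n M NAMESPACE\n?? man/gng_m1.Rd\n"
result = check_roxygen_outputs(sample, "gng_m1")
print(result)
```

In practice you would pass git_porcelain("path/to/hBayesDM") instead of the sample text. Any path mapped to False is worth checking against the roxygenize() output messages.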