Description
I understand that one of the Psych-DS principles is to maintain a copy of the "rawest" version of the data in the source_data folder. When we upload our datasets, should we strive to upload the "rawest" versions of the datasets that we contribute?
A bit of explanation: the "rawest" version that I have of the nih-reviews project is actually a set of summary sheets contributed by our NIH reviewers. I have up to three summary sheets per reviewer in a variety of formats (.docx, .doc, .txt, and .pdf). However, reviewers are identifiable by name on the summary sheets, so I didn't share the raw summary sheets when I posted this dataset at https://osf.io/c5csm/.
This dataset could provide a good case study for how to handle identifiable data in Psych-DS format. Mainly, I'm wondering which approach would be most useful for pedagogical purposes (and for project development).