The initial attempt will use the source{d} ReviewComments dataset:
https://github.com/src-d/datasets/tree/master/ReviewComments
Follow the instructions in the dataset's README.
Make sure the dataset is "installed" (downloaded locally) and "readable" (it can be opened and parsed). The Python script suggested in the README is a good way to check this.
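If the README script is not at hand, a quick readability check could look like the sketch below. It is not the script from the README; it assumes the dataset is a CSV file, optionally xz-compressed, and the helper name `peek_rows` is hypothetical.

```python
import csv
import lzma
from itertools import islice
from pathlib import Path


def peek_rows(path, limit=5):
    """Return up to `limit` rows from a CSV file.

    Assumption: the dataset is a CSV file, optionally xz-compressed.
    A `.xz` suffix selects lzma.open; anything else uses plain open.
    """
    path = Path(path)
    opener = lzma.open if path.suffix == ".xz" else open
    # newline="" lets the csv module handle line endings inside quoted fields.
    with opener(path, mode="rt", encoding="utf-8", newline="") as handle:
        return list(islice(csv.reader(handle), limit))
```

Calling `peek_rows` with the path to the downloaded file should return the first few rows as lists of strings. An exception or garbled output suggests the download is incomplete or the file format differs from what is assumed here.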