diff --git a/README.md b/README.md
index 8c1422f..597952c 100644
--- a/README.md
+++ b/README.md
@@ -5,7 +5,6 @@ Repository Link: https://github.com/tensorflow/models/tree/master/research/texts
 Dataset can be obtained here:
 CNN and DailyMail stories http://cs.nyu.edu/~kcho/DMQA/
 
-
 Working:
 The valid data format requires article and abstract key for the TextSum algorithm to train and decode.
 
@@ -16,15 +15,12 @@ abstract is extracted using all @highlights in data.
 
 Vocabulary with 200000 words include UNK and PAD tokens are generated.
 
-
-
 Usage:
-CNN and DailyMail data should be present in %pwd%/cnn/stories and %pwd%/dailymail/stories
+CNN and DailyMail data should be present in `%pwd%/cnn/stories` and `%pwd%/dailymail/stories`
 
-run mkdir data in the present working directory
+`run mkdir data in the present working directory`
 
 You can opt for generating both Datasets or one of them using the following arguments-
 
-run python convertdata.py --both or --CNN or --DM
-
+`run python convertdata.py --both or --CNN or --DM`
 