summarisation-scripts

Scripts to process the JST output data into a more meaningful and readable format.

Prerequisites

Required installations

Python 3.6
Data must first be collected, processed and analysed by the following scripts, listed in order:

Follow the instructions of the above-mentioned packages to run them.

Other requirements

A sample config file is available in the repository.

Create a data directory for all input files and a results directory for the output.
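The two directories can also be created from Python. The following is a minimal sketch, assuming the directories are called data and results and sit next to the scripts. These names, and the helper create_project_dirs, are illustrative only. The actual locations are whatever the config file specifies.

```python
from pathlib import Path


def create_project_dirs(base_dir, data_name="data", results_name="results"):
    """Create the input and output directories if they do not exist yet.

    The default directory names are assumptions; use the ones from your config file.
    """
    base = Path(base_dir)
    data_dir = base / data_name        # holds all input files
    results_dir = base / results_name  # receives the output files
    for directory in (data_dir, results_dir):
        # exist_ok=True makes the call safe to repeat
        directory.mkdir(parents=True, exist_ok=True)
    return data_dir, results_dir
```

Calling create_project_dirs(".") from the package directory creates both folders and leaves existing ones untouched.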

Input files

The summarisation uses five inputs: four text files (documentThetha, documentPi, topicWords and topicSentences) and a raw-tweets folder.

The four text files are copies of the JST final results (final.thetha, final.pi, final.twords and final.topSentences). Paste the contents of each result file into a new file and save it in txt format under the corresponding name above. The names are only specified in the config file, so other names can be used, but keeping these names saves time because the config file does not need to be edited.

Note: do not change the extension of the JST files themselves, as this ruins their formatting. Copy the contents and paste them into a new file instead.
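The copy step can be scripted instead of done by hand. The sketch below assumes that a byte-for-byte copy is an acceptable substitute for the manual copy and paste, and that the target files are named documentThetha.txt, documentPi.txt, topicWords.txt and topicSentences.txt. The helper copy_jst_results and the exact target file names are illustrative, so adjust them to match your config file. The original JST files are only read, never renamed.

```python
import shutil
from pathlib import Path

# JST result file -> input file name (target names are assumptions)
JST_TO_INPUT = {
    "final.thetha": "documentThetha.txt",
    "final.pi": "documentPi.txt",
    "final.twords": "topicWords.txt",
    "final.topSentences": "topicSentences.txt",
}


def copy_jst_results(jst_dir, data_dir):
    """Copy the contents of each JST result file into a new .txt file.

    The original JST files are left untouched.
    """
    jst_dir, data_dir = Path(jst_dir), Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for source_name, target_name in JST_TO_INPUT.items():
        source = jst_dir / source_name
        if not source.is_file():
            raise FileNotFoundError(f"Missing JST result file: {source}")
        target = data_dir / target_name
        # copyfile copies the contents only, into a new file
        shutil.copyfile(str(source), str(target))
        copied.append(target)
    return copied
```

For example, copy_jst_results("path/to/jst/output", "data") fills the data directory with the four text files, where the first argument is a placeholder for wherever JST wrote its final results.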

The raw-tweets folder is a subdirectory of the data folder. It contains the raw texts generated by the raw-tweets.py script in pyMysql.
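Before running the summarisation it can help to confirm that all five inputs are in place. This is a small sketch, not part of the package: the helper missing_inputs is hypothetical, and the .txt file names are assumed to match the names above. Change them if your config file uses different ones.

```python
from pathlib import Path

# Assumed input names; the real ones are set in the config file
EXPECTED_FILES = (
    "documentThetha.txt",
    "documentPi.txt",
    "topicWords.txt",
    "topicSentences.txt",
)
RAW_TWEETS_DIR = "raw-tweets"


def missing_inputs(data_dir):
    """Return the names of expected inputs that are absent from data_dir."""
    data_dir = Path(data_dir)
    missing = [name for name in EXPECTED_FILES if not (data_dir / name).is_file()]
    if not (data_dir / RAW_TWEETS_DIR).is_dir():
        missing.append(RAW_TWEETS_DIR + "/")
    return missing
```

An empty list means every expected input was found. Otherwise the list names what still needs to be added to the data directory.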

Running the code

Open a console and navigate to the directory of the summarisation-scripts package. Once there, type:

summary_of_topics.py

Note: After each run of the script, copy the result files to a different directory or rename them. The code does not currently generate distinct names, so each run overwrites the previous result files.
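One way to keep earlier results is to copy the results directory into a timestamped archive folder after each run. The following is a minimal sketch under that assumption. The helper archive_results, the label argument and the archive location are all illustrative and are not part of the package. The archive folder should be outside the results directory.

```python
import shutil
from datetime import datetime
from pathlib import Path


def archive_results(results_dir, archive_root, label="run"):
    """Copy the current result files into a new timestamped directory."""
    results_dir, archive_root = Path(results_dir), Path(archive_root)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = archive_root / f"{label}-{stamp}"
    # copytree creates the target directory and raises an error if it
    # already exists, so an earlier archive is never overwritten
    shutil.copytree(str(results_dir), str(target))
    return target
```

For example, archive_results("results", "archive", label="topics") copies everything in results to a folder such as archive/topics-20240101-120000, leaving the results directory itself unchanged for the next run.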

Output files

JSON file to be used for visualisation of summaries in HTML
text file of all topic summaries
spreadsheet of all topic summaries, ordered by topic importance
spreadsheet of all topics, ordered by importance
CSV file of all topics

FoodSentimentObservatory/summarisation-scripts

Folders and files

Latest commit

History

Repository files navigation

summarisation-scripts

Prerequisites

Required installations

Other requirements

Input files

Running the code

Output files

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages