HotData

Written for CS 491 by Derek Stratton and Nick Harris

Useful Things:

Label each request in trace.csv as "hot" or "cold", depending on whether its associated FileObject is hot or cold

Run through trace.csv and count the number of requests associated with each FileObject (using a dictionary)
- done
In the set of all request counts, find the 80th percentile (the smallest 80% on one side, the largest 20% on the other)
- done
Mark each FileObject as "hot" if it's in the top 20% and "cold" if it's in the bottom 80%
- done
Go back through trace.csv and append "hot" or "cold" to each request, now that each FileObject has been classified
- done
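
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the code in datasetup.py: the `FileObject` column name, the `label` field, the `label_requests` helper, and the sample rows are all assumptions, and the nearest-rank percentile with a strict "greater than" cutoff is just one reasonable way to handle ties.

```python
import csv
import io
from collections import Counter


def label_requests(rows, key="FileObject", percent=80):
    """Return copies of the rows with a 'label' field set to 'hot' or 'cold'.

    Hypothetical helper: assumes each row is a dict with a `key` column.
    """
    rows = list(rows)
    if not rows:
        return []

    # Step 1: count the requests associated with each FileObject.
    counts = Counter(row[key] for row in rows)

    # Step 2: nearest-rank percentile of the per-file request counts.
    # Integer arithmetic computes ceil(percent * n / 100) without float error.
    ordered = sorted(counts.values())
    rank = max(1, (percent * len(ordered) + 99) // 100)
    threshold = ordered[rank - 1]

    # Step 3: a FileObject is hot only if its count is strictly above
    # the threshold; everything else is cold.
    hot = {name for name, count in counts.items() if count > threshold}

    # Step 4: go back over the requests and attach the label.
    return [dict(row, label="hot" if row[key] in hot else "cold") for row in rows]


if __name__ == "__main__":
    # Made-up stand-in for trace.csv; the real file's columns may differ.
    sample = "FileObject\nA\nA\nA\nA\nA\nB\nC\nD\nE\n"
    reader = csv.DictReader(io.StringIO(sample))
    for row in label_requests(reader):
        print(row["FileObject"], row["label"])
```

With a strict cutoff, files whose count equals the threshold fall on the cold side, so fewer than 20% of the files may end up hot when many files share the same count.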

Set up an estimator for TensorFlow with Python, using premade_estimator.py as a template for a supervised learning model

Import and parse the data
Create feature columns to describe the data
Train the model (let's use maybe 70-80% of the data from tracesample.csv)
Evaluate the model (maybe use a different 10-20% of the data from tracesample.csv)
Make sure the predictor works reasonably
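
One possible way to carve the training and evaluation sets out of tracesample.csv is sketched below. It uses only the standard library and does not touch TensorFlow; the `split_rows` helper, the 80/20 ratio, and the fixed seed are assumptions chosen for illustration, not something taken from premade_estimator.py.

```python
import random


def split_rows(rows, train_fraction=0.8, seed=0):
    """Shuffle the rows and split them into (train, evaluation) lists.

    Hypothetical helper: the two lists never share a row, so the model
    is evaluated on data it did not see during training.
    """
    rows = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


if __name__ == "__main__":
    # Integers stand in for the labeled request rows.
    train, evaluation = split_rows(range(10))
    print(len(train), len(evaluation))
```

Shuffling before the split matters for a trace, since requests are otherwise ordered by time and the evaluation set would only cover the end of the capture. If time ordering is part of what the model should learn, a plain chronological cut may be the better choice.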
