This codebook describes the source files and processing steps used to analyze the Human Activity Recognition data set from the University of California, Irvine. The data were collected from smartphone accelerometers and describe 6 different activities.
The following files are used for the analysis:
- `features.txt` contains the names of the features (collected values)
- `activity_labels.txt` contains the names of the 6 different activities
- `train/X_train.txt` contains the training data observations
- `train/subject_train.txt` contains the ids of the subjects used for the training data
- `train/y_train.txt` contains the activity id related to each observation
- `test/X_test.txt` contains the test data observations
- `test/subject_test.txt` contains the ids of the subjects used for the test data
- `test/y_test.txt` contains the activity id related to each observation
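The files above are plain whitespace-delimited text. As a minimal sketch (in Python/pandas rather than the R data.tables used by the analysis, with tiny in-memory stand-ins for the real files), this is how `features.txt` supplies the column names for `X_train.txt`:

```python
# Sketch: parsing the whitespace-delimited UCI HAR files.
# The file contents below are tiny made-up stand-ins for illustration.
import io
import pandas as pd

# Stand-in for features.txt: "<index> <feature name>" per line.
features_txt = io.StringIO("1 tBodyAcc-mean()-X\n2 tBodyAcc-std()-X\n")
features = pd.read_csv(features_txt, sep=r"\s+", header=None,
                       names=["index", "feature"])

# Stand-in for train/X_train.txt: one whitespace-separated row per observation,
# with columns named from features.txt.
x_train_txt = io.StringIO("0.28 -0.02\n0.27 -0.01\n")
x_train = pd.read_csv(x_train_txt, sep=r"\s+", header=None,
                      names=features["feature"].tolist())

print(x_train.columns.tolist())
```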
The original data set contains more files and information, but only the files listed above were used in the analysis.
Steps:
- All the data files are read into data.tables.
- Appropriate column names are assigned from the `features.txt` file.
- All features that do not contain `mean` or `std` in their names are removed from the tables.
- The activities are labeled, using the activity id and the corresponding text from `activity_labels.txt`.
- The 3 tables (`subject id`, `activity`, and `data`) are bound together.
- The `test` and `training` data partitions are merged together.
- A tidy data table is created with the mean of each feature for each subject and each activity.
- The tidy data table is written to a text file (CSV format), `tidy_data.txt`.
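The steps above can be sketched end to end. This is an illustrative Python/pandas version (the actual analysis uses R data.tables); the feature names, activity labels, and values are made up for the example:

```python
# Sketch of the processing steps on tiny synthetic data.
# pandas DataFrames stand in for R data.tables; all values are illustrative.
import pandas as pd

features = ["tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-energy()-X"]
activity_labels = {1: "WALKING", 2: "STANDING"}

# Stand-ins for the X / y / subject tables of each partition.
x_train = pd.DataFrame([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]], columns=features)
y_train = pd.Series([1, 2])
subject_train = pd.Series([1, 1])
x_test = pd.DataFrame([[0.7, 0.8, 0.9]], columns=features)
y_test = pd.Series([1])
subject_test = pd.Series([2])

def tidy(x, y, subject):
    # Keep only the features whose names contain "mean" or "std".
    x = x.loc[:, x.columns.str.contains("mean|std")].copy()
    # Label activities and bind subject, activity, and data columns together.
    x.insert(0, "activity", y.map(activity_labels))
    x.insert(0, "subject", subject)
    return x

# Merge the train and test partitions, then average each remaining
# feature per subject and activity.
merged = pd.concat([tidy(x_train, y_train, subject_train),
                    tidy(x_test, y_test, subject_test)], ignore_index=True)
tidy_data = merged.groupby(["subject", "activity"], as_index=False).mean()
tidy_data.to_csv("tidy_data.txt", index=False)
```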
The analysis script run_analysis.md generates a new text file containing the mean of each mean and standard deviation feature, for each subject and each activity, across both the training and test sets.