The data for this project were collected from the accelerometers of the Samsung Galaxy S smartphone. A full description is available at the site where the data were originally obtained:
http://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones
The specific data for this project can be found here: https://d396qusza40orc.cloudfront.net/getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip
The R script in this project produces two tidy, descriptively labelled data sets from the combined training and test data sets:
- one that contains any mean and standard deviation values for each measurement
- one that contains the average of each variable for each activity and each subject
The second data set is written to a file named means_by_activity_and_subject.txt in the current working directory.
Note that the first data set includes the values produced by mean(), meanFreq(), and std(). meanFreq() is included because the assignment specification is vague on this point and the function's name starts with "mean".
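As a minimal sketch of that selection rule, the snippet below filters a small sample of feature names with a regular expression. The vector `feature_names` and the exact pattern are illustrative assumptions, not necessarily what run_analysis.R does.

```r
# Illustrative sample of feature names in the style of the source data
# (a hand-picked subset for this sketch, not the full feature list).
feature_names <- c(
  "tBodyAcc-mean()-X",
  "tBodyAcc-std()-X",
  "fBodyAcc-meanFreq()-X",
  "tBodyAcc-mad()-X",
  "angle(tBodyAccMean,gravity)"
)

# Keep names containing the literal text "mean()", "std()", or "meanFreq()".
# The parentheses are escaped so they match literally.
pattern <- "(mean|std|meanFreq)\\(\\)"
selected <- feature_names[grepl(pattern, feature_names)]

print(selected)
```

Under this pattern, names such as "angle(tBodyAccMean,gravity)" are excluded because they do not contain a literal "mean()".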
The script performs the following general steps:
- Download and unzip the source data.
- Load all the required data files into memory.
- Produce a data frame "mean_and_std_measurements" that includes only those measurements from the original data set whose names contain "mean()", "std()", or "meanFreq()".
- Produce a data frame "means_by_activity_and_subject" that contains the mean of each measurement from the original data set for each combination of activity and subject.
- Write means_by_activity_and_subject into the file means_by_activity_and_subject.txt.
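The averaging and writing steps can be sketched on toy data with reshape2's melt() and dcast(). This is an assumption about how the script might use the package, not a copy of run_analysis.R; the data frame `toy`, its values, and the temporary output path are hypothetical.

```r
library(reshape2)

# Hypothetical toy data: two subjects, one activity, two measurements.
toy <- data.frame(
  subject  = c(1, 1, 2, 2),
  activity = c("WALKING", "WALKING", "WALKING", "WALKING"),
  `tBodyAcc-mean()-X` = c(0.2, 0.4, 0.1, 0.3),
  `tBodyAcc-std()-X`  = c(-0.9, -0.7, -0.8, -0.6),
  check.names = FALSE  # keep the original measurement names as column names
)

# Reshape to long form: one row per (subject, activity, measurement).
molten <- melt(toy, id.vars = c("subject", "activity"))

# Cast back to wide form, averaging each measurement per subject and activity.
means_by_activity_and_subject <- dcast(
  molten,
  subject + activity ~ variable,
  fun.aggregate = mean,
  value.var = "value"
)

# Written to a temporary directory here so the sketch has no side effects
# on the working directory.
out_file <- file.path(tempdir(), "means_by_activity_and_subject.txt")
write.table(means_by_activity_and_subject, out_file, row.names = FALSE)
```

The result has one row per subject and activity combination, with one column per measurement.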
The data dictionary can be found in CodeBook.md.
To run the script from R:
source('run_analysis.R')
The script depends on:
- reshape2 package