GitHub - chiauho/Course-Project: Course Project on Getting and Cleaning Data

run_analysis.R
Program submitted for course project
Submitted by: Chiau Ho ONG

The run_analysis.R program does the following:

Steps 1 to 8 produce the data set required for step 4 of the assignment:

1. Reads in the test and train data sets ("X_test.txt" and "X_train.txt").
2. Combines the two data sets into one (total_data).
3. Reads in the description of each column (variable) from the file "features.txt", which describes each column of the test and train data sets.
4. Gives each column of total_data a meaningful name, using the variable names read from "features.txt". The number of columns matches the number of variables.
5. Extracts only the columns that record mean and std measurements and ignores all other columns. The data frame total_data now contains only columns with mean and std measurements.
6. Reads in the subject ID and activity files ("subject_test.txt", "y_test.txt", "subject_train.txt" and "y_train.txt"). Merges Subject_test and Subject_train into one, and y_test and y_train into one.
7. Prepends subject and y (activities) to total_data as columns 1 and 2. The new data set is c_total_data. Gives meaningful column names to the subject and activity columns.
8. Finally, replaces the activity codes in y with meaningful activity names. c_total_data is the data set required in step 4 of the assignment.
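
The eight steps above can be sketched in base R roughly as follows. This is a minimal illustration, not the contents of run_analysis.R: the small in-memory data frames stand in for the files that the real script reads, the feature names are illustrative, the `mean()`/`std()` name pattern is an assumed selection rule, and the six activity labels are assumed to follow the data set's usual code-to-name mapping.

```r
# Toy stand-ins for the input files (hypothetical values).
# The real script would read these with e.g. read.table("X_test.txt").
x_test   <- data.frame(V1 = c(0.1, 0.2), V2 = c(0.5, 0.6), V3 = c(9, 8))
x_train  <- data.frame(V1 = c(0.3, 0.4), V2 = c(0.7, 0.8), V3 = c(7, 6))
features <- data.frame(index = 1:3,
                       name = c("tBodyAcc-mean()-X",
                                "tBodyAcc-std()-X",
                                "tBodyAcc-max()-X"),
                       stringsAsFactors = FALSE)
subject_test  <- data.frame(V1 = c(1, 1))
subject_train <- data.frame(V1 = c(2, 2))
y_test  <- data.frame(V1 = c(5, 1))
y_train <- data.frame(V1 = c(5, 1))

# Steps 1-2: stack the test and train rows into one data frame.
total_data <- rbind(x_test, x_train)

# Steps 3-4: name each column after the matching entry in features.txt.
names(total_data) <- features$name

# Step 5: keep only the mean and std columns
# (assumption: they are identified by "mean()" or "std()" in the name).
keep <- grepl("mean\\(\\)|std\\(\\)", names(total_data))
total_data <- total_data[, keep, drop = FALSE]

# Step 6: combine subjects and activities in the same order (test, then train).
subject <- rbind(subject_test, subject_train)
y       <- rbind(y_test, y_train)

# Step 7: put subject and activity in columns 1 and 2, with meaningful names.
c_total_data <- cbind(Subject_ID = subject$V1, Activities = y$V1, total_data)

# Step 8: replace activity codes with names (assumed mapping for codes 1-6).
activity_labels <- c("WALKING", "WALKING_UPSTAIRS", "WALKING_DOWNSTAIRS",
                     "SITTING", "STANDING", "LAYING")
c_total_data$Activities <- factor(c_total_data$Activities,
                                  levels = 1:6, labels = activity_labels)
```

Stacking test before train in both `rbind` calls matters: the subject and activity rows only line up with the measurement rows if all three are combined in the same order.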

Now for step 5, which creates a tidy data set with the average of each variable for each activity and each subject:

Uses the plyr package. Its ddply function is handy: it takes in a data frame and returns a data frame.
Applies ddply to the c_total_data data frame, telling it to split by Subject_ID and Activities, then takes the mean of each column within each combination of Subject_ID and activity. For example, for Subject_ID 1 and activity STANDING, it returns the mean of each column.
Changes the variable names to reflect that they are now mean values.
Finally, writes the resulting data frame to the text file "tidy_data.txt".
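
A minimal sketch of the step 5 summary with `ddply` might look like the following. The toy `c_total_data`, its column names `mean_x` and `std_x`, and the `avg_` prefix are hypothetical and chosen only for illustration; the actual script may call `ddply` differently and rename the columns in another way.

```r
library(plyr)

# Toy version of c_total_data (hypothetical values and column names).
c_total_data <- data.frame(
  Subject_ID = c(1, 1, 1, 2),
  Activities = c("STANDING", "STANDING", "WALKING", "STANDING"),
  mean_x     = c(0.2, 0.4, 1.0, 0.6),
  std_x      = c(0.1, 0.3, 0.5, 0.7),
  stringsAsFactors = FALSE
)

# Every column except the two grouping columns is a measurement.
measure_cols <- setdiff(names(c_total_data), c("Subject_ID", "Activities"))

# Split by subject and activity, then average each measurement column
# within each piece. ddply returns one row per subject/activity pair.
tidy_data <- ddply(c_total_data, .(Subject_ID, Activities),
                   function(piece) colMeans(piece[, measure_cols, drop = FALSE]))

# Rename the measurement columns to show that they now hold means
# (the "avg_" prefix is an assumption, not the script's naming).
names(tidy_data)[-(1:2)] <- paste0("avg_", measure_cols)

# Write the result as text. A temp directory is used here to keep the sketch
# side-effect free; the script writes "tidy_data.txt". Dropping row names is
# a choice made for this sketch.
write.table(tidy_data, file = file.path(tempdir(), "tidy_data.txt"),
            row.names = FALSE)
```

With this toy input, subject 1 has two STANDING rows, so its `avg_mean_x` is the average of 0.2 and 0.4.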

