DATA WRANGLING

Knit your assignment as an .html file (not a .docx file). Submit both the .html and .Rmd files to Blackboard. Before you get started, you will have to download “hwk4_data.zip” from Blackboard and unzip all data files to your working directory.

Q1: You will use “study_hours.csv” and “student_info.csv” for this question.
(1) Read “study_hours.csv” into a data frame named study_hours.
(2) Use function gather to reshape columns “Mon”, “Tue”, “Wed”, “Thu”, “Fri”, “Sat”, and “Sun” into two columns: “Day”, containing the day of the week, and “Hours”, containing the number of hours studied on that day. Name the resulting data frame study_hours_reshaped. Hint: when calling gather, specify key as “Day” and value as “Hours”.
(3) Print the dimensions of study_hours_reshaped using function dim. Hint: it should have 42 rows and 5 columns.
(4) Read “student_info.csv” into a data frame named student_info.
(5) Combine study_hours_reshaped and student_info by column “Name” into a new data frame student_full, using function full_join.
(6) Print a brief overview of student_full using function glimpse.

Q2: You will use “worldbankhealth.csv” for this question. Note that it might run slowly due to the file size.
(1) Read “worldbankhealth.csv” into a data frame named wbh_data_raw.
(2) Create data frame wbh_data containing only columns “country_code”, “indicator_code”, “year”, and “value” of wbh_data_raw.
(3) Use function spread to reshape column “indicator_code” of wbh_data into multiple columns, one per indicator. Name the resulting data frame wbh_data_reshaped. Hint: when calling spread, specify key as “indicator_code” and value as “value”.
(4) Print the dimensions of wbh_data_reshaped using function dim.
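
The functions named in Q1 and Q2 (gather, full_join, glimpse, spread, dim) can be tried out on a small toy example before running them on the homework files. The sketch below is only an illustration: the data frames hours_wide and info, their contents, and the “Major” column are made up and are not part of the assignment data. It assumes the tidyr and dplyr packages are installed.

```r
library(tidyr)
library(dplyr)

# Toy data (hypothetical; NOT the homework files)
hours_wide <- data.frame(
  Name = c("Ann", "Bob"),
  Mon  = c(2, 1),
  Tue  = c(3, 0),
  stringsAsFactors = FALSE
)

# gather: wide -> long. "key" names the new column that holds the old
# column names; "value" names the new column that holds the cell values.
hours_long <- gather(hours_wide, key = "Day", value = "Hours", Mon, Tue)
dim(hours_long)  # 2 students x 2 days = 4 rows; 3 columns

# A second toy table sharing the "Name" column (also hypothetical)
info <- data.frame(
  Name  = c("Ann", "Cal"),
  Major = c("Math", "Bio"),
  stringsAsFactors = FALSE
)

# full_join keeps every row from both tables; cells without a match are NA
hours_full <- full_join(hours_long, info, by = "Name")
glimpse(hours_full)

# spread: long -> wide, the inverse of the gather call above
hours_wide_again <- spread(hours_long, key = "Day", value = "Hours")
dim(hours_wide_again)  # back to 2 rows and 3 columns
```

Note that full_join keeps “Bob” (no row in info) and “Cal” (no row in hours_long), filling the missing cells with NA. This is why the joined table can have more rows than either input.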