Assignment title: Management


SOC 402 - R Homework Assignment #2
Regression and Correlation

Due Date: Week 8, Monday, February 27, 2017 by 5:00pm

Please submit your assignment to 6-23 Tory or 5-21 Tory.
Electronic lab submissions will not be accepted (hard copy only).
Late submissions and submissions that do not follow directions will be penalized.

Directions: Use data from the NLSY97 Dataset (posted on the course website) to answer the following questions in a separate document. All assignments must be typed. The write-up portion of the assignment should be double-spaced, using 11 or 12 point font for the text. Please write in essay format for each question, but also indicate which questions you are answering. Answer the questions separately and number them, but write each answer in paragraph format. Tables should be neat and organized. Graphs should be correctly labeled. When discussing variables, make sure to include the correct units. Finally, don't forget to interpret your results and graphs. Spelling, grammar, organization, and mechanics will be graded, so make sure to proofread.

Include your plots and your R Output below your answer for that question. You can do this by copying and pasting the R Output from the Console into your Word document. Your R Output does not have to be double-spaced. For formatting purposes you should use "Courier" or "Courier New" as your font and 11pt or less as your font size for your R Output.

Note: Some questions will require answers in paragraph form and some will require tables or graphs. Others will require only R Output. This should be obvious within the text of each question.

You're a researcher who is interested in work, gender, and health. You're planning on studying the relationships that exist between these variables using the NLSY97 dataset and you're particularly interested in the factors that influence income.
You begin your analysis by investigating the bivariate relationship between income (wage.inc) and hours of work (hrs.work).

1. Create and interpret a scatterplot between hours of work and income. What type of relationship might you be dealing with? Does the scatterplot indicate any potential problems in the data?

2. Calculate and interpret the correlation coefficient between hours of work and income.

3. Conduct a bivariate regression using hours of work to predict income. Interpret your results.

4. Create a second scatterplot between hours of work and income. Add a regression line this time.

5. What happens to this relationship when you control for education, measured in years of education (grade1)?

6. These first models helped you to learn about the relationship between hours of work and income. However, you're really interested in income differences by gender. Using regression, assess the bivariate relationship between gender and income in the data.

7. In addition to gender, you know that other factors matter for predicting income. You decide to create a model that includes: gender, hours of work, education, and age as predictor variables. Use multiple regression to assess this relationship. Make sure to interpret your results.

8. Extra Credit: This question is optional. If you answer the question correctly you can earn extra points to make up for lost points on other questions, but you cannot earn more than 100% total on the assignment. The previous analyses focus on estimating income with a variety of predictor variables. You're also interested in how marital status and parenthood relate to income. Net of gender, hours of work, education, and age, do married or single individuals earn more? Net of gender, hours of work, education, and age, do individuals with or without children earn more? Use what you've learned about regression to answer this question.
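The R calls that Questions 1 through 7 rely on can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a model answer: the data frame nlsy is simulated so the code runs on its own, and only the names wage.inc, hrs.work, and grade1 come from the assignment. The column names female and age are hypothetical placeholders, so check the NLSY97 codebook for the actual names, coding, and units.

```r
# Simulated stand-in for the NLSY97 data. All values are made up for
# illustration and say nothing about the real dataset.
set.seed(402)
n <- 200
nlsy <- data.frame(
  hrs.work = round(runif(n, 10, 60)),          # hours of work (unit assumed)
  grade1   = sample(8:20, n, replace = TRUE),  # years of education
  age      = sample(28:34, n, replace = TRUE), # hypothetical column name
  female   = rbinom(n, 1, 0.5)                 # hypothetical 0/1 gender indicator
)
nlsy$wage.inc <- 5000 + 600 * nlsy$hrs.work + 1500 * nlsy$grade1 -
  4000 * nlsy$female + rnorm(n, sd = 8000)

# Questions 1 and 2: scatterplot and Pearson correlation.
# Replace the axis labels with the correct units from the codebook.
plot(nlsy$hrs.work, nlsy$wage.inc,
     xlab = "Hours of work (hrs.work)",
     ylab = "Income (wage.inc)",
     main = "Income by hours of work")
r <- cor(nlsy$hrs.work, nlsy$wage.inc, use = "complete.obs")
print(r)

# Questions 3 and 4: bivariate regression, then the fitted line
# drawn on top of the scatterplot that is still open.
m1 <- lm(wage.inc ~ hrs.work, data = nlsy)
abline(m1)
print(summary(m1))

# Question 5: the same model with education added as a control.
m2 <- lm(wage.inc ~ hrs.work + grade1, data = nlsy)
print(summary(m2))

# Question 6: bivariate regression of income on the gender indicator.
m3 <- lm(wage.inc ~ female, data = nlsy)
print(summary(m3))

# Question 7: multiple regression with all four predictors.
m4 <- lm(wage.inc ~ female + hrs.work + grade1 + age, data = nlsy)
print(summary(m4))
```

With the real data, missing values are likely. By default lm() drops rows with missing values in the model variables, while cor() returns NA unless use = "complete.obs" (or a similar option) is supplied, so the sample sizes behind the correlation and each regression may differ.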