Assignment title: Management


Q1.
plot(d1$hrs.work, d1$wage.inc, main="Income by Hours of Work", xlab="Hours of Work", ylab="Income")

Q2.
> cor(d1$hrs.work,d1$wage.inc)
[1] 0.5165685

Discussion: The correlation coefficient between income and hours of work is 0.52, which indicates a moderate, positive correlation between the two variables.

Q3.
reg1<-lm(d1$wage.inc~d1$hrs.work)
summary(reg1)

Call:
lm(formula = d1$wage.inc ~ d1$hrs.work)

Residuals:
   Min     1Q Median     3Q    Max
-93552 -11721  -6712   7239 138244

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 7757.9305   584.2295   13.28   <2e-16 ***
d1$hrs.work   15.5174     0.3595   43.17   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 23410 on 5120 degrees of freedom
Multiple R-squared:  0.2668,  Adjusted R-squared:  0.2667
F-statistic:  1863 on 1 and 5120 DF,  p-value: < 2.2e-16

Discussion: y=7758+16x (coefficients rounded from 7757.93 and 15.52). A slope of 16 means that a 1-hour increase in an employee's working hours is associated with an average increase in annual income of about $16. A y-intercept of 7758 means that the expected income for a person with 0 hours of work is $7758. Moreover, with t=43.17 and p-value<2.2e-16, we can conclude that the slope for hours of work is significantly different from zero at p<0.001. An R^2 of 0.2668 means that when predicting a person's total income, we will make 27% fewer errors by basing the predictions on the person's hours of work and predicting from the regression line, as opposed to ignoring this variable and predicting the mean of income for every case. In other words, hours of work (X) explains 27% of the variation in income (Y) among city employees.

Q4.
par(mar=c(5,6,4,2))
plot(d1$hrs.work,d1$wage.inc,main="Scatterplot of Hours of Work and Income",xlab="Hours of Work",ylab="",xlim=c(0,8000),ylim=c(0,150000),col="blue",axes=FALSE)
axis(side=1,at=seq(0,10000,2000),labels=seq(0,10000,2000),cex.axis=0.8)
axis(side=2,at=seq(0,150000,30000),las=2,cex.axis=0.8,labels=c("0","30000","60000","90000","120000","150000"))
mtext(side=2,"Income(Dollars)",line=4)
abline(reg1,col="green")

Discussion: The regression line shows the direction of the relationship between the two variables: its upward slope indicates that income rises with hours of work. The spread of the points around the line, together with the correlation of 0.52, indicates that the relationship is of moderate strength. We can therefore conclude that there is a moderate, positive relationship between income and hours of work.

Q5.
reg2<-lm(d1$wage.inc~d1$hrs.work+d1$grade1)
summary(reg2)

Call:
lm(formula = d1$wage.inc ~ d1$hrs.work + d1$grade1)

Residuals:
   Min     1Q Median     3Q    Max
-74193 -12176  -3834   6953 139143

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.084e+04  1.554e+03  -13.41   <2e-16 ***
d1$hrs.work  1.402e+01  3.547e-01   39.52   <2e-16 ***
d1$grade1    2.215e+03  1.122e+02   19.75   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 22570 on 5119 degrees of freedom
Multiple R-squared:  0.3187,  Adjusted R-squared:  0.3185
F-statistic:  1198 on 2 and 5119 DF,  p-value: < 2.2e-16

Discussion: y=-20840+14.02(hours of work)+2215(education). A slope of 14.02 (1.402e+01) means that a 1-hour increase in an employee's working hours is associated with an average increase in annual income of $14.02, net of the effects of education on income. A slope of 2215 (2.215e+03) means that a 1-year increase in an employee's formal education is associated with an average increase in annual income of $2215, net of the effects of hours of work on income. A y-intercept of -20840 (-2.084e+04) means that the expected income for a person with 0 years of education and 0 hours of work would be -$20840, a negative value that lies outside the range of plausible incomes and should not be interpreted literally.
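To make the Q5 equation concrete, the following minimal sketch plugs illustrative values into the coefficients reported in the reg2 output. The helper name predict_income and the inputs (2000 hours of work, 12 years of education) are assumptions chosen only for illustration; they are not taken from the data.

```r
# Coefficients copied from the summary(reg2) output
b0      <- -2.084e+04  # intercept
b_hours <-  1.402e+01  # slope for hours of work
b_educ  <-  2.215e+03  # slope for years of education

# Hypothetical helper: predicted annual income from the Q5 equation
predict_income <- function(hours, educ) {
  b0 + b_hours * hours + b_educ * educ
}

# Illustrative (assumed) employee: 2000 hours of work, 12 years of education
# -20840 + 14.02*2000 + 2215*12, which comes to about 33780
predict_income(2000, 12)

# One more year of education at the same hours raises the prediction
# by the education slope (about 2215)
predict_income(2000, 13) - predict_income(2000, 12)
```

Holding hours fixed while changing education by one year is what "net of the effects of hours of work" means in practice: the difference between the two predictions is the education slope.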
We can conclude that the slope for hours of work, controlling for education, is significantly different from zero at p<0.001. Likewise, the slope for education, controlling for hours of work, is significantly different from zero at p<0.001. An R^2 of 0.3187 means that together, hours of work and education explain 32% of the variation in income.

Q6.
reg3<-lm(d1$wage.inc~d1$gender)
summary(reg3)

Call:
lm(formula = d1$wage.inc ~ d1$gender)

Residuals:
   Min     1Q Median     3Q    Max
-34097 -21439  -4097  12111 122563

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  44755.3     1192.9   37.52   <2e-16 ***
d1$gender   -10658.1      749.6  -14.22   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 26820 on 5120 degrees of freedom
Multiple R-squared:  0.03798,  Adjusted R-squared:  0.03779
F-statistic: 202.1 on 1 and 5120 DF,  p-value: < 2.2e-16

Discussion: y=44755+(-10658x). A slope of -10658 means that female employees have an average annual income that is $10658 lower than that of male employees. The y-intercept of 44755 is the expected income when the gender variable equals 0. Because the negative slope means that the higher-coded group (women) earns less, the intercept does not describe female employees; under a 0/1 coding with female=1, it is the expected income for male employees. Moreover, with t=-14.22 and p-value<2.2e-16, we can conclude that the slope for gender is significantly different from zero at p<0.001. An R^2 of 0.03798 means that when predicting a person's total income, we will make 4% fewer errors by basing the predictions on the person's gender and predicting from the regression line, as opposed to ignoring this variable and predicting the mean of income for every case. In other words, gender (X) explains 4% of the variation in income (Y) among city employees.

Q7.
reg4<-lm(d1$wage.inc~d1$hrs.work+d1$gender+d1$grade1+d1$age)
summary(reg4)

Call:
lm(formula = d1$wage.inc ~ d1$hrs.work + d1$gender + d1$grade1 + d1$age)

Residuals:
   Min     1Q Median     3Q    Max
-71631 -12172  -3242   7332 135793

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -5.691e+04  6.371e+03  -8.933  < 2e-16 ***
d1$hrs.work  1.303e+01  3.577e-01  36.424  < 2e-16 ***
d1$gender   -7.139e+03  6.377e+02 -11.195  < 2e-16 ***
d1$grade1    2.383e+03  1.112e+02  21.431  < 2e-16 ***
d1$age       1.593e+03  2.132e+02   7.476 8.99e-14 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 22190 on 5117 degrees of freedom
Multiple R-squared:  0.342,  Adjusted R-squared:  0.3415
F-statistic:   665 on 4 and 5117 DF,  p-value: < 2.2e-16

Discussion: y=-56910+13.03(hours of work)-7139(gender)+2383(education)+1593(age). A slope of 13.03 (1.303e+01) means that a 1-hour increase in an employee's working hours is associated with an average increase in annual income of $13.03, net of the effects of gender, education and age. A slope of -7139 (-7.139e+03) means that female employees have an average annual income that is $7139 lower than that of male employees, net of the effects of education, hours of work and age. A slope of 2383 (2.383e+03) means that a 1-year increase in an employee's formal education is associated with an average increase in annual income of $2383, net of the effects of gender, age and hours of work. A slope of 1593 (1.593e+03) means that a 1-year increase in an employee's age is associated with an average increase in annual income of $1593, net of the effects of gender, hours of work and education. A y-intercept of -56910 (-5.691e+04) means that the expected income for an employee with gender coded 0, 0 years of education, age 0 and 0 hours of work would be -$56910, an extrapolation that should not be interpreted literally. We can conclude that all partial slopes are significantly different from zero at p<0.001. An R^2 of 0.342 means that together, education, hours of work, gender and age explain 34% of the variation in income.
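As a possible follow-up to Q7, the sketch below shows how the same kind of model could be fitted with the data= argument of lm(), which gives shorter coefficient names and lets predict() work on new cases. The data frame toy is simulated and purely hypothetical (it is not the assignment's d1), the 0/1 gender coding with 1 = female is an assumption, and new_employee is an invented example case.

```r
set.seed(1)  # make the simulated toy data reproducible

# Hypothetical stand-in for d1: 200 simulated employees
n <- 200
toy <- data.frame(
  hrs.work = runif(n, 0, 4000),               # annual hours of work
  gender   = sample(0:1, n, replace = TRUE),  # assumed coding: 0 = male, 1 = female
  grade1   = sample(8:20, n, replace = TRUE), # years of education
  age      = sample(25:60, n, replace = TRUE)
)
# Simulated income; these generating values are invented for the sketch
toy$wage.inc <- 1000 + 13 * toy$hrs.work - 7000 * toy$gender +
  2400 * toy$grade1 + 500 * toy$age + rnorm(n, sd = 20000)

# Same model structure as reg4, written with data= instead of d1$ prefixes
fit <- lm(wage.inc ~ hrs.work + gender + grade1 + age, data = toy)
summary(fit)

# Predicted income for one invented employee
new_employee <- data.frame(hrs.work = 2000, gender = 1, grade1 = 12, age = 30)
pred <- predict(fit, newdata = new_employee)
pred
```

Writing the formula with bare column names matters here: predict() matches the columns of newdata to the names in the formula, which does not work when the model was fitted with d1$ prefixes.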