MIS772 Predictive Analytics Individual Assignment A1 / Workshops M1T1-M1T4

Assignment A1: Predictive Statistical Modelling in R

Student Name (as per record)
Student No

                        Exceptional   Meets expectations   Issues noted   Improve   Unacceptable
Prepare Exec Report
Prepare Data
Discover Relationships
Create Models
Evaluate & Improve
Provide Solution
Research & Extend

Brief Comments
Total

Executive summary (half page limit)

This report is unique and is the result of individual effort by the author listed above. Any part of this report that bears resemblance to another student’s report will be treated as plagiarism.

Ensure that all content is readable and that the font is no smaller than Arial 10 point. Include here only those results that are most significant for your analysis and recommendations. Avoid indiscriminate “dumping” of tables, charts or code into this report: all content must have some purpose. Each chart, table or code snippet has to be described or used in the discussion.

Make sure that all charts, tables and important results in the following pages are labelled for cross-referencing, e.g. “Figure 1 - Histogram of National Average Income” or “Table 4 – Comparison of model performance”. Then refer to them as “… (see Figure 1)” or “As shown in Table 4…”.

Business Problem
Aim 1: Succinctly state a business problem (or question) and specify the requirements for its solution in terms of insights to be generated.

Solution to Business Problem
Aim 2: Succinctly describe the results (answer or solution) and justify them. Provide references to the supporting evidence, e.g. charts and plots.

Extension
Clearly identify what kind of decisions are to be supported by the analytic solution and what types of actions can be recommended by the system. Do not attempt this extension unless the main objective has been achieved. If not attempting this section then delete it.
Before entering your report text, delete all such instructions and clarifications, as they take up space unnecessarily.

Data exploration and preparation in R (one page limit)

Include here the text of your analysis with tables and plots and, if needed, small parts of R code or a reference to the external R code. If analysis or results can only be determined by inspecting the code or running it, marks will be reduced. All comments such as this, which are not part of your submission, can be deleted to save space.

Expectation
Understand what data is needed to solve the problem; select and extract 1-2 candidate targets and 5-9 candidate predictors; explore and understand the characteristics of these variables, e.g. using scatter plots or line charts, histograms or density curves, etc. Report all important insights.

Extension
Identify more variables, e.g. up to 3-5 candidate targets and 10-15 candidate predictors. Be selective in data visualisation. Do not attempt this extension unless the main objective has been achieved. If not attempting this section then delete it.

Discovering Relationships and Data Transformation in R (one page limit)

Include here the text of your analysis with tables and charts, and R code or a reference to the external R code. If analysis or results can only be determined by inspecting the code or running it, marks will be reduced. All comments such as this, which are not part of your submission, can be deleted to save space.

Expectation
Explore, visualise and understand correlation between candidate variables; recommend and justify the selection of the most appropriate target variable and a subset of predictors to build an analytic solution in terms of relationships between them.

Extension
Identify 2-3 targets and 5-10 predictors. Transform these variables if needed.
Note that it is likely that some variables will be eliminated in the process of correlation analysis. Do not attempt this extension unless the main objective has been achieved. If not attempting this section then delete it.

Create Multiple Regression Model(s) in R (one page limit)

Include here the text of your analysis with tables and charts, and R code or a reference to the external R code. If analysis or results can only be determined by inspecting the code or running it, marks will be reduced. All comments such as this, which are not part of your submission, can be deleted to save space.

Expectation
Build a multiple regression model. Optimise it with respect to R-squared, F-ratios and coefficient p-values. Model optimisation will determine which variables are retained. Briefly report the intermediate steps taken and the model characteristics.

Extension
Create 2-3 models, one for each target variable. The resulting number of variables will depend on your model optimisation process. Predict the likely values of your target variables for all countries in year 2020. Do not attempt this extension unless the main objective has been achieved. If not attempting this section then delete it.

Evaluate and Improve the Model(s) in R (one page limit)

Include here the text of your analysis with tables and charts, and R code or a reference to the external R code. If analysis or results can only be determined by inspecting the code or running it, marks will be reduced. All comments such as this, which are not part of your submission, can be deleted to save space.

Expectation
Validate and test the model for its ability to predict target values; evaluate the model performance, e.g. in terms of accuracy, kappa, correlation of expected and obtained results, dollar value of error, etc. Interpret and report the results.
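As an illustration only, the validation step described above could be sketched in base R as follows. The sketch assumes a simple 70/30 train/test split and uses the built-in mtcars data, with mpg as a stand-in target and wt and hp as stand-in predictors; none of these choices come from the assignment, and your own data, target and predictors will differ.

```r
# Illustrative sketch only: mtcars stands in for the assignment data (assumption).
set.seed(42)                                   # make the random split reproducible
n         <- nrow(mtcars)
train_idx <- sample(n, size = round(0.7 * n))  # 70% of rows used for training
train     <- mtcars[train_idx, ]
test      <- mtcars[-train_idx, ]              # remaining rows held out for testing

# Hypothetical model: mpg as target, wt and hp as predictors
model <- lm(mpg ~ wt + hp, data = train)

# Predict the held-out rows and compare predictions with observed values
pred <- predict(model, newdata = test)
rmse <- sqrt(mean((test$mpg - pred)^2))        # root mean squared error, in units of the target
r    <- cor(test$mpg, pred)                    # correlation of expected and obtained results

cat(sprintf("Test RMSE: %.2f, correlation: %.2f\n", rmse, r))
```

Reporting the error on held-out rows, rather than on the rows used to fit the model, gives a fairer indication of how well the model may predict unseen target values.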
Extension
Deal with extreme cases using Cook's distance. Eliminate multicollinearity using VIF. Validate and test all models. Tabulate model performance before and after optimisation. Do not attempt this extension unless the main objective has been achieved. If not attempting this section then delete it.

Provide an Integrated Solution in R (one page limit)

Include here the text of your analysis with tables and charts, and R code or a reference to the external R code. If analysis or results can only be determined by inspecting the code or running it, marks will be reduced. All comments such as this, which are not part of your submission, can be deleted to save space.

Expectation
Integrate all analytic elements into a process that could be used by the client to solve the WB problem, i.e. to read and transform data, create and validate the model, and produce visualisations, tables and reports. Write the final report and recommendations.

Extension
Evaluate the final model using cross-validation, bagging or boosting; plot and interpret the model performance, e.g. using Gain, Lift, ROC or other appropriate charts. Do not attempt this extension unless the main objective has been achieved. If not attempting this section then delete it.

Further Research and Extensions in R (one page limit)

Include here the text of your analysis with tables and charts, and R code or a reference to the external R code. If analysis or results can only be determined by inspecting the code or running it, marks will be reduced. All comments such as this, which are not part of your submission, can be deleted to save space.

Expectation
Extend your work with features well beyond what was covered in class, to improve the model and to present its results in the best way. Examples: report results on Google Maps or in Leaflet.
Use stunning visualisations. Apply logistic regression, k-NN or Naïve Bayes models for additional insights.

Extension
Introduce a “Wow” factor. Report new and surprising insights. Deliver professional quality. Conduct independent research to determine if your predictions for 2020 confirm or extend previously published results. Do not attempt this extension unless the main objective has been achieved. If not attempting this section then delete it.

Any materials, analysis or reports that do not fit into 7 (seven) pages in total will not be looked at or marked.