Assignment title: Information
PleaseFundThis.com is a web site that allows users to create projects in order to obtain crowdfunding for creative pursuits such as films, music, stage shows, comics, journalism, video games and so on. Each project seeks monetary pledges from people. Most often, projects offer tangible rewards and one-of-a-kind experiences in exchange for pledges.

Similar examples:
http://www.pozible.com/project/31823
https://www.gofundme.com/seespotshred
https://www.kickstarter.com/projects/1926605606/12-is-better-than-6

Data from PleaseFundThis has been gathered and is available in the file named PleaseFundThis.xlsx (downloadable from Blackboard). The columns within the spreadsheet are:
• project_name
• date_launched
• duration_days
• goal_$
• percent_raised
• project_state
• amt_pledged_$
• major_category
• minor_category
• project_updated_count
• city
• region
• number_of_pledgers
• comments_count
• avg_amt$_per_pledger
• project_has_video
• project_has_facebook_page
• facebook_friends_count
• project_has_pledge_rewards
• lowest_pledge_level_$
• highest_pledge_level_$
• total_count_of_pledge_levels
• success

Penny Robinson is a very talented young woman. She can act, sing, write, paint etc. In fact, she can do anything creative. Penny's only problem is that she doesn't have any money. She wants to obtain money via crowdfunding to fund a creative project. She knows that you have a spreadsheet full of data. She wants to know if you can give her any advice about creating a crowd-funded project based on the data in the file. E.g. what will make her project more likely to succeed?

Your task is to:
• Explore the data. Find information that you think will be useful to Penny (and to other people who want to create similar crowd-funding projects).
• Attempt to determine which attributes of a project (if any) are the most important in terms of whether the project succeeds (meets its required funds).
• List and present the most important attributes and justify why this is so (with the use of BigML screen dumps and supporting explanation / discussion).

You will need to do this based around two models. In the first model, use any / all of the columns in the dataset. However, some columns probably aren't suitable for analysis. BigML may automatically decide that some columns are not suitable (e.g. Project Id: as every row has a unique value, it provides no analytical use). In other cases, you may need to select which columns are not suitable for analysis. In the second model, you must exclude these columns: percent_raised, amt_pledged_$, avg_amt$_per_pledger, project_state, number_of_pledgers and project_updated_count.

Suggestion: Save the Excel file as a CSV file before loading it into BigML. This generally allows the column headings to be used as field names rather than 'field1', 'field2' …

Part 2 (1500-2500 words approx).

Great Eastern University is a university located in Melbourne, Australia. It has a student population of around 20,000 students (about 90% undergraduate). Many of the postgraduate students are international students. Postgraduate students are an important funding area for the university.

The university has an extensive data warehouse that has gathered data from a number of the university's key IT systems. Spreadsheets are the main analysis tool used by the university. They are the number one tool used to prepare data for reports and to exchange data between managers and administrators.

Julie Andrew, CIO of Great Eastern, has recently attended a seminar on Big Data. She thinks that Great Eastern should place more emphasis on this area. At the seminar, tools such as PowerPivot, PowerBI, Tableau, BigML, Geospatial tools, Google Analytics and many others were mentioned.
Julie knows that some of the biggest challenges facing the university are improving student retention, increasing enrolments (which have dropped in recent years), and improving the student experience (scores in student unit reviews have been dropping in recent semesters). Julie hopes that some of the seminar topics will assist in providing solutions to the university's biggest challenges.

A guest speaker at the conference was Ken Rudin (the Director of Analytics at Facebook). While discussing the topic of Big Data and Business Intelligence, Rudin mentioned that organisations must "focus on impacts, not insights".

Julie needs additional information about Big Data and BI tools before moving forward. Julie has contacted your team with the following questions for you to answer.

a. What benefits, if any, are there for Great Eastern in using data analysis tools such as PowerPivot, PowerBI or Tableau compared to Excel spreadsheets?
b. What sort of data from the Great Eastern data warehouse could be used in a product such as BigML? What benefits might be realised?
c. How could Great Eastern use Social Media, Social Media Analytic Tools and Geospatial tools to:
• Increase the number of students enrolled in the university
• Improve the student experience for all students
• Increase student retention
d. What is the meaning of the quote by Ken Rudin?
e. What could Great Eastern do to ensure that it follows Ken's suggestion?