Assignment title: Management
What should be presented in your project presentation?
1. Explain your data
1-1) Your data source: where your data was downloaded from
1-2) What did you have to do to download the data to your Hadoop
system? Which API, method, or tool did you use, or did you write
your own script/program to process it?
1-3) Were there any issues you encountered? How did you resolve them,
or did you change to a different data source? If so, what was the
difference between the two data sets (structure-wise, content-wise;
was it simpler to get, or did it require a more complex API or processing)?
1-4) Show your original data format: was it an unstructured log file,
JSON, XML, or some other semi-structured format?
1-5) Show your original data contents: what information was in your
data?
2. Explain your data processing method and your data format
2-1) Did you perform a data transformation process or not? If you did,
what data transformation did you have to do to convert your
unstructured/semi-structured data to a structured file format
(or any other format)? Which API or method did you use, or did you
write your own script?
2-2) What is your transformed data format?
2-3) Which information did you extract from your original data into
your structured/processed files?
2-4) Which information do you want to find out from your data? That
is, what are the final data mining questions you want to query
over your data?
3. Explain your choice of big data processing system
3-1) Which big data processing system did you choose to use for
your transformed data? Why?
3-2) Overview or explanation of your big data processing system
3-3) What was your input format?
3-4) What was the output (result) or structure that your big data
processing system generated from your data?
Explain your database schema: tables, table contents, and keys.
3-5) Which processes, steps, or queries did you use for the CRUD
operations on the database generated by your system?
4. Present your data mining result
4-1) What were your queries?
4-2) What information did you find out?
4-3) Present your result
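
Illustration for items 2-1 to 2-3: the kind of transformation those items ask about could look like the sketch below. It assumes the original data is newline-delimited JSON (one object per line) and the transformed format is CSV. The function name json_lines_to_csv and the field names id, timestamp, and text are hypothetical placeholders, not part of the assignment; substitute whatever your own data contains.

```python
import csv
import io
import json

# Hypothetical field names; replace with the fields your own data contains.
FIELDS = ["id", "timestamp", "text"]


def json_lines_to_csv(lines, out, fields=FIELDS):
    """Write one CSV row per valid JSON object and return the row count."""
    # extrasaction="ignore" is defensive; rows are already limited to `fields`.
    writer = csv.DictWriter(out, fieldnames=fields, extrasaction="ignore")
    writer.writeheader()
    rows = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines instead of aborting the whole run
        if not isinstance(record, dict):
            continue  # only JSON objects map onto CSV columns
        # Keep only the chosen fields; missing ones become empty cells.
        writer.writerow({f: record.get(f, "") for f in fields})
        rows += 1
    return rows


if __name__ == "__main__":
    sample = ['{"id": 1, "timestamp": "t0", "text": "hi", "extra": 5}', "not json"]
    buffer = io.StringIO()
    count = json_lines_to_csv(sample, buffer)
    print(count, "row(s) written")
    print(buffer.getvalue())
```

In a presentation, a script like this would answer 2-1 (own script), 2-2 (CSV), and 2-3 (the fields kept); the resulting file could then be loaded into whichever big data processing system you describe in section 3.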