Assignment title: Information
Assignment 1
CIS 2002
SEMESTER 1, 2016
Assignment 1 specification
Description Marks out of Wtg(%) Due date
ASSIGNMENT 1 100.00 7% ONC / 10% EXT 20 April 2016
IMPORTANT INFORMATION
You must submit the assignment electronically by the due date via the link on the study desk. Instructions will be provided on the course study desk.
You must use the official USQ data modelling and normalization methodology. This methodology is based on Clive Finkelstein's techniques (SR 2.1 and SR 2.2) and all the examples in the lectures, study book and the tutorials use this methodology.
If you do not use the USQ methodology, you will probably be awarded a mark of zero.
It is perfectly acceptable if you submit neat hand-drawn ERD's. Alternatively, you might wish to use Word. If you use a CASE or drawing tool, you must adapt the drawing to conform to the USQ methodology.
SECTION A (APPLIED DATABASE THEORY) (20 marks)
You are required to write an opinion article (also known as an "op-ed" piece) for a technology magazine on the following topic:
Big Data and Traditional Databases
Some experts argue that the reign of relational databases is finally over now that the amount of data that needs to be maintained has grown in the internet age. Others argue that big data analysis still heavily requires the concepts of relational databases and SQL experience.
The concept of Relational Database was developed in 1970s (described by Edger F. Codd and popularized C. J. Date). The relational databases and SQL was developed when data was assumed to be neat, structured and static.
Big Data is a broad term for comprehensive data sets that are huge in size, unstructured in format, highly recurring in frequency and volatile in nature.
Write an essay describe the relational model and some of the perceived limitations. In the essay list and describe at least four (4) possible alternatives to relational database model. For each alternative model you need to discuss some of the advantages and disadvantages over traditional relational model.
Your essay should be no longer than 500 words.
You are required to extensively research on the relevant topics, take a stance (role of relational databases in big data) and present concise and workable arguments.
There is no requirement for formal in-text referencing in this question. However, it is recommended to put a list of references at the end of the article showing the published materials that you researched while writing this essay (not part of the word count).
SECTION B (SQL) (40 marks 4 marks per question)
For each question, three marks will be awarded for the SQL and one mark for the correct output.
The following E-R diagram represents the JustLee Books database. The script for the table creation is located on the Moodle website for the CIS2002 course under the assignment specifications. You will need to make sure that you re-run the script to reset the tables to their default state.
In this question, you will use the JustLee books database. The JustLee database including appropriate data will be made available on the USQ Oracle server. You must use this data.
Full description of the JustLee database is found in Appendix A in the textbook on page 511. You may also get most of the details by using data dictionary on the oracle server.
Write SQL queries to solve the following specifications. Include the query AND THE OUTPUT. A screen dump of the output is acceptable. Show as many rows as you can. A screen dump is usually done via the ALT + PRNT SCRN command sequence or use the 'Snipping Tool' under windows to capture parts of the screen.
While the output helps to understand your solution, you should not be analyzing the output of the query in detail. As long as you are confident that your query corresponds to the question completely, your output may not be significant. You can have an output that says NO ROWS FOUND and it could be a perfectly valid output as long as your query fulfils the requirement. Explain why the query may not contain any output in this instance.
1. Display full name all the customers in a single column (by concatenating firstname and lastname) that have NOT been referred by existing customers identified by customer numbers 1003, 1004, 1005 and 1006. Your solution needs to use the IN operator.
2. List the names of all the states and region of customers currently living in the Northern region (i.e. region begins with 'N'). Each state should only appear once and the result should be by region name in ascending order.
3. Display the ISBN, title, category and discount of books in the BOOKS table where the book has a discount which is 5% of the retail price of the book.
4. Display the book title, cost, retail and calculate the profit and name the new field profit for all the books where the profit from the sale of the book is more than 65% of the cost of the book.
5. Display unique orders which have an order item for a book that has a discount value associated with it. Rename the field as 'ORDERS WITH DSCOUNTS'. Sort the output by order date of the selected orders ascending.
6. Display the ISBN, book title, cost and retail price of all the books published during the month of December in any year. Order the list by the book title.
7. Display the ISBN, quantity, price paid for the book (rename the column 'Paid Each') and the total price paid (i.e. quantity * 'Paid Each') for all orders in the ORDERITEMS table when the quantity is greater than 1.
8. Display publisher name, book title and retail, price for all the books after subtracting any available discount for that book, only for those books that have a discount. Name the new field 'DISCOUNTED PRICE' and order by publisher name ascending and discounted price of the book in descending order.
9. List ISBN, title, retail, category and published date of all the books that have the (category of 'COMPUTER' or where the category contains the string 'FAMILY' anywhere in the category description) AND where (price of the books is more than 20 dollars and where the published date is before the end of the financial year for 2003/2004 – i.e. 30 June 2004).
10. List the customer number, first name, last name, order number and order date for all the orders which have not been shipped. Order the resulting set by the order date in ascending order.
SECTION C (Data Modelling) (40 marks)
Construct data models for the following specifications. Include an ERD and a list of relations (entity list). Your relations must show all attributes, primary keys and foreign keys. You must use the USQ (Finkelstein) methodology as described in your Study Book, the lectures and the tutorials.
Question 1 (10)
A specialist clinic records patient details and their appointments. Each patient record consists of the patient id, name, address, contact number, date of birth and referring doctor name. When a patient books an appointment, the system records the date and time of the appointment, specialist doctor name, consultation duration and fees. Other details of specialist doctor include area of specialty and registration number.
Question 2 (10)
A pilot may be assigned to several flights as a crew member over time. The crew for a flight comprises a minimum of two pilots but could have up to four. For a pilot, we store the name, flying license number and birth date. For a flight, we store the flight code, origin location, destination location and type of aircraft. We also need to store the date a pilot is assigned to a flight, the role of the pilot and the total kilometres and hours the pilot has been flying in a flight.
Question 3 (10)
A truck driver may be assigned to many teams over time. A team comprises of a minimum of one driver but could have up to four. For driver, we store the name, license number and birth date. For team, we store the team identifier and base location. We also need to store the date a driver is assigned to a team, the date a driver leaves the team and the total kilometres the driver has driven within the team.
Question 4 (10)
A questionnaire is a research instrument consisting of a series of questions. We need to develop a model to store questionnaires and its underlying questions. A questionnaire records questionnaire name, primary contact name, start date and expected end date. A questionnaire may have many questions. Each question has a question number, question text and a category description for the question.
Prepare the following for both questions:
1 An ER diagram: Show all entities, relationships, cardinalities and optionalities. Also, include all intersection entities but do not create entities that are not specifically covered by the specification. You must use the Finkelstein methodology as per the study book and tutorials.
2 A list of relations (entity list). Produce complete relations for all entities and attributes. Show all primary and foreign keys. Include all attributes that are specifically mentioned and all key attributes. You may need to create primary and foreign keys that are not specifically mentioned but do not create any other additional attributes.
MARKING CRITERIA
SECTION A (20 marks)
1. Clearly listed and discussed other solutions for storing and retrieving data. :
(a) Discussed relational databases and SQL;
(b) Discussed advantages of other solution and for what purpose;
(c) Discussed limitations of other solution. (6)
2. Discussion shows understanding of different types of databases for different usage. (6)
3. Presented in easy to read and easy to understand manner. Solution is grammatically and structurally sound and presentation is neat and professional. (6)
4. Shown research source of information for the essay (2)
SECTION B (40 marks, 10 questions 4 marks each)
1. Three marks awarded for each correct SQL statement and one mark for the correct output.
2. Alternative approaches to the model answer will often be accepted unless they do not follow the rules / requirements set out in the specification, are poorly optimised or are poorly constructed (SQL) or utilise features of SQL not covered at this stage in the course.
3. Part marks may be awarded if an answer only has a small problem or an alternative solution is presented that works but is not ideally optimised / constructed.
SECTION C (40 marks, 4 questions 10 marks each)
1. Entities – no missing entities, appropriate names, no redundant entities, etc. (3)
2. Cardinalities and optionalities all shown and correct. (2)
3. Complete list of relations, showing all applicable attributes, primary keys and foreign keys. (4)
4. Sophistication: well presented solution; good layout; innovative approach; correct diagrams/notation; solution easy to read and understand; solution comprehensive (1)