Assignment title: Information


Data Warehouse Project 75 POINTS DUE: See Syllabus. THIS IS AN INDIVIDUAL ASSIGNMENT

Overview This assignment provides a hands-on application of the concepts related to creating and populating a data warehouse. It involves a series of activities from thinking from a user/management perspective to determine the most important questions in a business situation, through design of a data warehouse, writing SQL to extract data from existing operational databases and transforming it into the data warehouse format, to the SQL that uses the data warehouse to answer the questions proposed.

Situation Description You work for a large corporation that has just purchased 2 hotel and resort corporations each consisting of over 100 hotels. Each Corporation operates a custom database. You are provided the data dictionary and ER diagrams for the two operational databases. Management would like you to design a data warehouse that allows them to achieve the most competitive advantage possible.

  To achieve this you decide to do the following: 1 Review the Hokie Resort situation description, the ER diagrams for the two operational databases, and the data dictionaries for the two systems. 2 Write the three most important questions that management must answer to achieve a competitive advantage in the hotel market. Submit the questions and a brief (1-2 pages) explanation/justification of why these are the most important questions. (10 points) 3 Design a Data Warehouse Star or Snowflake schema that is sufficient for addressing these questions. Submit an ER Model and the Oracle/MySQL/SQL Server DDL* that implements your data warehouse schema (Note: you do not have to implement the schema in a DBMS). (15 points) 4 Analyze the ER diagram and data dictionary from both of the operational databases to determine if the two operational hotel databases have the data needed for your data warehouse. For each DB, create a mapping that shows the tables from that DB that are used to create rows in your data warehouse tables. For each data warehouse table, describe how the operational data is aggregated to create a row in the table. Submit your mapping and aggregation summary in the following format.

Datawarehouse Table Operational DB Table Aggregation/Sum PatientDim Corp1: Patient No aggregation, each row is an instance in the DW.

PatientDim Corp2: Customer No aggregation, each row is an instance in the DW. MediationsFact Corp1: Prescriptions Count and average amount of drug given are created.

MediationsFact Corp2: DrugsProvided Count and average amount of drug given are created.

(Note: If an operational database does not contain the data needed for your data warehouse design, then propose revisions to the existing tables in the DB or define additional tables to be populated in the DB so that it will contain the data needed for your data warehouse). (10 points) 5 Write the SQL necessary to extract and process the data from the two operational databases so that it will be suitable for your data warehouse. The mapping you made in question 4 should help in this process. This requires writing SQL procedures that include SELECT statements from the operational DBs and INSERT INTO the data warehouse tables (Note: you do not have to make the procedures work). Be sure that each data warehouse fact and dimension table are extracted from each operational DB (Hint: Don't forget the time dimension). Pay attention to the correct grouping and aggregation necessary to transform the operational data into the form needed for your data warehouse. An example of the procedure file is provided from a previous semester using the Hokie Hospital problem. Your task is to make a similar procedure that will extract the data from the operational databases into your data warehouse design. Submit your procedures with a .sql file extension so that we can review them within. While any logical organization of procedures is acceptable (provide a brief justification/rationale for others), the preferred approach is to create one SQL procedure for each dimension or fact table in your data warehouse. These procedures should accesses both operational DBs and be as simple as possible. (20 points) 6 Explain how users will query the data warehouse to easily determine the answer to the three most important questions (from step 1). Be sure to use language that the users of the system will understand. (10 points) 7 Write the SQL that will be needed to answer the three most important questions using your data warehouse. This requires writing 3 SQL statements/procedures. (10 points)

* You should download and install Notepad++ or other editor that helps with SQL syntax.