Assignment title: Information
MMIS 692: Capstone Project in Business Intelligence
Suggestions on Mid-Term Report.
You must include the following in your mid-term report:
1. LP formulation for the production planning problem (as required in Task 1).
2. SQL query to obtain the average machining time, assembly time, finishing time, and regular
production cost per unit for each product type as estimates of the parameters , , , and of the LP model
(as required in Task 2).
3. SQL queries to obtain data for the 3 regressions and regression coefficient estimates , , , and for each
product as required in task 3.
4. Special run cost estimates for producing each unit of product in special runs as required in task 4.
5. SQL queries to obtain data to predict demands for the four products as required in task 5.
6. Predicted demands for the four products in period 41 as required in task 6.
2
Task1. LP formulation:
LP model uses the following parameters:
is the demand (number of units required) for product .
is the cost (in dollars) for producing each unit of product in a regular run.
is the cost (in dollars) for producing each unit of product in a special run.
is the machining time (in minutes) required to produce each unit of product .
is the assembly time (in minutes) required to produce each unit of product .
is the finishing time (in minutes) required to produce each unit of product .
Decision Variables:
: Number of units of produced through regular run.
: Number of units of produced through regular run.
: Number of units of produced through regular run.
: Number of units of produced through regular run.
: Number of units of produced through special run.
: Number of units of produced through special run.
: Number of units of produced through special run.
: Number of units of produced through special run.
Objective function:
Minimize total production cost = .
Hence the objective function must have 8 terms (4 for regular run costs, plus 4 for special run costs)
Cost=
Constraints:
We must ensure that demands for all four products are satisfied.
Demand for :
Demand for :
Demand for :
Demand for :
We must also ensure that time used in regular production does not exceed available time.
Machining time constraint:
Assembly time constraint:
Finishing time constraint:
All decision variables are non-negative.
Note that you should NOT substitute values for the parameters at this stage. Parameter values will be
estimated in tasks 2, 4, and 6. You should substitute the estimated values to solve the LP in task 7 (in your
final report).
3
Task 2. SQL queries to estimate parameters , , , and of the LP model.
You may use any DBMS that you are familiar with for this task.
After loading the data into the production table as specified in the requirements document, I used the
following query to estimate the required parameters:
select productcode,
avg(machinetime) as mtime,
avg(assemblytime) as atime,
avg(finishtime) as ftime,
avg(cost) as regcost
from production
group by productcode
order by productcode;
Present the results of the query in the form of the following table.
Round your estimates to 1 decimal place:
Estimates S1 S2 S3 S4
Machine Time ()
Assembly Time ()
Finish Time ()
Regular Cost ()
4
Task 3. Estimates for the coefficients , , , and for each product .
We assume that regular production cost is a linear function of the machining, assembly, and finishing
times for each product type. That is, , where is the fixed cost per unit, and , , and are respectively the
costs per minute for machining, assembly, and finishing each unit of product type in a regular run.
We first obtain data to run the regression using the following SQL command for product 'S1':
select MachineTime, AssemblyTime, FinishTime, Cost as costAS1
from production
where productcode = 'S1';
Similarly, run SQL queries to obtain data to run regressions for the other 3 products.
Next run regression of Cost on MachineTime, AssemblyTime, and FinishTime for each product and
present the coefficient estimates obtained in the table below.
Also present the R-squared values for the regressions and comment on the reliability of your coefficient
estimates.
Coefficient estimates S1 S2 S3 S4
Intercept ()
MACHINE TIME ()
ASSEMBLY TIME ()
FINISH TIME ()
5
Task 4. Estimate cost for producing each unit of product in a special run .
Substituting the estimates from tasks 2 and 3, we obtain the cost for producing each unit of product in a
special run as:
Present the estimates in the following format:
Product type S1 S2 S3 S4
Special production cost per unit ()
6
Task 5. SQL queries to extract data to predict demands
Load the data from demand.csv into the demand table
Formulate SQL query to obtain the demands for product 'S1' over the 40 periods:
select period, sum(sales) as S1demand
from demand
where productcode='S1'
group by period
order by period;
Similarly, formulate SQL queries to obtain the demands for the other three products 'over the 40 periods
Requirements for Business Intelligence Capstone Project.
A company produces four types of alarm systems – S1, S2, S3, and S4 – and supplies them to a
retailer. It is contractually obligated to meet the demands of the retailer for each alarm system.
Because of limited capacity the company may not have sufficient machining, assembly, and
finishing time available to satisfy the entire demand in each period through its regular
production runs. Contractual obligation requires the company to make up the shortfall in
production through special production runs at higher costs. The company aims to meet the
retailer's demands at minimum cost.
LP Formulation:
Task 1: (10 Points)
Formulate a linear programming (LP) model that may be solved to identify the optimal
production plan for the company in each time period.
Specifically, you must define the decision variables, objective function, and constraints in your
LP model using the following parameters:
In each time period, for each product :
is the demand (number of units required) for product .
is the cost (in dollars) for producing each unit of product in a regular run.
is the cost (in dollars) for producing each unit of product in a special run.
is the machining time (in minutes) required to produce each unit of product .
is the assembly time (in minutes) required to produce each unit of product .
is the finishing time (in minutes) required to produce each unit of product .
Further, assume that:
hours of machining time is available for regular run.
hours of assembly time is available for regular run.
hours of finishing time is available for regular run.
2
LP Parameter Estimation:
You must now use available data to estimate the parameters of the LP formulated in Task 1.
Estimation of , , , and :
The text file "production.csv" contains 7 columns: serialno, batchno, productcode, machinetime,
assemblytime, finishtime, and cost. Using any DBMS of your choice, create a table
PRODUCTION with serialno as its primary key and the 6 other columns as attributes and insert
the 24,000 records from production.csv into the table. Serialno is a unique identifier assigned to
each unit produced by the company; productcode specifies the product type; batchno identifies
the batch in which an item is produced (items are produced in batches of 10 units of a product
type); machinetime, assemblytime, and finishtime specify the time (in minutes) taken by each
process (machining, assembly, and finishing) to produce a unit; the last attribute, cost, specifies
the cost (in dollars) of producing the unit in a regular run.
Task 2: (10 Points)
Formulate an SQL query to obtain the average machining time, assembly time, finishing
time, and cost per unit for each product type as estimates of the parameters , , , and of the
LP model.
In your report, you must:
1. Specify your SQL query to obtain the estimates.
2. Specify your parameter estimates in the table below. Round all estimates to 1 decimal
place.
Estimates S1 S2 S3 S4
Machine Time ()
Assembly Time ()
Finish Time ()
Regular Cost ()
3
Estimation of special run cost :
It is known that the regular production cost is a linear function of the machining, assembly, and
finishing times for each product type. That is, , where is the fixed cost incurred to produce each
unit of , and , , and are respectively the costs per minute for machining, assembly, and finishing
each unit of product during regular run.
Task 3: (6 Points)
Run regressions to estimate the coefficients , , , and for each product .
In your report, please explain how you obtained the data for the 4 regressions to estimate the
coefficients. Then present your coeffient estimates in the table below. Round all estimates to 1
decimal place.
Coefficient estimates S1 S2 S3 S4
Intercept ()
MACHINE TIME ()
ASSEMBLY TIME ()
FINISH TIME ()
The fixed costs associated with the production of each unit of is the same under the regular and
the special run, but the special run costs per minute for machining, assembly, and finishing are
double the regular run costs.
Task 4: (4 Points)
Use the above relationship to estimate that the cost for producing each unit of product in a
special run as .
Present the estimates in the following format:
Product type S1 S2 S3 S4
Special production cost per unit ()
4
Estimation of demand
The text file "demand.csv" contains the retailer's sales data by region (Region1 and Region 2) for
the four alarm systems over the last 40 time periods. For example, the first row shows that 334
units of S1 were sold in the Region1 in time period 1, and the last row shows that 264 units of S4
were sold in Region2 in time period 40.
Create a table called DEMAND with a composite primary key made up of the attributes period,
productcode, and region. Sales is the fourth attribute in the DEMAND table. Insert all 320
records from demand.csv into the DEMAND table.
Task 5: (10 points)
Extract the data needed for predicting demand for S1 by formulating an SQL query that lists
the Period and the sum of the total sales for S1 from both regions in each of the 40 time
periods as S1demand. Similarly, formulate three more SQL queries to obtain the 40 records
for S2demand, S3demand, and S4demand.
Specify the SQL queries to obtain S1demand, S2demand, S3demand, and S4demand.
Task 6: (10 Points)
Use the results returned by the queries formulated in Task 5 in forecasting models to predict
the demands in time period 41 for each product.
You should consider various prediction and forecasting methods that you are familiar with. Use
the method that you think is most accurate in estimating demands. In your report, please present
the estimates for time period 41 in the following format:
Product type S1 S2 S3 S4
Demand () in period 41
5
Optimal LP Solution:
Task 7: (10 Points)
Solve the LP formulated in Task 1 using the parameters estimated in Tasks 2, 4, and 6 to
determine the optimal production plan for period 41.
Report the minimum production cost achievable, number of units of each product type to be
produced under the regular and special production runs, and the resources used during regular
run in the following format:
Minimum cost attainable:
Number of units
produced
S1 S2 S3 S4
Regular Run
Special Run
Resources in regular run Minutes used
MACHINE TIME
ASSEMBLY TIME
FINISH TIME
Sensitivity Analysis:
Task 8. (3+12 = 15 Points).
Perform sensitivity analysis by changing one parameter at a time (leaving all other
parameters fixed at the values used in Task 7) and answer the following questions.
(a) By how much does the total production cost change as the demand for each product
type changes by 1 unit?
(b) At most how much should the company be willing to pay to
(i) Increase the availability of machining time by one hour during regular run?
(ii) Increase the availability of finishing time by one hour during regular run?
(iii) Increase the availability of assembly time by one hour during regular run?
6
Quality Control
The text file "defective.csv" contains 2 columns. The first column defectiveid is an identifier,
and the second column serialno specifies the serial number of a defective product. Create a table
DEFECTIVE with defectiveid as its primary key and insert all 1639 records from defective.csv
into the table. Note that serialno in the DEFECTIVE table is a foreign key that references the
primary key in the PRODUCTION table.
The text file "quality.csv" contains 5 columns containing data from quality control tests run on
2400 batches of items produced. Create a table QUALITY with BatchNo as its primary key and
test1, test2, test3, and test4 as its other 4 attributes. Insert all 2400 records from quality.csv into
the table. Note that batchno in the PRODUCTION table is a foreign key that references the
primary key batchno in the Quality table.
Any batch that contains more than one defective items is deemed to be of poor quality; a batch
with at most one defective item is considered to be of good quality.
Task 9: (10 Points)
Formulate an SQL query that lists all 5 columns from the