Assignment title: Information


CSI6110 Software Development Processes CSI6110 Assignment Page 1 of 7 Mike Johnstone – 10/10/2016

ASSIGNMENT 2 – INDIVIDUAL AND GROUP ASSESSMENT

Assignment value: 35%
Due: Friday, 28th of October 2016, at 11:59pm, WST.

Related learning outcomes from the unit outline:
• Apply techniques to estimate software project costs, duration and effort.
• Identify software capability determination and evaluation techniques.
• Reflect on the need for measurement and software metric formulation and evaluation.

Background:
The Personal Software Process (PSP), as espoused by Watts Humphrey, attempts to focus on improving individual performance in the upper levels of the CMM. The PSP recommends establishing metrics to measure aspects of the software process and testing personal competency by writing programs measured with these metrics.

You are required to create two small programs in Python. As in Assignment#1, you need to estimate how long you think it will take you to write each program (after reading the brief). You must then keep accurate records of two things:
1) how long it takes you to write the program; and
2) any defects (bugs) that you find and fix as part of the development.

For each of Task#1 and Task#2, make your best estimate of:
1) Program size (i.e. how many lines of code you think you will write); and
2) Time to complete (i.e. the time it will take you to write the program in hours or days, including time for defect detection/removal).

Note: It is essential that the estimates are calculated before you start writing your programs.

Task#1 (individual-5%): Write a program to count (in LOC) the total program size of a Python program and print out a count of the LOC. Consider whether a comment counts as a "line". Justify your answer in your submission.

Task#2 (individual-15%): Read Appendix 1 for details on the programming task, the tests required and some background on the techniques used (if you are not familiar with them).
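As a rough illustration of the counting question raised in Task#1, the sketch below shows one possible counting rule. The rule itself is an assumption, not something the brief prescribes: blank lines never count, comment-only lines count only on request, and docstrings or multi-line strings are treated as ordinary lines. The function name `count_loc` and the sample text are hypothetical.

```python
def count_loc(source: str, count_comments: bool = False) -> int:
    """Count physical lines of Python source text under a simple rule.

    Assumed rule (hypothetical, not prescribed by the brief): blank lines
    never count; lines holding only a comment count only when
    count_comments is True. Docstrings and multi-line strings are counted
    as ordinary lines.
    """
    total = 0
    for raw_line in source.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith("#") and not count_comments:
            continue  # skip comment-only lines
        total += 1
    return total


# Invented sample text: one import, one blank line, one comment, one statement.
sample = "import sys\n\n# a comment\nprint(sys.argv)  # trailing comment\n"
print(count_loc(sample))
print(count_loc(sample, count_comments=True))
```

Whichever rule is chosen, applying the same rule to every program keeps the size measurements comparable.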
Task#3 (group-10%): In your group (assigned by your lecturer), pool your values for time to complete (in person-days, for example) and lines of code for Task#1 and Task#2. Many predictive models, such as COCOMO, are of the general form:

Person-Months = a*KLOC^b

where "a" and "b" are defined constants and "*" and "^" are the usual multiplication and exponentiation operators (in your model, you would replace person-months with person-days or hours and KLOC with LOC, as your programs are small).

When using this type of model, two questions arise. First, where do these constants come from? Second, could this model be used as-is, or would it need to be calibrated for each organisation?

The constants come from fitting a curve (the above equation) to known data about projects (their KLOC and PM results). Calibration is essential because the projects used to derive the original values for the COCOMO modes are not necessarily in the same domain as those of the software development firm that wishes to use the model.

A common method used to fit curves to data is least-squares regression. For a linear model, the equation is often expressed as:

y = β1*x + β0

where "β1" is the slope of the line and "β0" is the point where the line crosses the y-axis (called the y-intercept).

Calculate constants for a predictive model, based on your pooled data. The COCOMO equation will need to be transformed into a linear equation to use the linear model above. Use:

Log Person-days = Log a + b * Log LOC

Here y = Log Person-days, x = Log LOC, β0 = Log a and β1 = b.

Task#4 (individual-5%): Having determined a model that predicts time from LOC, use your actual value for LOC from Assignment#1 to estimate the time to complete Assignment#1. Is this estimate similar to the actual time taken in Assignment#1? Discuss the difference or similarity.

Task#5: When complete, submit your task deliverables, programs, any snapshots and time sheets via BlackBoard (MyECU).
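For Task#3, the log transformation can be sketched as follows. This is a minimal sketch, not a prescribed method. It assumes "Log" means base 10 (any base works if used consistently), and the pooled data are invented purely for illustration. The function name `fit_power_law` is hypothetical.

```python
import math


def fit_power_law(loc_values, effort_values):
    """Fit effort = a * LOC**b by least squares on the log-transformed data."""
    xs = [math.log10(v) for v in loc_values]
    ys = [math.log10(v) for v in effort_values]
    n = len(xs)
    x_avg = sum(xs) / n
    y_avg = sum(ys) / n
    # Slope and intercept of the line Log effort = Log a + b * Log LOC.
    slope = (sum(x * y for x, y in zip(xs, ys)) - n * x_avg * y_avg) / (
        sum(x * x for x in xs) - n * x_avg ** 2
    )
    intercept = y_avg - slope * x_avg
    return 10 ** intercept, slope  # a = 10**(Log a), b = slope


# Hypothetical pooled data (LOC, person-hours), invented for illustration only.
pooled_loc = [25, 40, 60, 110, 150]
pooled_hours = [1.5, 2.0, 3.5, 6.0, 9.5]

a, b = fit_power_law(pooled_loc, pooled_hours)
print(f"a = {a:.4f}, b = {b:.4f}")
print(f"Predicted hours for 80 LOC: {a * 80 ** b:.2f}")
```

Converting the intercept back with `10 ** intercept` recovers "a" in the original units, so the fitted model can be used directly as time = a * LOC^b.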
If you are not sure about any of the requirements for this assignment, please check with your lecturer.

Important: By providing a submission, you are declaring that the submission is entirely your own work, except where reference is made to the work of others (using the ECU referencing standard, available at: http://www.ecu.edu.au/library/pdf/refguide.pdf). If this is found not to be correct, you will be subject to penalties ranging from loss of marks to exclusion from the University.

Appendix 1

Note: This information is copyright © Carnegie Mellon University. Used with permission.

Program 2 requirements

Write a program to:
• calculate the linear regression parameters β0 and β1 and correlation coefficients rx,y and r² for a set of n pairs of data,
• given an estimate, xk, calculate an improved prediction, yk, where yk = β0 + β1xk

Table 1 contains historical estimated and actual data for 10 programs. For program 11, the developer has estimated a proxy size of 386 LOC.

Thoroughly test the program. At a minimum, run the following four test cases.
• Test 1: Calculate the regression parameters and correlation coefficients between estimated proxy size and actual added and modified size in Table 1. Calculate plan added and modified size given an estimated proxy size of xk = 386.
• Test 2: Calculate the regression parameters and correlation coefficients between estimated proxy size and actual development time in Table 1. Calculate time estimate given an estimated proxy size of xk = 386.
• Test 3: Calculate the regression parameters and correlation coefficients between plan added and modified size and actual added and modified size in Table 1. Calculate plan added and modified size given an estimated proxy size of xk = 386.
• Test 4: Calculate the regression parameters and correlation coefficients between plan added and modified size and actual development time in Table 1.
Calculate time estimate given an estimated proxy size of xk = 386.

Expected results are provided in Table 2.

Program   Estimated    Plan Added and   Actual Added and   Actual Development
Number    Proxy Size   Modified Size    Modified Size      Hours
1         130          163              186                15.0
2         650          765              699                69.9
3         99           141              132                6.5
4         150          166              272                22.4
5         128          137              291                28.4
6         302          355              331                65.9
7         95           136              199                19.4
8         945          1206             1890               198.7
9         368          433              788                38.8
10        961          1130             1601               138.2

Table 1

Expected results

         Expected Values                                   Actual Values
Test     β0       β1         rx,y     r²       yk          β0    β1    rx,y    r²    yk
Test 1   -22.55   1.7279     0.9545   0.9111   644.429
Test 2   -4.039   0.1681     0.9333   0.8711   60.858
Test 3   -23.92   1.43097    0.9631   0.9276   528.4294
Test 4   -4.604   0.140164   0.9480   0.8988   49.4994

Table 2

Regression Overview

Linear regression is a way of optimally fitting a line to a set of data. The linear regression line is the line that minimises the sum of the squared vertical distances from the data points to the line. The equation of a line can be written as

y = β0 + β1x

In Figure 1, the best fit regression line has parameters of β0 = -4.0389 and β1 = 0.1681.

[Figure 1: Actual Development Hours (y-axis) plotted against Estimated Proxy Size (x-axis), with the fitted line y = -4.0389 + 0.1681x]

Correlation Overview

The correlation calculation measures the strength of the linear relationship between two sets of numerical data. The correlation rx,y can range from +1 to -1.
• Results near +1 imply a strong positive relationship; when x increases, so does y.
• Results near -1 imply a strong negative relationship; when x increases, y decreases.
• Results near 0 imply no linear relationship.
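To make the two overviews above concrete, the following sketch fits a least-squares line to the Estimated Proxy Size and Actual Development Hours columns of Table 1 and computes their correlation. It uses the standard least-squares and Pearson formulas. The function names are hypothetical and this is one possible arrangement, not a prescribed solution.

```python
import math


def linear_regression(xs, ys):
    """Return (beta0, beta1) of the least-squares line y = beta0 + beta1 * x."""
    n = len(xs)
    x_avg = sum(xs) / n
    y_avg = sum(ys) / n
    beta1 = (sum(x * y for x, y in zip(xs, ys)) - n * x_avg * y_avg) / (
        sum(x * x for x in xs) - n * x_avg ** 2
    )
    beta0 = y_avg - beta1 * x_avg
    return beta0, beta1


def correlation(xs, ys):
    """Return the Pearson correlation coefficient r between xs and ys."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    numerator = n * sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum(x * x for x in xs) - sum_x ** 2)
        * (n * sum(y * y for y in ys) - sum_y ** 2)
    )
    return numerator / denominator


# Table 1 columns: Estimated Proxy Size and Actual Development Hours.
proxy_size = [130, 650, 99, 150, 128, 302, 95, 945, 368, 961]
actual_hours = [15.0, 69.9, 6.5, 22.4, 28.4, 65.9, 19.4, 198.7, 38.8, 138.2]

beta0, beta1 = linear_regression(proxy_size, actual_hours)
r = correlation(proxy_size, actual_hours)
print(f"beta0 = {beta0:.4f}, beta1 = {beta1:.4f}")
print(f"r = {r:.4f}, r^2 = {r * r:.4f}")
print(f"yk for xk = 386: {beta0 + beta1 * 386:.3f}")
```

The printed values should agree, to the precision shown there, with the Test 2 row of Table 2 and with the line drawn in Figure 1.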
Calculating regression and correlation

The formulas for calculating the regression parameters β0 and β1 are

$$\beta_1 = \frac{\sum_{i=1}^{n} x_i y_i - n\,x_{avg}\,y_{avg}}{\sum_{i=1}^{n} x_i^2 - n\,x_{avg}^2}$$

$$\beta_0 = y_{avg} - \beta_1 x_{avg}$$

The formulas for calculating the correlation coefficient rx,y and r² are

$$r_{x,y} = \frac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{\sqrt{\left[n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2\right]\left[n\sum_{i=1}^{n} y_i^2 - \left(\sum_{i=1}^{n} y_i\right)^2\right]}}$$

$$r^2 = r_{x,y} \cdot r_{x,y}$$

where
• Σ is the symbol for summation
• i is an index to the n numbers
• x and y are the two paired sets of data
• n is the number of items in each set x and y
• xavg is the average of the x values
• yavg is the average of the y values

Appendix 2 - PSP Time Recording Log

Student                  Date

Start Date and Time      Stop Date and Time      Delta Time      Comments

Appendix 3 – Defect Types

PSP Defect Type Standard

Type Number   Type Name        Description
10            Documentation    Comments, messages
20            Syntax           Spelling, punctuation, typos, instruction formats
30            Build, Package   Change management, library, version control
40            Assignment       Declaration, duplicate names, scope, limits
50            Interface        Procedure calls and references, I/O, user formats
60            Checking         Error messages, inadequate checks
70            Data             Structure, content
80            Function         Logic, pointers, loops, recursion, computation, function defects
90            System           Configuration, timing, memory
100           Environment      Design, compile, test, or other support system problems