Assignment title: Information
Your program should include at least the following core functionalities:

1. Uploading the data (including averaging of expression data).
2. Using two different distance metrics between data points:
   (a) Euclidean distance
   (b) Modified Pearson correlation coefficient
3. Presenting the results to the user both:
   (a) as text files (list of data points in each cluster)
   (b) graphically:
       i. for visualizing clusters of gene expression data you should plot the gene profiles
       ii. for human hereditary disease data you should make two plots of the datapoints in 3D:
           A. a plot of the datapoints using a colouring that reflects the different disease types;
           B. a plot of the datapoints using a different colour for each of the clusters you obtain.
4. Presenting the user with the plot of the sum of squared distances (between each point and its cluster centre, summed over the clusters) as a function of the number of iterations.
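
As an illustration of the two distance metrics and the sum of squared distances named above, here is a minimal Python sketch. It is not a reference solution. The function names are hypothetical. The assignment text does not define the "modified" Pearson correlation coefficient, so the sketch assumes the common choice d = 1 - r, where r is the ordinary Pearson correlation; substitute the definition given in your course material if it differs.

```python
import math


def euclidean_distance(x, y):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def pearson_distance(x, y):
    """Distance derived from the Pearson correlation coefficient r.

    ASSUMPTION: the 'modified' coefficient is taken to be d = 1 - r,
    so d is 0 for perfectly correlated profiles and 2 for perfectly
    anti-correlated ones. The assignment text does not state this.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    denom = math.sqrt(var_x * var_y)
    if denom == 0.0:
        # ASSUMPTION: a constant profile has no defined correlation;
        # it is treated here as uncorrelated (r = 0).
        return 1.0
    return 1.0 - cov / denom


def sum_of_squared_distances(points, labels, centres, distance=euclidean_distance):
    """Sum over all points of the squared distance to the assigned centre.

    labels[i] is the index into centres of the cluster that points[i]
    belongs to.
    """
    return sum(distance(p, centres[k]) ** 2 for p, k in zip(points, labels))
```

Calling sum_of_squared_distances once per iteration of the clustering loop and appending the result to a list gives the values needed for the plot in item 4.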