The soft k-means clustering algorithm starts from randomly chosen centers and iterates the following two steps:

Centers to Soft Clusters (E-step): After centers have been selected, assign each data point a “responsibility” value for each cluster, where higher values correspond to stronger cluster membership.

Soft Clusters to Centers (M-step): After data points have been assigned to soft clusters, compute new centers.

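We begin with the “Centers to Soft Clusters” step. If we think about the centers as stars and the data points as planets, then the closer a point is to a center, the stronger that center’s “pull” should be on the point. Given k centers Centers = (x_{1}, ..., x_{k}) and n points Data = (Data_{1}, ..., Data_{n}), we therefore need to construct a k × n responsibility matrix HiddenMatrix for which HiddenMatrix_{i,j} is the pull of center i on data point j. This pull can be computed according to the Newtonian inverse-square law of gravitation,

HiddenMatrix_{i,j} = \frac{1 / d(Data_{j}, x_{i})^{2}}{\sum_{t=1}^{k} 1 / d(Data_{j}, x_{t})^{2}}

where d(Data_{j}, x_{i}) is the distance between point Data_{j} and center x_{i}. A common alternative, and the one that involves the stiffness parameter used in this problem, weights each center exponentially:

HiddenMatrix_{i,j} = \frac{e^{-β · d(Data_{j}, x_{i})}}{\sum_{t=1}^{k} e^{-β · d(Data_{j}, x_{t})}}

In this formula, e is the base of the natural logarithm (e ≈ 2.72), and β is a parameter reflecting the amount of flexibility in our soft assignment, called, appropriately enough, the stiffness parameter.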
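The E-step can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes the pull of center i on point j is proportional to e^{-β·d}, that d is the Euclidean distance, and that each point's responsibilities are normalized to sum to 1 over the k centers. The function names `euclidean` and `responsibilities` are hypothetical.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points of equal dimension (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def responsibilities(centers, data, beta):
    """Return the k x n matrix whose (i, j) entry is the pull of center i on point j."""
    # Unnormalized pulls: exp(-beta * distance), one row per center.
    hidden = [[math.exp(-beta * euclidean(x, point)) for point in data]
              for x in centers]
    # Normalize each column so the responsibilities for one point sum to 1.
    for j in range(len(data)):
        total = sum(row[j] for row in hidden)
        for row in hidden:
            row[j] /= total
    return hidden
```

A larger β concentrates almost all of a point's responsibility on its nearest center, so the assignment behaves more like ordinary (hard) k-means; a smaller β spreads responsibility more evenly.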

In soft k-means clustering, if we let HiddenMatrix_{i} denote the i-th row of HiddenMatrix, then we can update center x_{i} as a weighted average of the data points, using the responsibilities in that row as weights. Specifically, we will define the j-th coordinate of center x_{i}, denoted x_{i,j}, as

x_{i,j} = \frac{HiddenMatrix_{i} · Data^{j}}{HiddenMatrix_{i} · \vec{1}}

Here, Data^{j} is the n-dimensional vector holding the j-th coordinates of the n points in Data, \vec{1} is the n-dimensional vector of all ones, and · denotes the dot product.

The updated center x_{i} is called a weighted center of gravity of the points Data.
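The M-step can be sketched as follows. The sketch assumes each new center is the responsibility-weighted mean of the data points, that is, each coordinate is the dot product of a row of HiddenMatrix with the matching coordinates of the points, divided by the sum of that row. The function name `update_centers` is hypothetical, and every row is assumed to have a positive sum.

```python
def update_centers(hidden, data):
    """Return one weighted center of gravity per row of the responsibility matrix."""
    m = len(data[0])  # dimension of the points
    centers = []
    for row in hidden:
        weight = sum(row)  # dot product of the row with the all-ones vector
        center = tuple(
            # Dot product of the row with the j-th coordinates of all points.
            sum(r * point[j] for r, point in zip(row, data)) / weight
            for j in range(m)
        )
        centers.append(center)
    return centers
```

For instance, a row that gives equal responsibility to the points (0, 0) and (2, 4) yields the center (1, 2), while a row that puts all of its weight on one point returns that point.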

Implement the Soft k-Means Clustering Algorithm

Given: Integers k and m, followed by a stiffness parameter β, followed by a set of points Data in m-dimensional space.

Return: A set Centers consisting of k points (centers) resulting from applying the soft k-means clustering algorithm. Select the first k points from Data as the first centers for the algorithm and run the algorithm for 100 steps. Results should be accurate up to three decimal places.
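One possible way to put the two steps together is sketched below. It is an illustration under stated assumptions rather than an official solution: distances are Euclidean (via `math.dist`, available in Python 3.8+), responsibilities are proportional to e^{-β·d} and normalized per point, and centers are updated as responsibility-weighted means. The names `soft_k_means` and `format_centers` are hypothetical, and input parsing is omitted because the exact file layout is not specified here.

```python
import math

def soft_k_means(data, k, beta, steps=100):
    """Run soft k-means, starting from the first k points of data."""
    centers = [tuple(point) for point in data[:k]]
    m = len(data[0])  # dimension of the points
    for _ in range(steps):
        # E-step: unnormalized pulls exp(-beta * distance), one row per center.
        hidden = [[math.exp(-beta * math.dist(x, point)) for point in data]
                  for x in centers]
        # Normalize so that the responsibilities for each point sum to 1.
        totals = [sum(row[j] for row in hidden) for j in range(len(data))]
        hidden = [[value / totals[j] for j, value in enumerate(row)]
                  for row in hidden]
        # M-step: each center becomes the responsibility-weighted mean.
        centers = [
            tuple(sum(r * point[c] for r, point in zip(row, data)) / sum(row)
                  for c in range(m))
            for row in hidden
        ]
    return centers

def format_centers(centers):
    """Format each center as space-separated coordinates with three decimals."""
    return [" ".join(f"{value:.3f}" for value in center) for center in centers]
```

Calling `format_centers(soft_k_means(data, k, beta))` gives one output line per center. For very large β or widely separated points, the exponentials can underflow to zero and make a normalization total zero; a more careful version would subtract the minimum distance for each point before exponentiating.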