I am pretty new to clustering, so please be patient.
I have a set of points, and each point has a weight. I need to group these points into N clusters (N is defined).
I need these clusters to satisfy two conditions:
- The points of a cluster must be spatially connected.
- If a cluster has points with high weights, it must be smaller (fewer points). Conversely, if the sum of the weights is small, it must have more points.
I have read another post that did something very similar.
It defined the distance between points by inserting the weight as a fourth dimension. Then it defined several parameters to give more importance either to the distance or to the weights. I cannot define these parameters (I think they would have to change for each example I try).
Also, this other post did not recommend any clustering algorithm…and I don't know where to start.
Thanks for the help!
PS: By the way, it would help if the algorithm is very fast.
Best Answer
For anybody who wants to know the answer, this is what I finally did:
I implemented a normal K-Means algorithm, but with some modifications:
- The centroid of each site is computed as site = Sum(p * weight^alpha) / Sum(weight^alpha), summed over all the points p that belong to that site.
- The squared distance between point p and site s is computed as squareDistance(p,s) * weight^alpha, where alpha is some constant > 0.
The only problem is that my implementation is very slow 🙁
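For reference, here is a minimal NumPy sketch of the two modifications above. It is not the original implementation, just one way it could look. The function name `weighted_kmeans`, the random initialisation, the stopping rule and the empty-cluster handling are all my own assumptions. I also assume that "weight" in the distance formula means the weight of point p. Read that way, the factor weight^alpha scales every site's distance for a given point by the same amount, so it does not change which site is nearest. Only the weighted centroid actually moves the sites. Computing all point-to-site distances in one array operation, instead of looping over points, may also help with speed.

```python
import numpy as np

def weighted_kmeans(points, weights, n_clusters, alpha=1.0, n_iter=50, seed=0):
    """Lloyd-style k-means with a weighted centroid and a weighted distance.

    points:  (n, d) array of coordinates
    weights: (n,) array of positive weights
    Returns (labels, sites).
    """
    points = np.asarray(points, dtype=float)
    w = np.asarray(weights, dtype=float) ** alpha  # weight^alpha for each point
    rng = np.random.default_rng(seed)
    # Assumption: start from n_clusters distinct points chosen at random.
    sites = points[rng.choice(len(points), size=n_clusters, replace=False)]
    labels = np.full(len(points), -1)

    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every site, shape (n, k).
        sq_dist = ((points[:, None, :] - sites[None, :, :]) ** 2).sum(axis=2)
        # Modified distance: squareDistance(p, s) * weight^alpha.
        # The factor is constant per point, so the argmin over sites is unchanged.
        cost = sq_dist * w[:, None]
        new_labels = cost.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable, so the sites are too
        labels = new_labels
        # Modified centroid: Sum(p * weight^alpha) / Sum(weight^alpha).
        for k in range(n_clusters):
            mask = labels == k
            if mask.any():  # assumption: an empty cluster keeps its old site
                sites[k] = (points[mask] * w[mask, None]).sum(axis=0) / w[mask].sum()
    return labels, sites
```

Usage would be something like `labels, sites = weighted_kmeans(xy, weights, n_clusters=5, alpha=2.0)`. Note that this sketch does nothing to enforce the spatial-connectivity or cluster-size conditions from the question. It only reproduces the two formulas.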