I'm a bit confused by the concept of K-medoids.

It seems that the original algorithm (PAM) describes a swap step that swaps only **one** of the medoids with one non-medoid point from the **whole dataset** (not only from that medoid's cluster), choosing the swap that yields the smallest error at that moment.

First of all, is my understanding correct?

My original idea (which I implemented) swaps **all** of the medoids in one iteration: each medoid is replaced by the point from **its cluster** that yields the smallest error within that cluster. Is this a bad approach?

I was misled by K-means, which computes each centroid based only on the points in its cluster, but now I'm not sure whether the same logic makes sense for K-medoids.

Also, just to add: I implemented a K-means++-like initialization to optimize the procedure.
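For concreteness, here is a rough sketch of what I mean by K-means++-like seeding, written against a precomputed distance matrix (the function name and interface are just for illustration):

```python
import numpy as np

def kmedoids_plusplus_init(D, k, rng=None):
    """K-means++-style seeding on a precomputed distance matrix D (n x n).

    The first medoid is chosen uniformly at random; each subsequent medoid
    is sampled with probability proportional to the squared distance to the
    nearest medoid chosen so far.
    """
    rng = np.random.default_rng(rng)
    n = D.shape[0]
    medoids = [int(rng.integers(n))]
    for _ in range(k - 1):
        # Squared distance of each object to its nearest chosen medoid.
        d2 = D[:, medoids].min(axis=1) ** 2
        probs = d2 / d2.sum()
        medoids.append(int(rng.choice(n, p=probs)))
    return medoids
```

Already-chosen medoids have zero distance to themselves, so (with distinct points) they cannot be sampled again.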


#### Best Answer

K-Medoids is the problem of finding a set of $k$ representative objects in the dataset so as to minimize the total dissimilarity of each object from its nearest representative.

PAM is an approximate solution to this problem that falls into the category of local search algorithms. It consists of two phases:

the BUILD phase, in which an initial solution is constructed. It starts by selecting as the first medoid the object with the lowest mean dissimilarity w.r.t. the whole dataset (i.e., the most central object). The remaining medoids are then selected iteratively, each time picking the object that most reduces the overall dissimilarity of each object from its nearest medoid.
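The BUILD phase can be sketched as follows; this is a minimal illustration over a precomputed distance matrix, not a reference implementation:

```python
import numpy as np

def pam_build(D, k):
    """BUILD phase of PAM on a precomputed distance matrix D (n x n).

    The first medoid is the most central object (lowest mean dissimilarity);
    each subsequent medoid is the object whose addition most reduces the
    total distance of every object to its nearest medoid.
    """
    n = D.shape[0]
    medoids = [int(D.mean(axis=1).argmin())]
    for _ in range(k - 1):
        nearest = D[:, medoids].min(axis=1)  # current cost contribution per object
        best_gain, best_j = -1.0, None
        for j in range(n):
            if j in medoids:
                continue
            # Cost reduction if j were added as a medoid.
            gain = np.maximum(nearest - D[:, j], 0).sum()
            if gain > best_gain:
                best_gain, best_j = gain, j
        medoids.append(best_j)
    return medoids
```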

the SWAP phase, in which, given the current solution (the set of $k$ medoids), all the neighbouring solutions are evaluated. A neighbouring solution is constructed by swapping one of the current medoids with one of the non-selected objects. Thus the size of the neighbourhood of a given solution is $(n-k) \cdot k$, where $n$ is the total number of objects in the dataset and $k$ is the number of desired medoids.
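A sketch of the SWAP phase, under the same assumptions (a precomputed distance matrix; costs recomputed from scratch for clarity rather than with PAM's incremental bookkeeping):

```python
import numpy as np

def pam_swap(D, medoids):
    """SWAP phase of PAM: evaluate all (n - k) * k single swaps of a medoid
    with a non-medoid, apply the best improving swap, and repeat until no
    swap lowers the total cost (a local optimum)."""
    medoids = list(medoids)
    n = D.shape[0]
    cost = D[:, medoids].min(axis=1).sum()
    while True:
        best, best_cost = None, cost
        for i in range(len(medoids)):
            for h in range(n):
                if h in medoids:
                    continue
                # Neighbour solution: medoid i replaced by object h.
                candidate = medoids[:i] + [h] + medoids[i + 1:]
                c = D[:, candidate].min(axis=1).sum()
                if c < best_cost:
                    best_cost, best = c, candidate
        if best is None:
            return medoids, cost
        medoids, cost = best, best_cost
```

Recomputing the full cost for every neighbour makes each pass O(n²k); the original PAM keeps per-object nearest/second-nearest distances to evaluate swaps more cheaply.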

Now, to answer your questions:

- Yes, your understanding is correct.
- You developed a k-means-like solution to the k-medoids problem. In your case the neighborhood of a given solution varies with the current clustering.

It may be better or worse than PAM; you can compare them by evaluating:

- How many iterations they take to converge;
- The quality of the final solutions;
- How long it takes to complete an iteration;
- How well both scale with respect to the number of objects and of features in the dataset.

If memory serves me right, there should also be a paper around about a k-means-like k-medoids algorithm.
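The k-means-like variant you describe can be sketched as follows (this alternating scheme is sometimes called the "Alternate" or Voronoi-iteration heuristic; the sketch assumes a precomputed distance matrix and, for brevity, doesn't handle empty clusters):

```python
import numpy as np

def kmedoids_alternate(D, medoids, max_iter=100):
    """K-means-like k-medoids: assign each object to its nearest medoid,
    then replace every medoid by the member of its own cluster that
    minimizes that cluster's total dissimilarity; repeat until stable."""
    medoids = list(medoids)
    for _ in range(max_iter):
        # Assignment step: index of the nearest medoid per object.
        labels = D[:, medoids].argmin(axis=1)
        new_medoids = []
        for i in range(len(medoids)):
            members = np.flatnonzero(labels == i)
            # Update step: the most central member of the cluster.
            within = D[np.ix_(members, members)]
            new_medoids.append(int(members[within.sum(axis=0).argmin()]))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids
```

Note the contrast with PAM: here every medoid is updated at once, and each update only ever considers points inside that medoid's current cluster, so the explored neighbourhood depends on the clustering.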

### Similar Posts:

- Solved – Difference between K-medoids and PAM
- Solved – When does pam (partition around medoids) fails to find the optimal solution? (counter example?!)
- Solved – Perform K-means (or its close kin) clustering with only a distance matrix, not points-by-features data
- Solved – How to derive the time computational complexity of k-medoids (PAM) clustering algorithm