I know there is a series of regression diagnostic procedures (correlation, beta, residuals, etc.) to run before, during, and after a regression analysis. Is there a comparable standard procedure to follow for cluster analysis (for example, with Ward's method)? What are the R commands? Thanks!
Correlation can cause problems with many clustering algorithms, because correlated attributes carry largely the same information and so get extra weight in the distance computation. For k-means, for example, it seems to be best practice to whiten the data first.
However, there exist correlation clustering algorithms that are meant to process data containing multiple correlations, and cluster objects based on the correlations they exhibit.
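To make the whitening step mentioned for k-means concrete, here is a minimal sketch in Python with NumPy (not R). It assumes PCA whitening is what is meant; the function name `whiten`, the `eps` guard, and the synthetic data are illustrative choices, not part of the original answer.

```python
import numpy as np

def whiten(X, eps=1e-12):
    """PCA-whiten the rows of X: zero mean, (near) identity covariance.

    Hypothetical helper for illustration; eps only guards against
    division by zero when an eigenvalue is numerically zero.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                 # center each attribute
    cov = np.cov(Xc, rowvar=False)          # attribute covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigendecomposition of a symmetric matrix
    # Rotate onto the principal axes, then rescale each axis to unit variance.
    return Xc @ eigvecs / np.sqrt(eigvals + eps)

# Synthetic data (assumption): two strongly correlated attributes plus one independent one.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
X = np.column_stack([a, a + 0.1 * rng.normal(size=500), rng.normal(size=500)])

corr_before = np.corrcoef(X, rowvar=False)[0, 1]  # correlation of the first two attributes
Xw = whiten(X)
cov_after = np.cov(Xw, rowvar=False)              # should be close to the identity matrix
```

After whitening, the attributes are uncorrelated with unit variance, so Euclidean distance (which k-means relies on) no longer counts the shared information twice. Whether this helps depends on the data: whitening also rescales low-variance directions up, which can amplify noise.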