I am a biology student investigating a new method of creating a dichotomous identification key. I created a dendrogram from survey data in which people rated how similar pictures of plant leaves are, and I used Ward's method to link the clusters. In the resulting dendrogram, the y-axis ranges from 0 to about 50. I know that this axis represents the level at which objects are joined into a cluster, and thus how far they are from other objects, but what exactly does the numeric value represent?
Best Answer
I'm going to, ahem, go out on a limb here, ahem, and guess that you built your tree via the `hclust` function in base R with `method = "ward.D2"`, which implements Ward's original criterion. If you type `?hclust` and look for `height` in the Value (output) section, it says "The clustering height: that is, the value of the criterion associated with the clustering method for the particular agglomeration." In this case, Ward's criterion is based on the within-cluster error sum of squares: at each step, the two clusters whose merger increases that sum the least are joined, and the height records the cost of that merger. The heights therefore increase as you go up the tree and the clusters get bigger.
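To make the link between height and the error sum of squares concrete, here is a minimal sketch on made-up numeric data (not the leaf survey), assuming Euclidean distances and `method = "ward.D2"`. The helper `total_ess` is a hypothetical name introduced only for this illustration. It checks whether each reported height equals the square root of twice the increase in the total within-cluster error sum of squares caused by that merge, which is how I understand `"ward.D2"` to scale its heights. For survey-based dissimilarities that are not Euclidean, the sum-of-squares reading is only approximate.

```r
# Toy data: 10 hypothetical observations in 2 dimensions (not the survey data).
set.seed(1)
x <- matrix(rnorm(20), ncol = 2)

hc <- hclust(dist(x), method = "ward.D2")

# Total within-cluster error sum of squares when the tree is cut into k clusters.
total_ess <- function(k) {
  cl <- cutree(hc, k = k)
  sum(sapply(split(seq_len(nrow(x)), cl), function(idx) {
    g <- x[idx, , drop = FALSE]
    sum(sweep(g, 2, colMeans(g))^2)  # squared deviations from the cluster centroid
  }))
}

n <- nrow(x)
ess <- sapply(n:1, total_ess)  # total ESS after 0, 1, ..., n - 1 merges
delta <- diff(ess)             # increase in ESS caused by each merge

# Compare the reported heights with sqrt(2 * increase in ESS).
all.equal(hc$height, sqrt(2 * delta))
```

If the comparison holds, a height of about 50 at the top of the tree corresponds to an increase of roughly 50^2 / 2 = 1250 in the error sum of squares at the final merge, on the squared scale of the input dissimilarities.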