Hierarchical Clustering - linkage(y)

Salad Box - 2022-04-12T13:57:21+00:00
Question: Hierarchical Clustering - linkage(y)

Hi,

I don't fully understand the linkage function and its outputs.

>> x=[1 2 6 8]';
>> y=pdist(x)
y =
     1     5     7     4     6     2
>> Z=linkage(y)
Z =
     1     2     1
     3     4     2
     5     6     4

For Z, what do the first and second columns stand for? What is the third column? How was the third column calculated based on the input data x and y?

Expert Answer

John Williams answered . 2025-11-20

Only just figured it out.
 
The first and second columns of Z hold the indices of the two clusters that are merged at each step.

Each of the 4 elements in x starts as its own cluster, numbered 1 to 4 in the order they appear in x.
 
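In the first row of Z, linkage merges clusters 1 and 2 (the values 1 and 2 in x) and gives the result the new cluster number 5.

In the second row, it merges clusters 3 and 4 (the values 6 and 8 in x) and gives the result the new cluster number 6.

In the last row, it merges clusters 5 and 6, where cluster 5 holds the values (1,2) and cluster 6 holds the values (6,8). In general, the cluster created in row i gets the number m+i, where m is the number of elements in x.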
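
The numbering can be checked with a short loop over the rows of Z. This is only a minimal sketch, assuming the Statistics and Machine Learning Toolbox (which provides pdist and linkage) is installed; the variable names m and i are chosen here for illustration.

```matlab
x = [1 2 6 8]';          % four observations, indexed 1..4
m = numel(x);            % number of original observations
Z = linkage(pdist(x));   % same call as in the question (default method)

% Row i of Z merges clusters Z(i,1) and Z(i,2); the result is cluster m + i.
for i = 1:size(Z, 1)
    fprintf('Step %d: merge clusters %d and %d -> new cluster %d (distance %g)\n', ...
        i, Z(i, 1), Z(i, 2), m + i, Z(i, 3));
end
```

With the Z shown in the question, the loop reports new clusters 5, 6 and 7, one per row.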
 
The third column is the distance between the two clusters merged in that row.

The first number in column 3 is 1, the distance between the values 1 and 2.

The second number in column 3 is 2, the distance between the values 6 and 8.
 
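The third number in column 3 is 4. That is the tricky one. Clusters 5 and 6 hold the values (1,2) and (6,8). By default, linkage uses the 'single' method (nearest neighbor), which takes the shortest distance between any element of one cluster and any element of the other; the distances themselves are Euclidean because that is the default for pdist. The closest pair is the 2 from cluster 5 and the 6 from cluster 6, so the distance is 6-2=4.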
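
The value 4 can be reproduced by hand from the pairwise distances. The sketch below is one way to check it, again assuming the Statistics and Machine Learning Toolbox; groupA, groupB, crossDistances and dSingle are illustrative names, not part of any MATLAB API.

```matlab
x = [1 2 6 8]';
D = squareform(pdist(x));   % 4-by-4 matrix of pairwise Euclidean distances

groupA = [1 2];             % indices of the observations in cluster 5
groupB = [3 4];             % indices of the observations in cluster 6

% Nearest-neighbor ('single') rule: smallest distance over all cross-cluster pairs.
crossDistances = D(groupA, groupB);   % [5 7; 4 6]
dSingle = min(crossDistances(:));     % 4, from |6 - 2|

% Compare with the last merge reported by linkage.
Z = linkage(pdist(x), 'single');
fprintf('By hand: %g, from linkage: %g\n', dSingle, Z(end, 3));
```

Other methods combine the same cross-cluster distances differently. With 'complete', for example, the last merge would be reported at the largest of them, which is 7 here.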

