Are cluster heads and cluster centroids the same in K-means clustering? If not, how to decide the cluster heads of each cluster?
Kshitij Singh answered.
2025-11-20
In the context of K-means clustering, cluster centroids and cluster heads often refer to different concepts, although they might be used interchangeably in some discussions.
Cluster Centroids: The centroid of a cluster is the mean of all data points assigned to that cluster. Standard K-means seeks to minimize the total squared Euclidean distance between each point and the centroid of its cluster. Centroids are recalculated at every iteration until the algorithm converges. Because a centroid is an average, it generally does not coincide with any actual data point.
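To make the definition concrete, here is a minimal sketch that computes centroids by hand for a toy data set. The values of X and idx are made up for illustration, and the variable wcss is just a name chosen here for the within-cluster sum of squares; in practice idx and the centroids would come from your clustering step.

```matlab
% Toy data: six 2-D points with a hypothetical assignment to two clusters.
X   = [1 1; 2 1; 1 2; 8 8; 9 8; 8 9];
idx = [1; 1; 1; 2; 2; 2];

numClusters = max(idx);
centroids = zeros(numClusters, size(X, 2));
for k = 1:numClusters
    % The centroid is the coordinate-wise mean of the cluster's members.
    centroids(k, :) = mean(X(idx == k, :), 1);
end

% Quantity that K-means tries to reduce: the total squared distance
% from every point to the centroid of the cluster it is assigned to.
wcss = sum(sum((X - centroids(idx, :)).^2, 2));

disp(centroids);
disp(wcss);
```

Neither centroid in this toy example is a row of X, which is why a separate rule is needed if a representative point must be an actual data point.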
Cluster Heads: This term is more commonly used in network clustering or clustering-based routing algorithms (e.g., in wireless sensor networks), where it refers to a representative node or point within a cluster. Cluster heads are not necessarily the mean of the points but are chosen based on specific criteria, such as proximity, connectivity, or energy levels in the context of network clustering.
In traditional K-means clustering, we typically do not explicitly identify cluster heads, as the algorithm focuses on finding centroids. However, if you need to choose a representative point (or cluster head) for each cluster, you can follow these approaches:
Closest Point to Centroid: Choose the point within each cluster that is closest to the cluster centroid.
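% Assuming 'X' is the data matrix, 'idx' is the cluster index for each point,
% and 'centroids' contains the cluster centroids (one row per cluster)
numClusters = size(centroids, 1);
clusterHeads = zeros(size(centroids));
for k = 1:numClusters
    clusterPoints = X(idx == k, :);
    % Squared Euclidean distance from each member to the centroid
    % (the subtraction relies on implicit expansion, R2016b or later)
    distances = sum((clusterPoints - centroids(k, :)).^2, 2);
    [~, minIdx] = min(distances);
    clusterHeads(k, :) = clusterPoints(minIdx, :);
end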
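Median Point: Use the coordinate-wise median of each cluster, which is more robust to outliers than the mean. Note that, like the centroid, the coordinate-wise median is generally not one of the original data points.
% Assuming 'X', 'idx', and 'centroids' are defined as above
numClusters = size(centroids, 1);
clusterHeads = zeros(size(centroids));
for k = 1:numClusters
    clusterPoints = X(idx == k, :);
    % Median taken separately along each dimension
    clusterHeads(k, :) = median(clusterPoints, 1);
end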
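If the cluster head has to be an actual data point, one option is to combine the two ideas and pick the member closest to the coordinate-wise median. The following is only a sketch under that assumption; the data in X and idx are made up, with one deliberate outlier in cluster 1.

```matlab
% Toy data (hypothetical values); the point (20, 20) is an outlier in cluster 1.
X   = [1 1; 2 2; 3 1; 20 20; 2 3; 8 8; 9 8; 8 9];
idx = [1; 1; 1; 1; 1; 2; 2; 2];

numClusters = max(idx);
clusterHeads = zeros(numClusters, size(X, 2));
for k = 1:numClusters
    clusterPoints = X(idx == k, :);
    med = median(clusterPoints, 1);                % coordinate-wise median
    distances = sum((clusterPoints - med).^2, 2);  % squared distance to it
    [~, minIdx] = min(distances);                  % first minimum wins on ties
    clusterHeads(k, :) = clusterPoints(minIdx, :);
end

disp(clusterHeads);
```

In this toy example the mean of cluster 1 is pulled to (5.6, 5.4) by the outlier, while the median-based head stays at (2, 2).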
Here's an example of how you might implement the first approach in MATLAB:
% Assuming 'X' is your data matrix, 'idx' is the cluster index for each point,
% and 'centroids' contains the cluster centroids
numClusters = size(centroids, 1);
clusterHeads = zeros(numClusters, size(X, 2));
for k = 1:numClusters
    clusterPoints = X(idx == k, :);
    distances = sum((clusterPoints - centroids(k, :)).^2, 2);
    [~, minIdx] = min(distances);
    clusterHeads(k, :) = clusterPoints(minIdx, :);
end
disp('Cluster Heads:');
disp(clusterHeads);
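In applications such as sensor networks you usually need to know which node is the head, not just its coordinates. A possible variant of the closest-point approach keeps the row index of each head in X. This is a sketch with made-up data; headRows and memberRows are names introduced here for illustration.

```matlab
% Toy data (hypothetical values) with centroids computed as cluster means.
X   = [1 1; 2 1; 1 2; 8 8; 9 9; 8 10; 11 9];
idx = [1; 1; 1; 2; 2; 2; 2];
centroids = [mean(X(idx == 1, :), 1); mean(X(idx == 2, :), 1)];

numClusters = size(centroids, 1);
headRows = zeros(numClusters, 1);
for k = 1:numClusters
    memberRows = find(idx == k);                   % rows of X in cluster k
    distances = sum((X(memberRows, :) - centroids(k, :)).^2, 2);
    [~, minIdx] = min(distances);                  % position within the cluster
    headRows(k) = memberRows(minIdx);              % map back to a row of X
end
clusterHeads = X(headRows, :);

disp(headRows);
disp(clusterHeads);
```

Keeping headRows makes it easy to look up any other attribute of the chosen point, for example a node ID or energy level stored in a separate array with the same row order.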