
8 April 2026

K-Medians

What K-medians is and how it differs from K-means

FIT3152, Class Note, English

K-medians

K-medians is similar to K-means, but the objective function is different.

K-means tries to minimise the total squared distance from each point to its centroid.

K-medians tries to minimise the total absolute distance from each point to its cluster centre.

For K-means:

$$\sum_{i=1}^{k}\sum_{j=1}^{n_i} d(c_i,x_{i,j})^2$$

The best centre for each cluster is the mean.

For K-medians:

$$\sum_{i=1}^{k}\sum_{j=1}^{n_i} d(c_i,x_{i,j})$$

The best centre is based on the median, not the mean.
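As a rough illustration of the two objectives, the sketch below evaluates both on a tiny hand-made clustering. It assumes squared Euclidean distance for K-means and Manhattan distance with a coordinate-wise median for K-medians, which is a common choice but not one the note fixes. The sample points and helper names (`total_cost`, `coordinatewise`) are made up for this example.

```python
from statistics import mean, median

def sq_euclidean(c, x):
    # Squared Euclidean distance: the d(c, x)^2 term of the K-means objective.
    return sum((ci - xi) ** 2 for ci, xi in zip(c, x))

def manhattan(c, x):
    # Manhattan (absolute) distance: assumed here for the K-medians objective.
    return sum(abs(ci - xi) for ci, xi in zip(c, x))

def total_cost(clusters, centres, dist):
    # Double sum over clusters i and their points j.
    return sum(dist(c, x) for c, pts in zip(centres, clusters) for x in pts)

def coordinatewise(stat, points):
    # Apply mean or median to each coordinate separately.
    return tuple(stat(coord) for coord in zip(*points))

# Hypothetical data: two clusters of 2-D points, already assigned.
clusters = [
    [(1.0, 1.0), (2.0, 1.0), (3.0, 4.0)],
    [(8.0, 8.0), (9.0, 10.0), (10.0, 9.0)],
]

mean_centres = [coordinatewise(mean, pts) for pts in clusters]
median_centres = [coordinatewise(median, pts) for pts in clusters]

kmeans_cost = total_cost(clusters, mean_centres, sq_euclidean)
kmedians_cost = total_cost(clusters, median_centres, manhattan)

print("K-means centres:", mean_centres, "cost:", kmeans_cost)
print("K-medians centres:", median_centres, "cost:", kmedians_cost)
```

In the first cluster the two centres differ: the point (3, 4) pulls the mean's second coordinate up, while the median stays with the majority.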

Difference

K-means is more sensitive to outliers because squared distance gives a large penalty to points that are far away.

K-medians is usually more robust to outliers because it uses absolute distance instead of squared distance.
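A small one-dimensional sketch of this effect, using made-up values: adding a single far-away point moves the mean a lot and the median only slightly.

```python
from statistics import mean, median

# Hypothetical cluster values, then the same cluster with one outlier added.
values = [1.0, 2.0, 3.0, 4.0, 5.0]
with_outlier = values + [100.0]

# How far each kind of centre moves when the outlier joins the cluster.
mean_shift = mean(with_outlier) - mean(values)        # roughly 16.17
median_shift = median(with_outlier) - median(values)  # 0.5

print("mean shift:", mean_shift)
print("median shift:", median_shift)
```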

Important Idea

If the question asks about the K-means objective, we should write total squared distance.

If we write total distance instead of total squared distance, then it is no longer standard K-means.

It becomes closer to K-medians or geometric-median-based clustering.

Why Mean and Median

The reason is related to what each method tries to minimise.

K-means minimises squared distance:

$$\sum_j (x_j-c)^2$$

The value of $c$ that minimises this is the mean.

So the centre of the cluster is calculated as:

$$c=\frac{1}{n}\sum_{j=1}^{n}x_j$$

That is why it is called K-means.
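For completeness, a short derivation of why the mean is the minimiser in the one-dimensional case, with $n$ points $x_1,\dots,x_n$ in the cluster. Set the derivative with respect to $c$ to zero:

```latex
\begin{align*}
\frac{d}{dc}\sum_{j=1}^{n}(x_j-c)^2 &= -2\sum_{j=1}^{n}(x_j-c) = 0 \\
\sum_{j=1}^{n}x_j &= n\,c \\
c &= \frac{1}{n}\sum_{j=1}^{n}x_j
\end{align*}
```

The second derivative is $2n > 0$, so this stationary point is a minimum.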

K-medians minimises absolute distance:

$$\sum_j |x_j-c|$$

The value of $c$ that minimises this is the median.

So the centre of the cluster is based on the middle value, not the average value.

That is why it is called K-medians.
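The median claim can be checked numerically. The sketch below is a brute-force check on made-up one-dimensional data, not a proof: it tries many candidate centres on a grid and confirms that the one with the smallest total absolute distance is the median. The helper name `abs_cost` and the grid spacing are choices made for this example.

```python
from statistics import mean, median

def abs_cost(c, xs):
    # Total absolute distance from centre c to every point in xs.
    return sum(abs(x - c) for x in xs)

# Hypothetical cluster with one far-away point.
xs = [1.0, 2.0, 3.0, 4.0, 100.0]

# Candidate centres 0.0, 0.1, ..., 100.0.
candidates = [i / 10 for i in range(0, 1001)]
best = min(candidates, key=lambda c: abs_cost(c, xs))

print("best candidate:", best)
print("median:", median(xs))
print("cost at median:", abs_cost(median(xs), xs))
print("cost at mean:", abs_cost(mean(xs), xs))
```

With an odd number of points the minimiser is unique. With an even number, any value between the two middle points gives the same cost.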
