Clustering is a core technique in unsupervised learning, used to group similar items without labelled examples. Many people start with k-means, but it comes with a practical limitation: you must choose the number of clusters in advance. Affinity Propagation offers a different approach. It forms clusters by exchanging “messages” between data points, and it can discover an appropriate number of clusters based on the data and the preferences you set. This makes it an interesting algorithm to learn and apply, especially in projects typically covered in a data science course in Pune.
The Key Idea: Exemplars Instead of Centroids
Affinity Propagation does not compute cluster centroids the way k-means does. Instead, it selects actual data points to act as exemplars—representative points that become the “centre” of each cluster.
The algorithm starts with:
- A similarity matrix that expresses how suitable one point is as an exemplar for another point (often negative squared Euclidean distance, but other similarity measures can be used).
- A preference value for each point, stored as its self-similarity on the diagonal of the matrix, that indicates how likely that point is to become an exemplar. Higher preference generally leads to more clusters.
Rather than iteratively moving centroids, Affinity Propagation iteratively updates messages that push the system toward a stable set of exemplars.
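To make the two inputs concrete, here is a minimal sketch of how they might be assembled. The function names, the toy points and the preference of -10 are illustrative assumptions, not values from any particular library or dataset.

```python
def neg_sq_euclidean(p, q):
    """Similarity of two points: negative squared Euclidean distance."""
    return -sum((a - b) ** 2 for a, b in zip(p, q))


def similarity_matrix(points, preference):
    """Build S, where S[i][k] says how well point k suits point i as exemplar.

    The diagonal S[k][k] holds the preference of point k. This sketch uses
    one shared, caller-supplied value for every point.
    """
    n = len(points)
    S = [[neg_sq_euclidean(points[i], points[k]) for k in range(n)]
         for i in range(n)]
    for k in range(n):
        S[k][k] = preference
    return S


# Toy data (made up for illustration): two tight pairs that are far apart.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
S = similarity_matrix(points, preference=-10.0)

for row in S:
    print(row)
```

Nearby points get similarities close to zero and distant points get strongly negative ones, so "larger" always means "more suitable as an exemplar".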
Understanding the “Message Passing” Mechanism
Affinity Propagation relies on two message types passed between points:
1) Responsibility
A responsibility message from point i to candidate exemplar k reflects how strongly i believes k should be its exemplar, compared with other possible exemplars. Intuitively, it answers: “Among all candidates, how good is k for i?”
2) Availability
An availability message from candidate exemplar k to point i reflects how appropriate it is for i to choose k as an exemplar, considering how many other points support k. Intuitively, it answers: “Is k willing and supported enough to be an exemplar for i?”
These messages are updated iteratively. Over time, points “agree” on which points are exemplars. Once the updates stabilise (or a maximum iteration count is reached), clusters are formed around the chosen exemplars.
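For readers who prefer formulas, the updates are commonly written as follows. The symbols are notation assumed here rather than fixed by the text: s(i,k) is the similarity, r(i,k) the responsibility and a(i,k) the availability, with all availabilities starting at zero.

```latex
\begin{align}
r(i,k) &\leftarrow s(i,k) - \max_{k' \neq k} \bigl[ a(i,k') + s(i,k') \bigr] \\
a(i,k) &\leftarrow \min\Bigl( 0,\; r(k,k) + \sum_{i' \notin \{i,k\}} \max\bigl(0,\, r(i',k)\bigr) \Bigr), \qquad i \neq k \\
a(k,k) &\leftarrow \sum_{i' \neq k} \max\bigl(0,\, r(i',k)\bigr)
\end{align}
```

After the updates settle, point i is assigned to the candidate k that maximises a(i,k) + r(i,k); a point that selects itself is an exemplar.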
This message-based view is why the method is often described as similar in spirit to belief propagation in graphical models, though it is applied here to clustering.
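The loop below is a minimal, unoptimised sketch of the message passing in plain Python, intended for reading rather than production use. The function names, the default damping of 0.5, the stopping rule (15 unchanged assignments in a row) and the toy data are all assumptions made for illustration.

```python
def affinity_propagation(S, damping=0.5, max_iter=200, stable_iters=15):
    """Minimal message-passing sketch (not optimised).

    S[i][k] is the similarity of point i to candidate exemplar k, and the
    diagonal S[k][k] holds the preferences. Returns a list whose entry i is
    the index of the exemplar chosen by point i.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    last_choice, unchanged = None, 0

    for _ in range(max_iter):
        # Responsibility: how much better k is for i than the best rival.
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            for k in range(n):
                rival = max(scores[j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)

        # Availability: positive support that k receives from other points.
        for k in range(n):
            support = [max(0.0, R[j][k]) for j in range(n)]
            total = sum(support) - support[k]  # leave out k's own entry
            for i in range(n):
                if i == k:
                    new = total
                else:
                    new = min(0.0, R[k][k] + total - support[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new

        # Each point picks the candidate with the largest a(i, k) + r(i, k).
        choice = [max(range(n), key=lambda k: A[i][k] + R[i][k])
                  for i in range(n)]
        if choice == last_choice:
            unchanged += 1
            if unchanged >= stable_iters:
                break
        else:
            last_choice, unchanged = choice, 0
    return choice


def similarities_1d(xs, preference):
    """Negative squared distance between 1-D values, preference on diagonal."""
    return [[preference if i == k else -(a - b) ** 2
             for k, b in enumerate(xs)] for i, a in enumerate(xs)]


xs = [0, 1, 2, 10, 11, 12]  # made-up data: two groups of three values
low = affinity_propagation(similarities_1d(xs, preference=-10.0))
high = affinity_propagation(similarities_1d(xs, preference=-0.1))
print(low, len(set(low)))
print(high, len(set(high)))
```

On this toy input the lower preference settles on the middle value of each group as exemplar, while the higher preference leaves every point as its own exemplar. Practical implementations vectorise these loops and usually add a small amount of noise to break ties.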
Choosing Similarities and Preferences
Two inputs have an outsized impact on results: the similarity measure and the preference values.
Similarity measure
If you use Euclidean distance-based similarity, scaling matters. Features with large numeric ranges can dominate the similarity scores, so standardisation or normalisation is usually important.
For text or high-dimensional embeddings, cosine similarity can be more meaningful than Euclidean distance. In practical workflows taught in a data science course in Pune, this choice is often tied to the domain: customer segmentation, document grouping, image feature clustering, and so on.
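A small sketch can show both effects. The customer records and the two "document" vectors are hypothetical, and the helper names are assumptions for this example only.

```python
import math
import statistics


def standardise(rows):
    """Rescale every column to zero mean and unit standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # Population standard deviation; fall back to 1.0 for constant columns.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)]
            for row in rows]


def neg_sq_euclidean(p, q):
    """Negative squared Euclidean distance (higher means more similar)."""
    return -sum((a - b) ** 2 for a, b in zip(p, q))


def cosine_similarity(p, q):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    return dot / (math.hypot(*p) * math.hypot(*q))


# Hypothetical customers: (age in years, annual income).
rows = [(25, 30000), (60, 30500), (26, 33000), (58, 60000)]
scaled = standardise(rows)

# Raw features: income dominates, so customer 0 looks closer to customer 1.
print(neg_sq_euclidean(rows[0], rows[1]), neg_sq_euclidean(rows[0], rows[2]))
# Standardised: the 35-year age gap now counts, so customer 2 is closer.
print(neg_sq_euclidean(scaled[0], scaled[1]),
      neg_sq_euclidean(scaled[0], scaled[2]))

# Cosine similarity ignores vector length: same direction, different size.
short_doc, long_doc = (1, 2, 0), (10, 20, 0)
print(cosine_similarity(short_doc, long_doc))
print(neg_sq_euclidean(short_doc, long_doc))
```

Before scaling, the income column decides which customers look alike; after scaling, the nearest neighbour of customer 0 changes. The two vectors at the end point in the same direction, so cosine similarity treats them as nearly identical even though their Euclidean similarity is strongly negative.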
Preference
Preferences influence the number of exemplars:
- Higher preferences typically yield more exemplars (more clusters).
- Lower preferences typically yield fewer exemplars (fewer clusters).
A common baseline is setting all preferences to the median similarity, then tuning based on how granular you want the clustering to be.
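As a rough sketch of that baseline, the snippet below computes the median of the off-diagonal similarities and writes it onto the diagonal. It also returns the minimum similarity, a lower starting point that is sometimes used when fewer clusters are wanted. The helper names and the toy values are assumptions for illustration.

```python
import statistics


def preference_baselines(S):
    """Return (median, minimum) of the off-diagonal similarities in S."""
    n = len(S)
    off_diag = [S[i][k] for i in range(n) for k in range(n) if i != k]
    return statistics.median(off_diag), min(off_diag)


def with_preference(S, preference):
    """Copy S and write one shared preference onto its diagonal."""
    return [[preference if i == k else value for k, value in enumerate(row)]
            for i, row in enumerate(S)]


# Made-up 1-D data; similarity is the negative squared distance.
xs = [0, 1, 2, 10, 11, 12]
S = [[-(a - b) ** 2 for b in xs] for a in xs]

median_pref, min_pref = preference_baselines(S)
S_median = with_preference(S, median_pref)
print(median_pref, min_pref)
```

From here, tuning usually means moving the shared preference up or down from the baseline and checking how the number of exemplars responds.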
Strengths and Limitations in Practice
Where Affinity Propagation shines
- No need to predefine k: It can discover a cluster count driven by data and preference.
- Exemplars are real points: That can improve interpretability when you want a representative example rather than an abstract centroid.
- Accepts any similarity measure: It works from pairwise similarities alone, so it suits domains where a meaningful similarity can be defined even when the items are not points in a vector space.
Common challenges
- Memory and compute cost: It needs a pairwise similarity matrix, so both memory and the work done per iteration grow roughly with the square of the number of points, which becomes expensive for large datasets.
- Sensitivity to preference: Poor preference choices can lead to too many or too few clusters.
- Convergence issues: Some datasets cause the messages to oscillate instead of settling; a damping factor that blends each new message with its previous value is often used to stabilise the updates.
Because of these trade-offs, Affinity Propagation is often used for small-to-medium datasets or as a method to explore structure before choosing a scalable clustering approach.
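A back-of-the-envelope calculation shows where "small-to-medium" ends. This sketch assumes a dense implementation that keeps three n x n matrices (similarities, responsibilities, availabilities) of 8-byte floats; actual implementations may store more or less than that.

```python
def dense_matrix_gib(n_points, n_matrices=3, bytes_per_value=8):
    """Approximate memory for dense n x n float matrices, in GiB."""
    return n_matrices * n_points ** 2 * bytes_per_value / 2 ** 30


for n in (1_000, 10_000, 100_000):
    print(f"{n:>7} points: about {dense_matrix_gib(n):.2f} GiB")
```

Under these assumptions, ten thousand points need a little over 2 GiB and a hundred thousand need over 200 GiB, because doubling the number of points quadruples the memory.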
When to Use It
Affinity Propagation is a strong candidate when:
- You do not know the right number of clusters.
- You can define a meaningful similarity measure.
- Your dataset size makes a full similarity matrix feasible.
It is particularly useful in exploratory analysis where interpretability matters, such as finding representative customer profiles, selecting prototype examples in a dataset, or grouping similar documents with clear exemplars—use-cases frequently included in applied assignments within a data science course in Pune.
Conclusion
Affinity Propagation clusters data by allowing points to “vote” through message passing, ultimately selecting exemplars that represent each cluster. Its main advantage is that it does not require you to specify the number of clusters upfront, and it can produce interpretable results because exemplars are real data points. However, it can be computationally heavy due to the similarity matrix requirement and may require tuning preferences and damping for stable results. Used thoughtfully, it is a valuable clustering technique for exploratory analysis and prototype discovery in real-world datasets.



