Hierarchical clustering
Hierarchical clustering is a method of cluster analysis which seeks to build a hierarchy of clusters. Strategies for hierarchical clustering generally fall into two types:
- *Agglomerative*: a "bottom-up" approach where each observation starts in its own cluster, and pairs of clusters are merged as one moves up the hierarchy.
- *Divisive*: a "top-down" approach where all observations start in one cluster, and splits are performed recursively as one moves down the hierarchy.
The results of hierarchical clustering are usually presented in a dendrogram.
Overview
In data analysis, hierarchical clustering is a powerful tool for identifying the natural groupings or structures within a dataset. Unlike k-means clustering, it does not require the analyst to specify the number of clusters in advance, which makes it particularly useful for exploratory data analysis.
Algorithm
The algorithm for hierarchical clustering can be described as follows:
Agglomerative Clustering
1. Start by treating each data point as a single cluster.
2. Find the closest (most similar) pair of clusters and merge them into a single cluster.
3. Compute the distances (similarities) between the new cluster and each of the remaining clusters.
4. Repeat steps 2 and 3 until all n observations are merged into a single cluster.
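In practice these steps are rarely coded by hand. The sketch below, assuming SciPy, NumPy, and Matplotlib are installed and using made-up toy data, applies SciPy's agglomerative routines to build the hierarchy, cut it into flat clusters, and draw the dendrogram mentioned above.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, fcluster, dendrogram

# Toy 2-D observations (made up purely for illustration).
X = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0],
              [5.2, 4.8], [9.0, 1.0], [9.1, 0.9]])

# Agglomerative clustering: each point starts in its own cluster and the
# closest pair of clusters is merged repeatedly. The linkage matrix Z
# records every merge (which clusters were joined, and at what distance).
Z = linkage(X, method="average", metric="euclidean")

# Cut the hierarchy to obtain, for example, three flat clusters.
labels = fcluster(Z, t=3, criterion="maxclust")
print(labels)

# The full hierarchy is usually visualised as a dendrogram.
dendrogram(Z)
plt.show()
```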
Divisive Clustering
1. Start with all observations in a single cluster.
2. Find the cluster to split and decide how to split it.
3. Perform the split to create two new clusters.
4. Repeat steps 2 and 3 until each observation is in its own cluster.
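Divisive clustering is less commonly provided out of the box. The sketch below is one illustrative way to approximate it, assuming scikit-learn is available: the largest remaining cluster is repeatedly bisected with 2-means. Real divisive methods such as DIANA choose the cluster and the split differently.

```python
import numpy as np
from sklearn.cluster import KMeans

def divisive_clustering(X, n_clusters):
    """Split the data top-down until n_clusters groups remain.

    Illustrative sketch only: at each step the largest cluster is split
    in two with 2-means; this is not the only way to choose splits.
    """
    clusters = [np.arange(len(X))]  # start with everything in one cluster
    while len(clusters) < n_clusters:
        # Pick the cluster to split (here: the largest one).
        idx = max(range(len(clusters)), key=lambda i: len(clusters[i]))
        members = clusters.pop(idx)
        # Perform the split to create two new clusters.
        labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X[members])
        clusters.append(members[labels == 0])
        clusters.append(members[labels == 1])
    return clusters

X = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0],
              [5.2, 4.8], [9.0, 1.0], [9.1, 0.9]])
for c in divisive_clustering(X, 3):
    print(c)
```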
Distance Measures
The choice of distance measure is a critical step in clustering. It defines how the similarity of two elements is calculated and influences the shape of the resulting clusters. The most common distance measures used in hierarchical clustering are:
- Euclidean distance: The standard distance measure, also known as straight-line distance.
- Manhattan distance: The sum of the absolute differences of their Cartesian coordinates, also known as city block distance.
- Cosine similarity: Measures the cosine of the angle between two vectors.
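For illustration, these measures can be computed with SciPy's distance functions. This is a minimal sketch; note that `cosine` returns the cosine distance, i.e. one minus the cosine similarity.

```python
import numpy as np
from scipy.spatial.distance import euclidean, cityblock, cosine

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])

print(euclidean(a, b))   # straight-line (L2) distance
print(cityblock(a, b))   # Manhattan / city-block (L1) distance
print(cosine(a, b))      # cosine distance = 1 - cosine similarity (0.0 here: same direction)
```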
Applications
Hierarchical clustering is widely used in various fields such as:
- Biology, for constructing phylogenetic trees.
- Information retrieval, for document clustering.
- Social sciences, for clustering individuals based on their characteristics.
- Market research, for customer segmentation.
Advantages and Disadvantages
Advantages
- Does not require the number of clusters to be specified in advance.
- Easy to implement and provides hierarchical relationships among the observations.
Disadvantages
- Can be computationally expensive for large datasets: standard agglomerative algorithms need on the order of O(n²) memory for the distance matrix and O(n²) to O(n³) time.
- The results can be sensitive to the choice of distance measure and linkage criteria.