Hierarchical clustering

Hierarchical clustering is a method of cluster analysis that seeks to build a hierarchy of clusters. Strategies for hierarchical clustering generally fall into two types:

  • *Agglomerative*: a "bottom-up" approach in which each observation starts in its own cluster, and pairs of clusters are merged as one moves up the hierarchy.
  • *Divisive*: a "top-down" approach in which all observations start in one cluster, and splits are performed recursively as one moves down the hierarchy.

The results of hierarchical clustering are usually presented in a dendrogram.
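
The agglomerative variant and its dendrogram can be produced with a few lines of Python. The following is a minimal sketch, assuming NumPy, SciPy, and Matplotlib are installed; the toy data and the choice of Ward linkage are illustrative only.

import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram
import matplotlib.pyplot as plt

# Toy data: two well-separated groups of 2-D points.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, size=(5, 2)),
               rng.normal(5.0, 0.3, size=(5, 2))])

# Agglomerative ("bottom-up") clustering; Ward linkage is one common choice.
Z = linkage(X, method="ward")

# The resulting hierarchy is conventionally drawn as a dendrogram.
dendrogram(Z)
plt.xlabel("observation index")
plt.ylabel("merge distance")
plt.show()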

Overview

In data analysis, hierarchical clustering is a powerful tool for identifying natural groupings or structures within a dataset. Unlike k-means clustering, which requires the analyst to specify the number of clusters beforehand, hierarchical clustering builds a complete hierarchy from which any number of clusters can be extracted afterwards, making it particularly useful for exploratory data analysis.
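
To make the contrast with k-means concrete, the sketch below (again assuming SciPy; the data and the choice of average linkage are arbitrary) builds one hierarchy and then cuts it at several levels with fcluster, so different numbers of clusters can be explored without re-running the algorithm.

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, size=(5, 2)),
               rng.normal(5.0, 0.3, size=(5, 2)),
               rng.normal(10.0, 0.3, size=(5, 2))])

# Build the hierarchy once.
Z = linkage(X, method="average")

# Cut the same tree into 2, 3, or 5 flat clusters after the fact.
for k in (2, 3, 5):
    labels = fcluster(Z, t=k, criterion="maxclust")
    print(k, labels)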

Algorithm

The algorithm for hierarchical clustering can be described as follows:

Agglomerative Clustering

  1. Start by treating each data point as a single cluster.
  2. Find the closest (most similar) pair of clusters and merge them into a single cluster.
  3. Compute distances (similarities) between the new cluster and each of the old clusters.
  4. Repeat steps 2 and 3 until all n observations are merged into a single cluster.
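
A direct, if inefficient, translation of these steps into Python might look like the sketch below. Single linkage is assumed purely for concreteness, and the function and variable names are illustrative rather than standard.

import numpy as np

def agglomerative_single_linkage(points, target_clusters=1):
    """Naive agglomerative clustering following the steps above."""
    # Step 1: every observation starts in its own cluster (stored as index lists).
    clusters = [[i] for i in range(len(points))]
    merges = []  # record of each merge: (cluster_a, cluster_b, distance)

    def single_linkage(a, b):
        # Cluster distance = smallest pairwise distance between members.
        return min(np.linalg.norm(points[i] - points[j]) for i in a for j in b)

    while len(clusters) > target_clusters:
        # Step 2: find the closest pair of clusters.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = single_linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        # Merge them; step 3 happens implicitly when distances are recomputed
        # against the new cluster on the next pass through the loop.
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters, merges

# Example: five 2-D points merged down to two clusters.
data = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9]])
final, history = agglomerative_single_linkage(data, target_clusters=2)
print(final)  # e.g. [[3, 4], [0, 1, 2]]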

Divisive Clustering

  1. Start with all observations in a single cluster.
  2. Find the cluster to split and how to split it.
  3. Perform the split to create two new clusters.
  4. Repeat steps 2 and 3 until each observation is in its own cluster.
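
The corresponding top-down procedure can be sketched as follows. Splitting a cluster around its two most distant members is just one simple splitting rule chosen here for illustration (real divisive methods such as DIANA use more refined criteria), and duplicate points are not handled.

import numpy as np

def divisive_clustering(points):
    """Naive divisive clustering following the steps above."""
    # Step 1: all observations start in a single cluster.
    clusters = [list(range(len(points)))]
    history = []  # record of each split: (parent, left, right)

    def diameter_and_seeds(cluster):
        # Largest within-cluster distance and the pair of points achieving it.
        best = (0.0, cluster[0], cluster[0])
        for a in cluster:
            for b in cluster:
                d = np.linalg.norm(points[a] - points[b])
                if d > best[0]:
                    best = (d, a, b)
        return best

    while any(len(c) > 1 for c in clusters):
        # Step 2: choose the cluster to split (here, the one with the widest spread).
        idx = max((i for i, c in enumerate(clusters) if len(c) > 1),
                  key=lambda i: diameter_and_seeds(clusters[i])[0])
        cluster = clusters.pop(idx)
        _, seed_a, seed_b = diameter_and_seeds(cluster)
        # Step 3: split by assigning each member to the nearer of the two seeds.
        left = [p for p in cluster
                if np.linalg.norm(points[p] - points[seed_a])
                <= np.linalg.norm(points[p] - points[seed_b])]
        right = [p for p in cluster if p not in left]
        history.append((cluster, left, right))
        clusters.extend([left, right])
    return history

# Example: four 2-D points split until every observation stands alone.
data = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9]])
for parent, left, right in divisive_clustering(data):
    print(parent, "->", left, right)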

Distance Measures

The choice of distance measure is a critical step in clustering. It defines how the similarity of two elements is calculated and it will influence the shape of the clusters. The most common distance measures used in hierarchical clustering are:

  • Euclidean distance: the standard straight-line distance between two points.
  • Manhattan distance: the sum of the absolute differences of their Cartesian coordinates, also known as city-block distance.
  • Cosine similarity: the cosine of the angle between two vectors; for clustering it is usually converted into a distance as 1 − cosine similarity.
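
The sketch below (assuming SciPy) computes pairwise distances for a tiny dataset under each of these measures; the condensed output of pdist can also be passed directly to linkage when clustering on a precomputed distance matrix.

import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.cluster.hierarchy import linkage

X = np.array([[1.0, 2.0],
              [4.0, 6.0],
              [0.0, 1.0]])

# Pairwise distances under three common measures, shown as square matrices.
print(squareform(pdist(X, metric="euclidean")))  # straight-line distance
print(squareform(pdist(X, metric="cityblock")))  # Manhattan / city-block distance
print(squareform(pdist(X, metric="cosine")))     # 1 - cosine similarity

# A precomputed (condensed) distance matrix can be fed straight to linkage.
Z = linkage(pdist(X, metric="cityblock"), method="average")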

Applications

Hierarchical clustering is widely used in fields such as bioinformatics (for example, grouping genes or samples by expression profile), medical and image data analysis, document and text clustering, and market segmentation.

Advantages and Disadvantages

Advantages

  • Does not require the number of clusters to be specified in advance.
  • Easy to implement in its basic form, and the resulting hierarchy (dendrogram) shows how observations relate at every level of granularity.

Disadvantages

  • Can be computationally expensive: a naive agglomerative implementation needs O(n²) memory and roughly O(n³) time, which limits its use on large datasets.
  • The results can be sensitive to the choice of distance measure and linkage criteria.

