Unsupervised learning: Difference between revisions

Revision as of 00:42, 10 February 2025

Type of machine learning algorithm

Unsupervised learning is a type of machine learning that involves training a model on data without explicit instructions on what to do with it. The model attempts to learn the underlying structure of the data by identifying patterns and relationships. Unlike supervised learning, where the model is trained on labeled data, unsupervised learning works with unlabeled data.

Overview

Unsupervised learning is used to draw inferences from datasets consisting of input data without labeled responses. Two of the most common unsupervised learning tasks are clustering and association rule learning.

Clustering: This involves grouping a set of objects in such a way that objects in the same group (or cluster) are more similar to each other than to those in other groups. K-means clustering and hierarchical clustering are popular clustering algorithms.

Association: This involves discovering interesting relations between variables in large databases. A common example is market basket analysis, which is used to identify sets of products that frequently co-occur in transactions.
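The two tasks described above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not a reference implementation: the function names `kmeans` and `frequent_pairs`, the `seed` and `min_support` parameters, and the toy data are all assumptions made for this example. The clustering part uses Lloyd's algorithm, the usual iterative procedure for K-means, and the association part only counts co-occurring item pairs, which is the first step of market basket analysis rather than a full rule-mining method.

```python
import random
from collections import Counter
from itertools import combinations


def kmeans(points, k, iterations=20, seed=0):
    """Cluster tuples of numbers with Lloyd's algorithm (illustrative only)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k input points chosen at random
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, clusters


def frequent_pairs(transactions, min_support):
    """Return item pairs whose share of transactions is at least min_support."""
    counts = Counter()
    for basket in transactions:
        # Sort so that each pair is always counted under the same key.
        counts.update(combinations(sorted(set(basket)), 2))
    total = len(transactions)
    return {pair: n / total for pair, n in counts.items() if n / total >= min_support}


# Hypothetical toy data: two well-separated groups of 2-D points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = kmeans(points, k=2)
print(clusters)

# Hypothetical shopping baskets.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "eggs", "milk"],
]
print(frequent_pairs(baskets, min_support=0.5))
```

Neither function is given labels: the groups and the frequent pairs come only from the structure of the input. K-means can settle on a poor grouping for some starting points, so practical implementations usually run it several times with different initializations.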

Techniques

Several techniques are used in unsupervised learning, including:

Neural networks: These are computational models inspired by the human brain, consisting of interconnected groups of artificial neurons. Unsupervised examples include autoencoders and Boltzmann machines, which learn representations from unlabeled inputs.

Dimensionality reduction: Techniques such as Principal Component Analysis (PCA) and t-distributed Stochastic Neighbor Embedding (t-SNE) reduce the number of variables needed to represent the data while preserving as much of its structure as possible.

Anomaly detection: This involves identifying rare items, events, or observations that raise suspicions by differing significantly from the majority of the data.
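As a rough sketch of two of the techniques listed above, the following standard-library Python finds the first principal component of a small dataset and flags outliers by z-score. Everything here is an assumption made for illustration: the function names, the use of power iteration to approximate the leading eigenvector of the covariance matrix, the `threshold` default, and the toy data. Library implementations of PCA typically use a singular value decomposition instead, and z-scores are only one of many anomaly detection criteria.

```python
import math
import statistics


def first_principal_component(rows, iterations=100):
    """Approximate the first principal component by power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d) of the centered data.
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    v = [1.0] * d  # arbitrary start; assumed not orthogonal to the component
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project each centered row onto the component: a 1-D representation.
    projections = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, projections


def zscore_outliers(values, threshold=2.0):
    """Return values lying more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:  # all values identical: nothing stands out
        return []
    return [x for x in values if abs(x - mean) / stdev > threshold]


# Hypothetical 2-D data lying on the line y = 2x, so one direction explains it all.
component, reduced = first_principal_component([(1, 2), (2, 4), (3, 6), (4, 8)])
print(component, reduced)

# Hypothetical readings with one value far from the rest.
print(zscore_outliers([10, 11, 9, 10, 12, 10, 11, 9, 10, 50]))
```

The threshold is deliberately modest because, in a small sample, a single extreme value inflates the standard deviation and so limits how large its own z-score can become.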

Applications

Unsupervised learning is applied in various fields, including:

Image recognition: Identifying patterns and features in images without prior labeling.

Genomics: Analyzing genetic data to find patterns and relationships.

Natural language processing: Understanding and processing human language data.

Related pages

Supervised learning

Reinforcement learning

Deep learning

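Gallery

Task guidance in unsupervised learning

Hopfield network

Boltzmann machine example

Restricted Boltzmann machine

Stacked Boltzmann machine

Helmholtz machine

Autoencoder schema

Variational autoencoder blocks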

References

Goodfellow, Ian, Deep Learning, MIT Press, 2016, ISBN 978-0262035613.

LeCun, Yann, Unsupervised Learning: Foundations of Neural Computation, MIT Press, 1998.

@@ Line 1: / Line 1: @@
-'''Unsupervised learning''' is a type of [[machine learning]] that uses [[algorithm]]s to analyze and cluster unlabeled datasets. These algorithms discover hidden patterns or data groupings without the need for human intervention.
+{{Short description|Type of machine learning algorithm}}
+{{Machine learning}}
-== Overview ==
+'''Unsupervised learning''' is a type of [[machine learning]] that involves training a model on data without explicit instructions on what to do with it. The model attempts to learn the underlying structure of the data by identifying patterns and relationships. Unlike [[supervised learning]], where the model is trained on labeled data, unsupervised learning works with unlabeled data.
-Unsupervised learning is a type of machine learning that trains itself using data that has not been classified, labeled or categorized. Instead of responding to feedback, unsupervised learning identifies commonalities in the data and reacts based on the presence or absence of such commonalities in each new piece of data.
+==Overview==
+Unsupervised learning is used to draw inferences from datasets consisting of input data without labeled responses. The most common unsupervised learning tasks are [[clustering]] and [[association]].
-== Types of Unsupervised Learning ==
+* '''Clustering''': This involves grouping a set of objects in such a way that objects in the same group (or cluster) are more similar to each other than to those in other groups. [[K-means clustering]] and [[hierarchical clustering]] are popular clustering algorithms.
-There are two main methods of unsupervised learning: [[Cluster analysis|clustering]] and [[Dimensionality reduction|dimensionality reduction]].
+* '''Association''': This involves discovering interesting relations between variables in large databases. A common example is [[market basket analysis]], which is used to identify sets of products that frequently co-occur in transactions.
-=== Clustering ===
+==Techniques==
+Several techniques are used in unsupervised learning, including:
-[[Clustering]] is a method of unsupervised learning where the model discovers and analyzes a dataset to group (or cluster) them into clusters based on similarity or common patterns. The most common clustering algorithms include [[K-means clustering|K-means]], [[Hierarchical clustering|hierarchical]], and [[DBSCAN]].
+* '''[[Neural networks]]''': These are computational models inspired by the human brain, consisting of interconnected groups of artificial neurons. They are used in various unsupervised learning tasks.
-=== Dimensionality Reduction ===
+* '''[[Dimensionality reduction]]''': Techniques such as [[Principal Component Analysis]] (PCA) and [[t-distributed Stochastic Neighbor Embedding]] (t-SNE) are used to reduce the number of random variables under consideration.
-[[Dimensionality reduction]] is a method that reduces the number of random variables under consideration by obtaining a set of principal variables. It is used to simplify data processing without losing much information. Common dimensionality reduction algorithms include [[Principal Component Analysis|Principal Component Analysis (PCA)]] and [[Autoencoder|autoencoders]].
+* '''[[Anomaly detection]]''': This involves identifying rare items, events, or observations that raise suspicions by differing significantly from the majority of the data.
-== Applications of Unsupervised Learning ==
+==Applications==
+Unsupervised learning is applied in various fields, including:
-Unsupervised learning has numerous applications, including:
+* '''[[Image recognition]]''': Identifying patterns and features in images without prior labeling.
-* '''[[Anomaly detection]]''': Unsupervised learning can be used to identify unusual data points in your dataset. This is useful in many domains, such as fraud detection, fault detection, and system health monitoring.
+* '''[[Genomics]]''': Analyzing genetic data to find patterns and relationships.
-* '''[[Association rule]] learning''': This is a method for discovering interesting relations between variables in large databases. It is intended to identify strong rules discovered in databases using measures of interestingness.
-* '''[[Natural language processing]]''': Unsupervised learning is used in natural language processing to extract statistically relevant patterns in data, which are then used to understand natural language.
-== See Also ==
+* '''[[Natural language processing]]''': Understanding and processing human language data.
+==Related pages==
 * [[Supervised learning]]
 * [[Reinforcement learning]]
-* [[Semi-supervised learning]]
 * [[Deep learning]]
-== References ==
+==Gallery==
+[[File:Task-guidance.png|thumb|Task guidance in unsupervised learning]]
+[[File:Hopfield-net-vector.svg|thumb|Hopfield network]]
+[[File:Boltzmannexamplev1.png|thumb|Boltzmann machine example]]
+[[File:Restricted_Boltzmann_machine.svg|thumb|Restricted Boltzmann machine]]
+[[File:Stacked-boltzmann.png|thumb|Stacked Boltzmann machine]]
+[[File:Helmholtz_Machine.png|thumb|Helmholtz machine]]
+[[File:Autoencoder_schema.png|thumb|Autoencoder schema]]
+[[File:VAE_blocks.png|thumb|Variational autoencoder blocks]]
-<references />
+==References==
+* {{Cite book |last=Goodfellow |first=Ian |author-link=Ian Goodfellow |title=Deep Learning |year=2016 |publisher=MIT Press |isbn=978-0262035613}}
+* {{Cite journal |last=LeCun |first=Yann |author-link=Yann LeCun |title=Unsupervised Learning: Foundations of Neural Computation |journal=MIT Press |year=1998}}
-[[Category:Machine Learning]]
+[[Category:Machine learning]]
-[[Category:Artificial Intelligence]]
-[[Category:Data Mining]]
-[[Category:Computer Science]]
-{{Machine learning bar}}
-{{Artificial intelligence}}
-{{Computer science}}
-{{Data mining}}
-{{stub|computer}}