Machine Learning

Unsupervised Learning

Unsupervised learning is the branch of machine learning that works with unlabelled data, that is, data without explicit tags (labels). Its algorithms train on the raw data and then group or otherwise analyse it without being told a correct answer. These methods operate by finding similarities and differences in the data, and in doing so they discover hidden patterns and groupings.

Types of Unsupervised Learning

There are three main types of unsupervised learning: clustering, association rules, and dimensionality reduction.

Clustering

Clustering algorithms aim to find the similarities and differences between data points and group them into clusters (groups). These algorithms discover hidden patterns in the data that a human might not have spotted. Common types of clustering algorithms are probabilistic clustering, hierarchical clustering, and K-means.

  • Example of clustering
    A local retail firm wants to identify low-, medium-, and high-spenders. The input data covers thousands of families, each described by income, household size, and occupation. Training on these features groups the families into three separate clusters according to their similarities and differences.
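
The retail example above can be sketched with a minimal K-means implementation. This is an illustrative sketch only: the `kmeans` function, the `families` data, and the choice of the first k points as starting centroids are all assumptions made for the example, and only two numeric features (income and household size) are used.

```python
import math

def kmeans(points, k, iterations=20):
    """Minimal K-means sketch: returns (centroids, labels)."""
    # Assumption: start from the first k points so the run is reproducible.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels

# Hypothetical families: (annual income in thousands, household size).
# The features are left unscaled, so income dominates the distance.
families = [(20, 4), (95, 3), (50, 3), (22, 5), (55, 2),
            (100, 4), (18, 3), (48, 4), (90, 2)]

centroids, labels = kmeans(families, k=3)
for family, label in zip(families, labels):
    print(family, "-> cluster", label)
```

The algorithm only returns numbered clusters. Deciding which cluster corresponds to low, medium, or high spenders is an interpretation step left to the analyst.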

Association Rules

Association rules discover relationships and patterns in the data. These techniques use rule-based methods to find sets of items that frequently occur together (frequent itemsets), usually in large datasets. Common association rule algorithms are Eclat, Apriori, and FP-growth.

  • Example of association rules
    The model aims to recommend content that is likely to increase a specific user's engagement. Historical data on the user's frequent patterns (or on similar users, if no past data is available) is used for training. The model learns the patterns and determines which content to recommend.
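
The idea of frequent itemsets can be sketched with a brute-force support count. Note that this is not Apriori, Eclat, or FP-growth: it simply counts every itemset up to a given size, which is only practical for tiny datasets. The `frequent_itemsets` function and the `sessions` data are hypothetical and exist only for this illustration.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: support} for itemsets meeting min_support."""
    n = len(transactions)
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        # Count every combination of 1..max_size items in this transaction.
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    # Support = fraction of transactions that contain the itemset.
    return {itemset: c / n for itemset, c in counts.items() if c / n >= min_support}

# Hypothetical viewing sessions: the content categories a user opened together.
sessions = [
    {"news", "sports"},
    {"news", "sports", "weather"},
    {"news", "weather"},
    {"sports", "music"},
    {"news", "sports"},
]

supports = frequent_itemsets(sessions, min_support=0.4)

# Confidence of the rule "weather -> news": of the sessions containing
# weather, the fraction that also contain news.
confidence = supports[("news", "weather")] / supports[("weather",)]
print(supports)
print("confidence(weather -> news) =", confidence)
```

A recommender built on such rules would suggest news content to a user who has just viewed weather content, because in this toy data every session containing weather also contains news.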

Dimensionality Reduction

Dimensionality reduction methods reduce the number of features (dimensions) in a dataset. Transforming high-dimensional data into low-dimensional data makes the data more manageable and cheaper to process. Common algorithms are autoencoders, Principal Component Analysis (PCA), and Singular Value Decomposition (SVD).

  • Example of dimensionality reduction
    The objective of the model is to predict whether a person should go to the beach. There are four inputs: sunny, snowing, raining, and thunderstorm. The person would only go to the beach if the weather is sunny, so keeping three separate features for not going is unnecessary. Through dimensionality reduction, the model reduces the features to two: sunny and bad weather.
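
The beach example merges redundant features by reasoning about them; algorithms such as PCA find a comparable compression from the data itself. The following is a minimal sketch of PCA for the special case of two features reduced to one, using the closed-form principal axis of a 2x2 covariance matrix. The `pca_2d_to_1d` function and the `samples` data are assumptions made for illustration, not part of any library.

```python
import math

def pca_2d_to_1d(points):
    """Project 2-D points onto their first principal component."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum(x * x for x, _ in centered) / (n - 1)
    syy = sum(y * y for _, y in centered) / (n - 1)
    sxy = sum(x * y for x, y in centered) / (n - 1)

    # Angle of the direction of maximum variance for a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    direction = (math.cos(theta), math.sin(theta))

    # Each point is now described by one number instead of two.
    scores = [x * direction[0] + y * direction[1] for x, y in centered]
    return direction, scores

# Hypothetical data: the second feature is roughly twice the first,
# so the two features carry largely the same information.
samples = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.0)]

direction, scores = pca_2d_to_1d(samples)
print("principal direction:", direction)
print("1-D representation:", scores)
```

Because the two features are almost perfectly correlated, the single score per sample keeps nearly all of the variation in the original two columns.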

Applications of Unsupervised Learning

  • Object recognition in computer vision
  • Anomaly detection in machinery and fraud
  • Recommendation systems in ecommerce
  • Genetics and species clustering in biology
  • Audience segmentation in marketing and retail
  • Targeted ad content in advertising
