Supervised vs Unsupervised Learning

Authors
  • Amit Shekhar

I am Amit Shekhar, Founder @ Outcome School. I have taught and mentored many developers whose efforts landed them high-paying tech jobs, helped many tech companies solve their unique problems, and created many open-source libraries used by top companies. I am passionate about sharing knowledge through open source, blogs, and videos.

In this blog, we will learn about supervised vs unsupervised learning in machine learning.

Prerequisite: What is Machine Learning?

As we have learned, machine learning is all about enabling computers to learn patterns from data. But the way they learn depends on whether the data comes with labels (answers) or not. This brings us to two major learning categories:

  • Supervised Learning
  • Unsupervised Learning

Both solve completely different kinds of problems, and understanding the difference is essential before you start building any ML model.

Let’s begin.

Supervised Learning

Supervised learning is like learning with a teacher.

You can also think of it this way: You’re solving math problems with the answer key next to you. You learn by comparing your answers to the correct ones.

You provide the model with:

  • Input data
  • Correct output labels

The goal is for the model to learn the mapping between inputs and outputs so that it can make predictions on new, unseen data.

Let's look at a few examples of supervised learning:

  • Predicting house prices
  • Classifying emails as spam or not spam
  • Recognizing digits from images
  • Predicting whether a customer will churn

How does it work?

  • Feed the model data + correct labels (answers).
  • Model makes predictions.
  • Compare predictions with the correct labels (answers).
  • Adjust weights using an optimization algorithm.
  • Repeat until error is minimized.
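
To make these steps concrete, here is a minimal sketch of the loop for a one-feature linear model trained with gradient descent. The data is made up for illustration (it follows y = 2x + 1), and the names `train`, `learning_rate`, and `epochs` are our own assumptions, not part of any library.

```python
# Made-up training data that follows y = 2x + 1 exactly.
inputs = [1.0, 2.0, 3.0, 4.0]
labels = [3.0, 5.0, 7.0, 9.0]


def train(inputs, labels, learning_rate=0.05, epochs=2000):
    weight, bias = 0.0, 0.0
    n = len(inputs)
    for _ in range(epochs):
        # Feed the data to the model and let it make predictions.
        predictions = [weight * x + bias for x in inputs]
        # Compare predictions with the correct labels.
        errors = [p - y for p, y in zip(predictions, labels)]
        # Adjust the weights: one gradient descent step on mean squared error.
        grad_w = 2 * sum(e * x for e, x in zip(errors, inputs)) / n
        grad_b = 2 * sum(errors) / n
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    # Here "repeat until error is minimized" is a fixed number of epochs.
    return weight, bias


weight, bias = train(inputs, labels)
print(round(weight, 3), round(bias, 3))  # 2.0 1.0
```

Real models have many more weights and usually stop based on a validation error rather than a fixed epoch count, but the loop has the same shape.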

Types of Supervised Learning

  • Classification: Categorizing data into predefined classes (spam detection, image recognition)
  • Regression: Predicting continuous numerical values (house prices, stock prices, temperature forecasting)
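
The practical difference between the two types is the kind of output. The sketch below uses a simple nearest-neighbors approach for both (our choice for illustration, since it needs very little code): the classifier returns one of the predefined classes, while the regressor returns a number. The emails, temperature readings, and function names are all made up.

```python
import math
from collections import Counter


def nearest_targets(train, query, k):
    """Return the targets of the k training rows closest to the query."""
    by_distance = sorted(train, key=lambda pair: math.dist(pair[0], query))
    return [target for _, target in by_distance[:k]]


def classify(train, query, k=3):
    # Classification: the output is one of the predefined classes (majority vote).
    return Counter(nearest_targets(train, query, k)).most_common(1)[0][0]


def regress(train, query, k=3):
    # Regression: the output is a continuous number (average of the neighbors).
    targets = nearest_targets(train, query, k)
    return sum(targets) / len(targets)


# Made-up emails: (number of links, number of exclamation marks) -> class
emails = [((0, 0), "not spam"), ((1, 0), "not spam"), ((0, 1), "not spam"),
          ((7, 5), "spam"), ((8, 6), "spam"), ((9, 4), "spam")]
# Made-up readings: (hour of day,) -> temperature in degrees Celsius
temps = [((6,), 18.0), ((9,), 22.0), ((12,), 27.0), ((15,), 29.0), ((18,), 24.0)]

print(classify(emails, (8, 5)))  # spam
print(regress(temps, (10,)))     # average of the readings at 9, 12, and 6 o'clock
```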

Now, let's look at the example dataset for supervised learning.

A simple example dataset used for house price prediction. It contains the features (Area, Bedrooms, Location Score) and the label (Price) that supervised learning needs.

| Area (sq ft) | Bedrooms | Location Score | Price (₹ Lakhs) |
|---|---|---|---|
| 1200 | 2 | 7.5 | 85 |
| 1500 | 3 | 8.0 | 110 |
| 900 | 2 | 6.8 | 70 |
| 1800 | 3 | 8.5 | 130 |
| 1100 | 2 | 7.0 | 78 |

Here, Price is the label (the output variable). The model learns to predict price based on the features.

So, when you provide new, unseen input data (area, number of bedrooms, and location score), the model can predict the price.
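
As a rough sketch, we can fit a linear regression to the five rows of this dataset using ordinary least squares. This is only one possible model, and the helper names (`solve`, `fit_linear_regression`, `predict`) and the new house are hypothetical. With just five rows, the fitted numbers are for illustration and should not be read as a realistic price model.

```python
# Rows from the house price dataset: (area_sqft, bedrooms, location_score)
features = [
    (1200, 2, 7.5),
    (1500, 3, 8.0),
    (900, 2, 6.8),
    (1800, 3, 8.5),
    (1100, 2, 7.0),
]
prices = [85, 110, 70, 130, 78]  # label: price in lakhs


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_linear_regression(rows, targets):
    # Prepend 1.0 to every row so the first coefficient acts as the intercept.
    X = [[1.0, *row] for row in rows]
    k = len(X[0])
    # Normal equations: (X^T X) coefs = X^T y
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(X, targets)) for i in range(k)]
    return solve(xtx, xty)


def predict(coefs, row):
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


coefs = fit_linear_regression(features, prices)
new_house = (1400, 3, 7.8)  # hypothetical unseen input
print(round(predict(coefs, new_house), 1))
```

In practice you would use far more rows and a library implementation, but the idea is the same: learn from labeled rows, then predict the label for a new row.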

Unsupervised Learning

Unsupervised learning is like learning without a teacher.

You can also think of it this way: You’re exploring a new city without a map. You observe patterns, such as busy areas and quiet areas, even though no one tells you which is which.

Here, the dataset contains only input data: no labels and no predefined outputs.

The goal is to uncover hidden patterns, structures, or groupings within the data.

Let's look at a few examples of unsupervised learning:

  • Grouping customers into segments
  • Finding patterns in website user behavior
  • Detecting anomalies (fraud, unusual transactions)
  • Reducing dimensions (PCA - Principal Component Analysis)

How does it work?

  • Feed the model only raw data.
  • Model analyzes the structure.
  • It tries to group, compress, or find relationships without external guidance.

Types of Unsupervised Learning

  • Clustering: Grouping similar data points together (K-Means).
  • Dimensionality Reduction: Compressing data while preserving important information (PCA - Principal Component Analysis).
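
To see what "compressing data while preserving important information" can look like, here is a minimal sketch of PCA that reduces made-up 2-D points to a single number each. It finds the first principal component with power iteration, which is a simple technique we picked for illustration. The data and function names are our own assumptions.

```python
import math

# Made-up 2-D points that lie close to a straight line.
points = [(1.0, 1.0), (2.0, 2.1), (3.0, 2.9), (4.0, 4.2), (5.0, 4.8)]


def covariance_matrix(rows):
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    return cov, centered


def first_component(cov, iterations=100):
    """Power iteration: repeatedly multiply a vector by cov and renormalize."""
    # Assumes the start vector is not orthogonal to the first component.
    vec = [1.0] + [0.0] * (len(cov) - 1)
    for _ in range(iterations):
        vec = [sum(row[j] * vec[j] for j in range(len(vec))) for row in cov]
        norm = math.sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]
    return vec


cov, centered = covariance_matrix(points)
component = first_component(cov)

# Project each centered 2-D point onto the component: one number per point.
reduced = [sum(c * v for c, v in zip(row, component)) for row in centered]

# Fraction of the total variance kept by the 1-D representation.
total_variance = sum(cov[i][i] for i in range(len(cov)))
variance_kept = (sum(x * x for x in reduced) / (len(points) - 1)) / total_variance
print(round(variance_kept, 3))  # 0.998
```

Each point is now described by one number instead of two, and almost all of the variance in the data is preserved.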

Now, let's look at the example dataset for unsupervised learning.

An example dataset used for fitness app user segmentation: Unlabeled Raw Fitness App User Data

A fitness app wants to understand user behavior and identify different types of users even though no labels exist.

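| User ID | Daily Steps | Active Minutes | Weekly Workouts | Avg Heart Rate | Sleep Hours | Water Intake (L) | App Engagement Score |
|---|---|---|---|---|---|---|---|
| 1 | 12,500 | 75 | 5 | 72 | 7.5 | 2.8 | 88 |
| 2 | 9,200 | 50 | 3 | 78 | 6.8 | 2.1 | 75 |
| 3 | 4,800 | 18 | 1 | 85 | 6.0 | 1.5 | 52 |
| 4 | 15,000 | 92 | 6 | 68 | 8.0 | 3.2 | 93 |
| 5 | 8,000 | 40 | 2 | 80 | 6.5 | 1.9 | 67 |
| 6 | 13,400 | 80 | 4 | 74 | 7.2 | 2.6 | 85 |
| 7 | 5,600 | 25 | 1 | 82 | 6.2 | 1.7 | 58 |
| 8 | 10,500 | 60 | 3 | 76 | 7.0 | 2.3 | 78 |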

There is no label column here. An algorithm like K-Means will try to group users into clusters based on these features.

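After applying K-Means, each user is assigned to a cluster. K-Means itself only outputs cluster numbers. The segment names in the table below are added afterward by looking at what the users in each cluster have in common, so that the clusters map to real-world fitness personas.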

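| User ID | Daily Steps | Active Minutes | Weekly Workouts | Avg Heart Rate | Sleep Hours | Water Intake (L) | App Engagement Score | Segment |
|---|---|---|---|---|---|---|---|---|
| 1 | 12,500 | 75 | 5 | 72 | 7.5 | 2.8 | 88 | Active Lifestyle Enthusiasts |
| 2 | 9,200 | 50 | 3 | 78 | 6.8 | 2.1 | 75 | Moderate Fitness Users |
| 3 | 4,800 | 18 | 1 | 85 | 6.0 | 1.5 | 52 | Low Activity Busy Users |
| 4 | 15,000 | 92 | 6 | 68 | 8.0 | 3.2 | 93 | Active Lifestyle Enthusiasts |
| 5 | 8,000 | 40 | 2 | 80 | 6.5 | 1.9 | 67 | Moderate Fitness Users |
| 6 | 13,400 | 80 | 4 | 74 | 7.2 | 2.6 | 85 | Active Lifestyle Enthusiasts |
| 7 | 5,600 | 25 | 1 | 82 | 6.2 | 1.7 | 58 | Low Activity Busy Users |
| 8 | 10,500 | 60 | 3 | 76 | 7.0 | 2.3 | 78 | Moderate Fitness Users |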

Without any labels, the model still identifies meaningful clusters, which we can interpret as Active Lifestyle Enthusiasts, Moderate Fitness Users, and Low Activity Busy Users.
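
Here is a minimal K-Means sketch on the eight users from the fitness dataset. Two choices in it are our own assumptions and were not stated earlier: every feature is standardized first so that Daily Steps does not dominate the distance, and the initial centroids are simply the first three rows instead of random points, which keeps the result repeatable.

```python
import math

# Rows from the fitness dataset:
# steps, active minutes, workouts, heart rate, sleep hours, water (L), engagement
users = [
    (12500, 75, 5, 72, 7.5, 2.8, 88),
    (9200, 50, 3, 78, 6.8, 2.1, 75),
    (4800, 18, 1, 85, 6.0, 1.5, 52),
    (15000, 92, 6, 68, 8.0, 3.2, 93),
    (8000, 40, 2, 80, 6.5, 1.9, 67),
    (13400, 80, 4, 74, 7.2, 2.6, 85),
    (5600, 25, 1, 82, 6.2, 1.7, 58),
    (10500, 60, 3, 76, 7.0, 2.3, 78),
]


def standardize(rows):
    """Rescale every column to mean 0 and standard deviation 1."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n)
            for j in range(d)]
    return [tuple((r[j] - means[j]) / stds[j] for j in range(d)) for r in rows]


def k_means(rows, k, max_iterations=100):
    # Simplification: start from the first k rows instead of random points.
    centroids = [rows[i] for i in range(k)]
    assignments = None
    for _ in range(max_iterations):
        # Assign each row to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(row, centroids[c]))
            for row in rows
        ]
        if new_assignments == assignments:
            break  # nothing changed, so the clusters are stable
        assignments = new_assignments
        # Move each centroid to the mean of the rows assigned to it.
        for c in range(k):
            members = [row for row, a in zip(rows, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return assignments


clusters = k_means(standardize(users), k=3)
print(clusters)  # [0, 1, 2, 0, 1, 0, 2, 1]
```

The output contains only cluster numbers. Users 1, 4, and 6 share one cluster, users 2, 5, and 8 share another, and users 3 and 7 share the third, which matches the segments in the table. Naming those clusters is still up to us.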

Differences Between Supervised and Unsupervised Learning

Let's summarize the differences in a tabular format.

| Supervised Learning | Unsupervised Learning |
|---|---|
| Labeled Data | Unlabeled Data |
| Predicts output label. | Discovers hidden patterns, groups similar data, or reduces data dimensionality. |
| Types: Classification, Regression | Types: Clustering, Dimensionality Reduction |
| Examples: Spam detection, price prediction | Examples: Customer segmentation, anomaly detection |
| Algorithms: Linear Regression, Logistic Regression, Decision Trees, Random Forests | Algorithms: K-Means Clustering, Hierarchical Clustering, Principal Component Analysis (PCA) |
| Use it when you know the target variable, you want predictions, and you have labeled data or can generate labels. | Use it when you want to explore the data, labels are not available, and you need natural groupings or structure. |

Both supervised and unsupervised learning are essential parts of machine learning. Supervised learning helps you predict, while unsupervised learning helps you discover.

Prepare for your Machine Learning interview: Machine Learning Interview Questions

That's it for now.

Thanks

Amit Shekhar
Founder @ Outcome School
