Brief on Machine Learning

From basic data loading to separating cats & dogs (and zeros from ones).
Lecture 3-1

1. The Hype: AI vs ML vs DL 🤔

Slides 2-7

The Hierarchy 🧅

  • AI: Mimicking human behavior (1950s+).
  • Machine Learning (ML): Learning from data without explicit programming (1980s+).
  • Deep Learning (DL): Neural Networks with many layers (2010s+).
🧠 ➡️ 💻

"Machine learning gives computers the ability to learn without being explicitly programmed."

2. Loading MNIST Data 👋

Slides 8-13

Before we analyze, we must load the data. We use the famous MNIST dataset of handwritten digits.

Initializing Pyodide... Run this block first to load the ~11 MB dataset; it may take a few seconds.
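A minimal loading sketch, assuming scikit-learn's `fetch_openml` (the Pyodide cell presumably ships a bundled file instead; the loader and the variable names `X` and `y`, reused in later snippets, are assumptions):

```python
import numpy as np
from sklearn.datasets import fetch_openml

# Download MNIST: 70,000 images of 28x28 = 784 grayscale pixels each.
mnist = fetch_openml("mnist_784", version=1, as_frame=False)
X = mnist.data.astype(np.float32) / 255.0  # scale pixels to [0, 1]
y = mnist.target.astype(int)               # digit labels 0-9

print(X.shape, y.shape)  # (70000, 784) (70000,)
```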

Visualizing a Single Digit

Let's plot the first sample to verify it's a digit.
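A plotting sketch with matplotlib, assuming the `X` and `y` arrays from the loading step:

```python
import matplotlib.pyplot as plt

# Reshape the flat 784-pixel vector back into a 28x28 image.
plt.imshow(X[0].reshape(28, 28), cmap="gray")
plt.title(f"Label: {y[0]}")
plt.axis("off")
plt.show()
```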

Separating 0 and 1

We filter the dataset to keep only zeros and ones.
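One way to do it, with a boolean mask (the names `X01` and `y01` are ours and carry through the next snippets):

```python
# Keep only the samples labeled 0 or 1.
mask = (y == 0) | (y == 1)
X01, y01 = X[mask], y[mask]
print(X01.shape)
```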

3. Feature Extraction & ROC 📉

Slides 14-23

Raw pixels (784 dimensions per image) are hard to classify directly. Let's create features!
Idea: zeros have a hole in the middle, so their central pixels stay dark; ones are a solid stroke running right through the center.

Calculating mean intensities...
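A sketch of two candidate features per image; the exact bounds of the central patch (rows and columns 10-18) are an assumption:

```python
imgs = X01.reshape(-1, 28, 28)

# Feature 1: mean intensity over the whole image.
f_whole = imgs.mean(axis=(1, 2))
# Feature 2: mean intensity of the central patch, where the "hole"
# of a 0 should be dark and the stroke of a 1 should be bright.
f_center = imgs[:, 10:18, 10:18].mean(axis=(1, 2))
```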

Calculating AUC (Area Under Curve)

The AUC tells us which feature separates the classes better: 0.5 is no better than chance, 1.0 is perfect.
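Using scikit-learn's `roc_auc_score` as a shortcut (the lecture may trace the full ROC curve instead):

```python
from sklearn.metrics import roc_auc_score

# AUC = probability that a random "1" scores above a random "0".
# Near 0.5 is chance; near 0 or 1 is strong (possibly inverted) separation.
print("whole-image AUC :", roc_auc_score(y01, f_whole))
print("center-patch AUC:", roc_auc_score(y01, f_center))
```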

4. Fisher's Discriminant & LDA 📐

Slides 24-37

Fisher's Discriminant finds the best linear combination of features to separate the classes: it maximizes the distance between the projected class means ($\vec{\mu}_0$, $\vec{\mu}_1$) while minimizing the within-class variance (covariances $\Sigma_0$, $\Sigma_1$).

$$ \vec{w} \propto (\Sigma_0 + \Sigma_1)^{-1}(\vec{\mu}_1 - \vec{\mu}_0) $$

Visualizing Fisher's Separation
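A sketch that computes $\vec{w}$ straight from the formula above and histograms the resulting 1-D projections (all names carried over from the feature step):

```python
import numpy as np
import matplotlib.pyplot as plt

# Stack the two features and split by class.
F = np.column_stack([f_whole, f_center])
F0, F1 = F[y01 == 0], F[y01 == 1]
mu0, mu1 = F0.mean(axis=0), F1.mean(axis=0)

# w ∝ (Σ0 + Σ1)^(-1) (μ1 − μ0)
Sw = np.cov(F0, rowvar=False) + np.cov(F1, rowvar=False)
w = np.linalg.solve(Sw, mu1 - mu0)

# Project every sample onto w; good separation = two disjoint histograms.
proj = F @ w
plt.hist(proj[y01 == 0], bins=50, alpha=0.6, label="0")
plt.hist(proj[y01 == 1], bins=50, alpha=0.6, label="1")
plt.xlabel("projection onto w")
plt.legend()
plt.show()
```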


LDA with Scikit-Learn

Now let's use the professional tool, scikit-learn's `LinearDiscriminantAnalysis`.
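A minimal sketch; the train/test split parameters are assumptions:

```python
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

Ftr, Fte, ytr, yte = train_test_split(F, y01, test_size=0.2, random_state=0)
lda = LinearDiscriminantAnalysis().fit(Ftr, ytr)
print("LDA accuracy:", lda.score(Fte, yte))
```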

5. Support Vector Machines (SVM) ⚔️

Slides 40-49

SVM maximizes the margin between classes. The boundary is determined entirely by the borderline samples, the support vectors.

Visualizing the Decision Boundary

The hyperplane (in our 2-D feature space, just a line) that separates 0s and 1s.
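A sketch that fits a linear `SVC` on our two features and draws the boundary line $w_0 x + w_1 y + b = 0$ (the plotting details are ours):

```python
from sklearn.svm import SVC
import numpy as np
import matplotlib.pyplot as plt

svm = SVC(kernel="linear").fit(Ftr, ytr)
(w0, w1), b = svm.coef_[0], svm.intercept_[0]

plt.scatter(Fte[:, 0], Fte[:, 1], c=yte, cmap="coolwarm", s=5)
xs = np.linspace(Fte[:, 0].min(), Fte[:, 0].max(), 100)
plt.plot(xs, -(w0 * xs + b) / w1, "k-")  # points where w0*x + w1*y + b = 0
plt.xlabel("whole-image mean")
plt.ylabel("center-patch mean")
plt.show()
```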

The Grand Finale: Full MNIST Classification

This might take a few seconds...
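One plausible setup: an RBF-kernel `SVC` on raw pixels of all ten digits. The lecture's exact kernel and sample sizes aren't shown, so the numbers below are assumptions chosen to keep the runtime short:

```python
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

# Subsample to keep training fast (full MNIST would take much longer).
Xtr, Xte, ytr10, yte10 = train_test_split(X, y, train_size=5000,
                                          test_size=1000, random_state=0)
clf = SVC(kernel="rbf").fit(Xtr, ytr10)
print("10-class accuracy:", clf.score(Xte, yte10))
```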

6. The Non-Linear Problem 🍩

Slides 51-54

What if the data isn't linearly separable, like a red dot inside a blue ring? No straight line can split them. We need the Kernel Trick: implicitly map the points into a higher-dimensional space where a linear boundary does exist.

Linear Kernel (Fail)

RBF Kernel (Success)
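A sketch comparing the two kernels on synthetic dot-in-a-ring data from scikit-learn's `make_circles`:

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# A dot inside a ring: not linearly separable in 2-D.
Xc, yc = make_circles(n_samples=500, factor=0.3, noise=0.05, random_state=0)

for kernel in ("linear", "rbf"):
    acc = SVC(kernel=kernel).fit(Xc, yc).score(Xc, yc)
    print(f"{kernel:>6} kernel training accuracy: {acc:.2f}")
# The linear kernel can't beat chance by much; the RBF kernel implicitly
# maps the points to a space where the ring becomes linearly separable.
```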