05. Linear & Logistic Regression

Previous: Training Techniques | Next: Multi-Layer Perceptron (MLP)


Overview

Linear regression and logistic regression are the fundamental building blocks of deep learning: each layer of a neural network is essentially a linear transformation followed by a nonlinear activation.

Learning Objectives

  1. Mathematical understanding
     - Gradient descent principles
     - Loss functions (MSE, cross-entropy)
     - Matrix differentiation
  2. Implementation skills
     - Direct implementation of the forward/backward pass
     - Weight initialization
     - Writing training loops
  3. Practice
     - MNIST binary classification
     - Overfitting/regularization experiments

Mathematical Background

1. Linear Regression

Model:    ŷ = Xw + b
Loss:     L = (1/2n) Σ(y − ŷ)²  (MSE)

Gradients:
∂L/∂w = (1/n) X^T (ŷ − y)
∂L/∂b = (1/n) Σ(ŷ − y)

Update:
w ← w − η × ∂L/∂w
b ← b − η × ∂L/∂b

2. Logistic Regression

Model:    z = Xw + b
          ŷ = σ(z) = 1/(1 + e^(−z))

Loss:     L = −(1/n) Σ[y·log(ŷ) + (1−y)·log(1−ŷ)]  (BCE)

Gradients:
∂L/∂w = (1/n) X^T (ŷ − y)  ← surprisingly, the same form as linear regression!
∂L/∂b = (1/n) Σ(ŷ − y)
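The coincidence is no accident: by the chain rule, the BCE derivative with respect to ŷ and the sigmoid derivative σ'(z) = ŷ(1 − ŷ) cancel, leaving exactly ŷ − y. A quick check with hypothetical toy values:

```python
import numpy as np

# Verify the chain-rule collapse (per-sample, before averaging):
# dL/dz = dL/dŷ · σ'(z) = ŷ − y
z = np.array([-2.0, 0.5, 3.0])
y = np.array([0.0, 1.0, 1.0])
yhat = 1 / (1 + np.exp(-z))            # ŷ = σ(z)

dL_dyhat = (yhat - y) / (yhat * (1 - yhat))  # ∂L/∂ŷ for BCE
dsig = yhat * (1 - yhat)                     # σ'(z) = σ(z)(1 − σ(z))
assert np.allclose(dL_dyhat * dsig, yhat - y)
```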

File Structure

01_Linear_Logistic/
├── README.md                 # This file
├── theory.md                 # Detailed theory (mathematical derivations)
├── numpy/
│   ├── linear_numpy.py       # Linear Regression (NumPy)
│   ├── logistic_numpy.py     # Logistic Regression (NumPy)
│   └── test_numpy.py         # Unit tests
├── pytorch_lowlevel/
│   ├── linear_lowlevel.py    # Using PyTorch basic ops
│   └── logistic_lowlevel.py
├── paper/
│   └── linear_paper.py       # Clean nn.Module implementation
└── exercises/
    ├── 01_regularization.md  # Add L1/L2 regularization
    └── 02_softmax.md         # Extend to Softmax

Quick Start

Running NumPy Implementation

cd numpy/
python linear_numpy.py      # Train linear regression
python logistic_numpy.py    # Train logistic regression
python test_numpy.py        # Run tests

Running PyTorch Implementation

cd pytorch_lowlevel/
python linear_lowlevel.py

Core Concepts

1. Gradient Descent

# Basic algorithm (schematic pseudocode)
for epoch in range(n_epochs):
    # Forward pass: compute predictions
    y_pred = model.forward(X)

    # Loss
    loss = compute_loss(y, y_pred)

    # Backward pass: gradients need the inputs X as well as the errors
    gradients = compute_gradients(X, y, y_pred)

    # Update: step against the gradient
    model.weights -= learning_rate * gradients

2. Matrix Differentiation (Important!)

∂(Xw)/∂w = X^T
∂(w^T X^T)/∂w = X
∂(||Xw − y||²)/∂w = 2 X^T (Xw − y)
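Identities like the last one are easy to sanity-check against a central-difference numerical gradient; a small sketch with hypothetical toy matrices:

```python
import numpy as np

# Check ∂(||Xw − y||²)/∂w = 2 X^T (Xw − y) numerically on toy data.
rng = np.random.default_rng(2)
X = rng.normal(size=(10, 4))
y = rng.normal(size=10)
w = rng.normal(size=4)

analytic = 2 * X.T @ (X @ w - y)

f = lambda v: np.sum((X @ v - y) ** 2)   # scalar loss ||Xv − y||²
eps = 1e-6
numeric = np.array([
    (f(w + eps * np.eye(4)[i]) - f(w - eps * np.eye(4)[i])) / (2 * eps)
    for i in range(4)
])
```

The two gradients should agree to several decimal places; this trick also works for debugging a hand-written backward pass.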

3. Sigmoid and Its Derivative

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def sigmoid_derivative(z):
    s = sigmoid(z)
    return s * (1 - s)  # σ(z)(1 - σ(z))
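Note that `np.exp(-z)` overflows for large negative `z`. A numerically stable variant (my addition, not necessarily how the repository's files implement it) branches on the sign so the exponent is always non-positive:

```python
import numpy as np

def sigmoid_stable(z):
    # Assumption: this helper is not in the original files; it avoids
    # overflow by only ever exponentiating non-positive values.
    z = np.asarray(z, dtype=float)
    out = np.empty_like(z)
    pos = z >= 0
    out[pos] = 1 / (1 + np.exp(-z[pos]))      # safe: exponent ≤ 0
    ez = np.exp(z[~pos])                       # safe: z < 0 here
    out[~pos] = ez / (1 + ez)                  # algebraically identical form
    return out
```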

Practice Problems

Basic

  1. Implement Linear Regression without bias
  2. Observe convergence speed with different learning rates (lr)
  3. Compare Batch vs Stochastic Gradient Descent

Intermediate

  1. Add L2 regularization (Ridge)
  2. Add L1 regularization (Lasso)
  3. Implement Mini-batch GD
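For the L2 (Ridge) exercise, the only change to the gradients derived earlier is an extra λw term, since the loss gains (λ/2)·||w||²; the bias is conventionally left unregularized. A hypothetical sketch (function name and signature are my own, not from the exercise files):

```python
import numpy as np

def ridge_gradients(X, y, w, b, lam):
    # Gradients of (1/2n)Σ(y − ŷ)² + (λ/2)||w||² for ŷ = Xw + b.
    n = len(y)
    y_hat = X @ w + b
    grad_w = X.T @ (y_hat - y) / n + lam * w   # extra λw term from L2
    grad_b = (y_hat - y).mean()                # bias not regularized
    return grad_w, grad_b
```

Setting `lam=0` recovers the plain linear-regression gradients, which is a convenient regression test.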

Advanced

  1. Implement Momentum, Adam optimizers
  2. Implement Early Stopping
  3. Extend to Softmax Regression (multi-class)
