05. Linear & Logistic Regression



Overview

Linear regression and logistic regression are the most fundamental building blocks of deep learning. Each layer of a neural network is essentially the composition of a linear transformation and a nonlinear activation.
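In code, that per-layer composition looks like the following minimal sketch (the names layer, W, b and the choice of ReLU as the activation are illustrative, not from this repository):

import numpy as np

def layer(x, W, b):
    z = x @ W + b              # linear transformation
    return np.maximum(z, 0.0)  # nonlinear activation (ReLU, as one example)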

Learning Objectives

  1. Mathematical understanding
     - How gradient descent works
     - Loss functions (MSE, cross-entropy)
     - Matrix calculus
  2. Implementation skills
     - Implementing the forward/backward pass by hand
     - Weight initialization
     - Writing a training loop
  3. Hands-on practice
     - Binary classification on MNIST
     - Overfitting/regularization experiments

์ˆ˜ํ•™์  ๋ฐฐ๊ฒฝ

1. Linear Regression

๋ชจ๋ธ:    ลท = Xw + b
์†์‹ค:    L = (1/2n) ฮฃ(y - ลท)ยฒ  (MSE)

Gradients:
โˆ‚L/โˆ‚w = (1/n) X^T (ลท - y)
โˆ‚L/โˆ‚b = (1/n) ฮฃ(ลท - y)

์—…๋ฐ์ดํŠธ:
w โ† w - ฮท ร— โˆ‚L/โˆ‚w
b โ† b - ฮท ร— โˆ‚L/โˆ‚b

2. Logistic Regression

๋ชจ๋ธ:    z = Xw + b
         ลท = ฯƒ(z) = 1/(1 + e^(-z))

์†์‹ค:    L = -(1/n) ฮฃ[yยทlog(ลท) + (1-y)ยทlog(1-ลท)]  (BCE)

๊ทธ๋ž˜๋””์–ธํŠธ:
โˆ‚L/โˆ‚w = (1/n) X^T (ลท - y)  โ† ๋†€๋ž๊ฒŒ๋„ Linear์™€ ๊ฐ™์€ ํ˜•ํƒœ!
โˆ‚L/โˆ‚b = (1/n) ฮฃ(ลท - y)
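Because the gradients share the same form, the training loop is nearly identical; a minimal sketch (the toy labels and the value of eta are illustrative):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Toy binary labels from a linearly separable rule
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

w = np.zeros(2)
b = 0.0
eta = 0.5

for epoch in range(500):
    y_pred = sigmoid(X @ w + b)           # ลท = ฯƒ(Xw + b)
    grad_w = X.T @ (y_pred - y) / len(y)  # same form as the linear case
    grad_b = np.mean(y_pred - y)
    w -= eta * grad_w
    b -= eta * grad_b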

ํŒŒ์ผ ๊ตฌ์กฐ

01_Linear_Logistic/
โ”œโ”€โ”€ README.md                 # This file
โ”œโ”€โ”€ theory.md                 # Detailed theory (mathematical derivations)
โ”œโ”€โ”€ numpy/
โ”‚   โ”œโ”€โ”€ linear_numpy.py       # Linear regression (NumPy)
โ”‚   โ”œโ”€โ”€ logistic_numpy.py     # Logistic regression (NumPy)
โ”‚   โ””โ”€โ”€ test_numpy.py         # Unit tests
โ”œโ”€โ”€ pytorch_lowlevel/
โ”‚   โ”œโ”€โ”€ linear_lowlevel.py    # Uses basic PyTorch ops
โ”‚   โ””โ”€โ”€ logistic_lowlevel.py
โ”œโ”€โ”€ paper/
โ”‚   โ””โ”€โ”€ linear_paper.py       # Clean nn.Module implementation
โ””โ”€โ”€ exercises/
    โ”œโ”€โ”€ 01_regularization.md  # Add L1/L2 regularization
    โ””โ”€โ”€ 02_softmax.md         # Softmax extension

Quick Start

Run the NumPy implementations

cd numpy/
python linear_numpy.py      # Train linear regression
python logistic_numpy.py    # Train logistic regression
python test_numpy.py        # Run the tests

Run the PyTorch implementations

cd pytorch_lowlevel/
python linear_lowlevel.py

Key Concepts

1. Gradient Descent

# The basic algorithm: forward, loss, backward, update
for epoch in range(n_epochs):
    # Forward
    y_pred = model.forward(X)

    # Loss
    loss = compute_loss(y, y_pred)

    # Backward (compute gradients; the inputs X are needed for the chain rule)
    gradients = compute_gradients(X, y, y_pred)

    # Update
    model.weights -= learning_rate * gradients

2. Matrix Calculus (important!)

Useful identities (in the gradient / denominator-layout convention used above):

โˆ‚(Xw)/โˆ‚w = X^T
โˆ‚(w^T X^T)/โˆ‚w = X
โˆ‚(||Xw - y||ยฒ)/โˆ‚w = 2 X^T (Xw - y)
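A quick way to build trust in the last identity is a finite-difference check; a minimal sketch (sizes and the step eps are illustrative):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
y = rng.normal(size=5)
w = rng.normal(size=3)

loss = lambda w: np.sum((X @ w - y) ** 2)  # ||Xw - y||ยฒ
analytic = 2 * X.T @ (X @ w - y)           # 2 X^T (Xw - y)

eps = 1e-6
numeric = np.array([
    (loss(w + eps * np.eye(3)[i]) - loss(w - eps * np.eye(3)[i])) / (2 * eps)
    for i in range(3)
])
print(np.allclose(analytic, numeric, atol=1e-4))  # expect: True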

3. Sigmoid and Its Derivative

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def sigmoid_derivative(z):
    s = sigmoid(z)
    return s * (1 - s)  # ฯƒ'(z) = ฯƒ(z)(1 - ฯƒ(z))
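For completeness, the identity follows directly from the quotient rule: ฯƒ'(z) = e^(-z) / (1 + e^(-z))ยฒ = ฯƒ(z) ยท (1 - ฯƒ(z)), since e^(-z)/(1 + e^(-z)) = 1 - ฯƒ(z).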

Exercises

Basic

  1. Implement linear regression without a bias term
  2. Vary the learning rate (lr) and observe the convergence speed
  3. Compare batch vs. stochastic gradient descent

Intermediate

  1. Add L2 regularization (Ridge); see the sketch after this list
  2. Add L1 regularization (Lasso)
  3. Implement mini-batch GD
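As a starter for the Ridge exercise, the only change from the earlier linear-regression loop is an extra 2ฮปw term in the weight gradient; a minimal sketch (lam is an illustrative name for the penalty strength ฮป):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=100)

w, b = np.zeros(3), 0.0
eta, lam = 0.1, 0.01

for epoch in range(200):
    y_pred = X @ w + b
    # L = MSE + ฮป||w||ยฒ, so โˆ‚L/โˆ‚w gains 2ฮปw (the bias is usually not penalized)
    grad_w = X.T @ (y_pred - y) / len(y) + 2 * lam * w
    grad_b = np.mean(y_pred - y)
    w -= eta * grad_w
    b -= eta * grad_b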

Advanced

  1. Implement the Momentum and Adam optimizers (a momentum sketch follows this list)
  2. Implement early stopping
  3. Extend to softmax regression (multi-class)
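For the optimizer exercise, classical momentum keeps a running velocity alongside the weights; a minimal sketch of just the update rule (the function and parameter names are illustrative):

import numpy as np

def momentum_step(w, grad, v, eta=0.1, beta=0.9):
    # v โ† ฮฒv + โˆ‚L/โˆ‚w ; w โ† w - ฮทv
    v = beta * v + grad
    w = w - eta * v
    return w, v

# Inside a training loop, with v initialized to np.zeros_like(w):
#   w, v = momentum_step(w, grad_w, v)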

์ฐธ๊ณ  ์ž๋ฃŒ

to navigate between lessons