02. Matrix Decompositions

Learning Objectives

  • Understand the concepts of eigenvalues and eigenvectors and compute them
  • Explain the spectral theorem for symmetric matrices and the properties of positive definite matrices
  • Understand the geometric meaning of the singular value decomposition (SVD) and implement it
  • Explain the mathematics behind SVD-based low-rank approximation and PCA
  • Understand the uses and computation of the LU, QR, and Cholesky decompositions
  • Give concrete examples of how matrix decompositions are used in machine learning

1. Eigenvalues and Eigenvectors

1.1 Definition and Meaning

An eigenvector $\mathbf{v}$ of a matrix $A$ is a nonzero vector satisfying

$$A\mathbf{v} = \lambda\mathbf{v}$$

where $\lambda$ is the corresponding eigenvalue.

Geometric interpretation: an eigenvector is a special direction that the linear transformation $A$ does not rotate; vectors along it are only scaled by the factor $\lambda$.

import numpy as np
import matplotlib.pyplot as plt

# 2x2 ํ–‰๋ ฌ์˜ ๊ณ ์œ ๊ฐ’๊ณผ ๊ณ ์œ ๋ฒกํ„ฐ
A = np.array([[3, 1],
              [0, 2]])

eigenvalues, eigenvectors = np.linalg.eig(A)
print(f"Eigenvalues: {eigenvalues}")
print(f"Eigenvectors:\n{eigenvectors}")

# ๊ฒ€์ฆ: Av = ฮปv
for i in range(len(eigenvalues)):
    v = eigenvectors[:, i]
    lam = eigenvalues[i]
    Av = A @ v
    lam_v = lam * v
    print(f"\nEigenvector {i+1}:")
    print(f"Av = {Av}")
    print(f"ฮปv = {lam_v}")
    print(f"Equal? {np.allclose(Av, lam_v)}")

1.2 The Characteristic Equation

The eigenvalues are the roots of the characteristic equation:

$$\det(A - \lambda I) = 0$$

This is a polynomial of degree $n$, so $A$ has $n$ eigenvalues over the complex numbers (counting multiplicity).

# ํŠน์„ฑ๋ฐฉ์ •์‹ ์ˆ˜๋™ ๊ณ„์‚ฐ (2x2 ์˜ˆ์ œ)
A = np.array([[4, 2],
              [1, 3]])

# det(A - λI) = 0
# (4-λ)(3-λ) - 2*1 = 0
# λ² - 7λ + 10 = 0
# (λ-5)(λ-2) = 0

eigenvalues = np.linalg.eigvals(A)
print(f"Eigenvalues: {eigenvalues}")  # [5, 2]

# ๊ฒ€์ฆ: ํŠน์„ฑ๋ฐฉ์ •์‹
for lam in eigenvalues:
    det = np.linalg.det(A - lam * np.eye(2))
    print(f"det(A - {lam}I) = {det:.10f}")  # ~0

1.3 Eigendecomposition

If an $n \times n$ matrix $A$ has $n$ linearly independent eigenvectors, then

$$A = V\Lambda V^{-1}$$

where:

  • $V$: the matrix whose columns are the eigenvectors
  • $\Lambda$: the diagonal matrix with the eigenvalues on its diagonal

# Eigendecomposition
A = np.array([[3, 1],
              [0, 2]])

eigenvalues, V = np.linalg.eig(A)
Lambda = np.diag(eigenvalues)

print(f"V (eigenvectors):\n{V}")
print(f"ฮ› (eigenvalues):\n{Lambda}")

# Reconstruction: A = VΛV^(-1)
A_reconstructed = V @ Lambda @ np.linalg.inv(V)
print(f"A reconstructed:\n{A_reconstructed}")
print(f"Original A:\n{A}")
print(f"Equal? {np.allclose(A, A_reconstructed)}")

1.4 Matrix Powers and Matrix Functions

The eigendecomposition makes matrix powers cheap:

$$A^k = V\Lambda^k V^{-1}$$

Matrix functions can be defined the same way:

$$f(A) = Vf(\Lambda)V^{-1}$$

# Matrix powers
A = np.array([[2, 1],
              [1, 2]])

eigenvalues, V = np.linalg.eig(A)
Lambda = np.diag(eigenvalues)

# Compute A^10 (two ways)
# Method 1: direct multiplication
A_10_direct = np.linalg.matrix_power(A, 10)

# Method 2: via the eigendecomposition
Lambda_10 = np.diag(eigenvalues**10)
A_10_eigen = V @ Lambda_10 @ np.linalg.inv(V)

print(f"A^10 (direct):\n{A_10_direct}")
print(f"A^10 (eigendecomposition):\n{A_10_eigen.real}")
print(f"Equal? {np.allclose(A_10_direct, A_10_eigen.real)}")

# Matrix exponential function
from scipy.linalg import expm
exp_A_scipy = expm(A)
exp_Lambda = np.diag(np.exp(eigenvalues))
exp_A_eigen = V @ exp_Lambda @ np.linalg.inv(V)
print(f"exp(A) match? {np.allclose(exp_A_scipy, exp_A_eigen.real)}")

2. ๋Œ€์นญ ํ–‰๋ ฌ์˜ ์ŠคํŽ™ํŠธ๋Ÿผ ์ •๋ฆฌ

2.1 ์ŠคํŽ™ํŠธ๋Ÿผ ์ •๋ฆฌ (Spectral Theorem)

์‹ค์ˆ˜ ๋Œ€์นญ ํ–‰๋ ฌ $A = A^T$์— ๋Œ€ํ•ด: 1. ๋ชจ๋“  ๊ณ ์œ ๊ฐ’์ด ์‹ค์ˆ˜ 2. ์„œ๋กœ ๋‹ค๋ฅธ ๊ณ ์œ ๊ฐ’์— ๋Œ€์‘ํ•˜๋Š” ๊ณ ์œ ๋ฒกํ„ฐ๋“ค์€ ์ง๊ต 3. ์ง๊ต ๋Œ€๊ฐํ™” ๊ฐ€๋Šฅ: $A = Q\Lambda Q^T$ (์—ฌ๊ธฐ์„œ $Q$๋Š” ์ง๊ต ํ–‰๋ ฌ)

# ๋Œ€์นญ ํ–‰๋ ฌ์˜ ๊ณ ์œ ๊ฐ’ ๋ถ„ํ•ด
A_sym = np.array([[4, 2, 0],
                  [2, 3, 1],
                  [0, 1, 2]])

# ๋Œ€์นญ ํ–‰๋ ฌ์ธ์ง€ ํ™•์ธ
print(f"Is symmetric? {np.allclose(A_sym, A_sym.T)}")

eigenvalues, Q = np.linalg.eigh(A_sym)  # routine specialized for symmetric matrices
Lambda = np.diag(eigenvalues)

print(f"Eigenvalues: {eigenvalues}")
print(f"Q (orthogonal matrix):\n{Q}")

# ์ง๊ต์„ฑ ํ™•์ธ
print(f"Q^T Q = I?\n{Q.T @ Q}")
print(f"Is orthogonal? {np.allclose(Q.T @ Q, np.eye(3))}")

# Reconstruction: A = QΛQ^T
A_reconstructed = Q @ Lambda @ Q.T
print(f"Reconstruction error: {np.linalg.norm(A_sym - A_reconstructed)}")

2.2 Positive Definite Matrices

A symmetric matrix $A$ is positive definite if

$$\mathbf{x}^T A \mathbf{x} > 0 \quad \forall \mathbf{x} \neq \mathbf{0}$$

Equivalent conditions:

  • all eigenvalues are positive
  • all leading principal minors are positive
  • a Cholesky decomposition exists

Positive definite matrices are very important in machine learning (covariance matrices, Hessians, and so on). The sketch below checks the last two conditions directly.

# Example of a positive definite matrix
A_pd = np.array([[2, -1, 0],
                 [-1, 2, -1],
                 [0, -1, 2]])

eigenvalues = np.linalg.eigvalsh(A_pd)
print(f"Eigenvalues: {eigenvalues}")
print(f"All positive? {np.all(eigenvalues > 0)}")

# Check x^T A x > 0
x = np.random.randn(3)
quadratic_form = x.T @ A_pd @ x
print(f"x^T A x = {quadratic_form}")
print(f"Positive? {quadratic_form > 0}")

# Positive semidefinite (PSD) example
A_psd = np.array([[1, 1],
                  [1, 1]])
eigenvalues_psd = np.linalg.eigvalsh(A_psd)
print(f"PSD eigenvalues: {eigenvalues_psd}")  # [0, 2]
print(f"All non-negative? {np.all(eigenvalues_psd >= 0)}")

2.3 ๋Œ€์นญ ํ–‰๋ ฌ์˜ ๊ธฐํ•˜ํ•™์  ํ•ด์„

๋Œ€์นญ ํ–‰๋ ฌ์€ ํƒ€์›์ฒด(ellipsoid)๋ฅผ ์ •์˜ํ•ฉ๋‹ˆ๋‹ค. ๊ณ ์œ ๋ฒกํ„ฐ๋Š” ์ฃผ์ถ•(principal axes)์˜ ๋ฐฉํ–ฅ์ด๊ณ , ๊ณ ์œ ๊ฐ’์€ ๊ฐ ์ถ•์˜ ๊ธธ์ด์— ๋Œ€์‘๋ฉ๋‹ˆ๋‹ค.

# 2D ellipse visualization
A = np.array([[3, 1],
              [1, 2]])

eigenvalues, eigenvectors = np.linalg.eigh(A)

# ๋‹จ์œ„์›์„ A๋กœ ๋ณ€ํ™˜
theta = np.linspace(0, 2*np.pi, 100)
unit_circle = np.array([np.cos(theta), np.sin(theta)])
ellipse = A @ unit_circle

# ์‹œ๊ฐํ™”
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 6))

# Unit circle
ax1.plot(unit_circle[0], unit_circle[1], 'b-', label='Unit circle')
ax1.set_aspect('equal')
ax1.grid(True)
ax1.legend()
ax1.set_title('Before transformation')

# Ellipse
ax2.plot(ellipse[0], ellipse[1], 'r-', label='Ellipse')
# Draw the eigenvectors
for i in range(2):
    scale = np.sqrt(eigenvalues[i])
    v = eigenvectors[:, i] * scale
    ax2.arrow(0, 0, v[0], v[1], head_width=0.2, head_length=0.2,
              fc=f'C{i}', ec=f'C{i}', label=f'√λ{i+1} * v{i+1}')
ax2.set_aspect('equal')
ax2.grid(True)
ax2.legend()
ax2.set_title('After transformation by A')

plt.tight_layout()
plt.show()

3. ํŠน์ด๊ฐ’ ๋ถ„ํ•ด (Singular Value Decomposition, SVD)

3.1 SVD์˜ ์ •์˜

๋ชจ๋“  $m \times n$ ํ–‰๋ ฌ $A$๋Š” ๋‹ค์Œ๊ณผ ๊ฐ™์ด ๋ถ„ํ•ด๋ฉ๋‹ˆ๋‹ค:

$$A = U\Sigma V^T$$

์—ฌ๊ธฐ์„œ: - $U$: $m \times m$ ์ง๊ต ํ–‰๋ ฌ (left singular vectors) - $\Sigma$: $m \times n$ ๋Œ€๊ฐ ํ–‰๋ ฌ (singular values, $\sigma_1 \geq \sigma_2 \geq \cdots \geq 0$) - $V$: $n \times n$ ์ง๊ต ํ–‰๋ ฌ (right singular vectors)

# Computing the SVD
A = np.array([[3, 1, 1],
              [-1, 3, 1]])

U, sigma, VT = np.linalg.svd(A, full_matrices=True)

print(f"A shape: {A.shape}")
print(f"U shape: {U.shape}")
print(f"sigma: {sigma}")  # 1D ๋ฐฐ์—ด
print(f"V^T shape: {VT.shape}")

# Rebuild the m x n Σ matrix from the 1D singular values
Sigma = np.zeros_like(A, dtype=float)
Sigma[:min(A.shape), :min(A.shape)] = np.diag(sigma)

print(f"Sigma shape: {Sigma.shape}")
print(f"Sigma:\n{Sigma}")

# ์žฌ๊ตฌ์„ฑ ํ™•์ธ
A_reconstructed = U @ Sigma @ VT
print(f"Reconstruction error: {np.linalg.norm(A - A_reconstructed)}")

3.2 SVD์˜ ๊ธฐํ•˜ํ•™์  ์˜๋ฏธ

SVD๋Š” ๋ชจ๋“  ์„ ํ˜• ๋ณ€ํ™˜์„ ์„ธ ๋‹จ๊ณ„๋กœ ๋ถ„ํ•ดํ•ฉ๋‹ˆ๋‹ค: 1. $V^T$: ํšŒ์ „/๋ฐ˜์‚ฌ 2. $\Sigma$: ์ถ• ๋ฐฉํ–ฅ ์Šค์ผ€์ผ๋ง 3. $U$: ํšŒ์ „/๋ฐ˜์‚ฌ

  V^T          ฮฃ           U
 โ”€โ”€โ”€โ”€โ”€โ”€โ”€    โ”€โ”€โ”€โ”€โ”€โ”€โ”€    โ”€โ”€โ”€โ”€โ”€โ”€โ”€
 Rotate  โ†’  Scale   โ†’  Rotate
# SVD์˜ ๊ธฐํ•˜ํ•™์  ํ•ด์„
A = np.array([[3, 1],
              [1, 2]])

U, sigma, VT = np.linalg.svd(A)

# Unit circle
theta = np.linspace(0, 2*np.pi, 100)
circle = np.array([np.cos(theta), np.sin(theta)])

# ๊ฐ ๋‹จ๊ณ„ ์ ์šฉ
step1 = VT @ circle  # ์ฒซ ๋ฒˆ์งธ ํšŒ์ „
Sigma_2d = np.diag(sigma)
step2 = Sigma_2d @ step1  # ์Šค์ผ€์ผ๋ง
step3 = U @ step2  # ๋‘ ๋ฒˆ์งธ ํšŒ์ „

# Final result (A applied in one step)
final = A @ circle

# ์‹œ๊ฐํ™”
fig, axes = plt.subplots(2, 3, figsize=(15, 10))

axes[0, 0].plot(circle[0], circle[1], 'b-')
axes[0, 0].set_title('1. Original circle')
axes[0, 0].set_aspect('equal')
axes[0, 0].grid(True)

axes[0, 1].plot(step1[0], step1[1], 'g-')
axes[0, 1].set_title('2. After V^T (rotation)')
axes[0, 1].set_aspect('equal')
axes[0, 1].grid(True)

axes[0, 2].plot(step2[0], step2[1], 'm-')
axes[0, 2].set_title('3. After Σ (scaling)')
axes[0, 2].set_aspect('equal')
axes[0, 2].grid(True)

axes[1, 0].plot(step3[0], step3[1], 'r-')
axes[1, 0].set_title('4. After U (rotation)')
axes[1, 0].set_aspect('equal')
axes[1, 0].grid(True)

axes[1, 1].plot(final[0], final[1], 'k-', linewidth=2)
axes[1, 1].set_title('5. Final: A @ circle')
axes[1, 1].set_aspect('equal')
axes[1, 1].grid(True)

# Verification
axes[1, 2].plot(step3[0], step3[1], 'r-', alpha=0.5, label='SVD steps')
axes[1, 2].plot(final[0], final[1], 'k--', label='Direct')
axes[1, 2].set_title('Verification')
axes[1, 2].set_aspect('equal')
axes[1, 2].grid(True)
axes[1, 2].legend()

plt.tight_layout()
plt.show()

3.3 Relation Between SVD and Eigendecomposition

The SVD can be obtained from the eigendecompositions of $A^TA$ and $AA^T$:

$$A^TA = V\Sigma^T\Sigma V^T = V\Sigma^2 V^T$$ $$AA^T = U\Sigma\Sigma^T U^T = U\Sigma^2 U^T$$

In other words:

  • the columns of $V$ are eigenvectors of $A^TA$
  • the columns of $U$ are eigenvectors of $AA^T$
  • the $\sigma_i^2$ are the eigenvalues of $A^TA$ (equivalently, of $AA^T$)

# Relation between SVD and eigendecomposition
A = np.array([[3, 1, 1],
              [-1, 3, 1]])

# Method 1: direct SVD
U_svd, sigma_svd, VT_svd = np.linalg.svd(A)

# Method 2: eigendecomposition of A^T A
ATA = A.T @ A
eigenvalues_ATA, V_eigen = np.linalg.eigh(ATA)
# ๊ณ ์œ ๊ฐ’์„ ๋‚ด๋ฆผ์ฐจ์ˆœ์œผ๋กœ ์ •๋ ฌ
idx = eigenvalues_ATA.argsort()[::-1]
eigenvalues_ATA = eigenvalues_ATA[idx]
V_eigen = V_eigen[:, idx]

sigma_from_eigen = np.sqrt(eigenvalues_ATA)

print(f"Singular values (SVD): {sigma_svd}")
print(f"sqrt(eigenvalues of A^T A): {sigma_from_eigen}")
print(f"V from SVD:\n{VT_svd.T}")
print(f"V from eigendecomposition:\n{V_eigen}")

3.4 ์ €๋žญํฌ ๊ทผ์‚ฌ (Low-Rank Approximation)

Eckart-Young-Mirsky ์ •๋ฆฌ: ๋žญํฌ $k$ ๊ทผ์‚ฌ ์ค‘ Frobenius ๋…ธ๋ฆ„ ์˜๋ฏธ์—์„œ ์ตœ์„ ์˜ ๊ทผ์‚ฌ๋Š”:

$$A_k = \sum_{i=1}^k \sigma_i \mathbf{u}_i \mathbf{v}_i^T = U_k \Sigma_k V_k^T$$
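The theorem also pins down the error: $\|A - A_k\|_F = \sqrt{\sum_{i > k} \sigma_i^2}$, the norm of the discarded singular values. A quick numerical check of this identity on a random matrix (a small sketch, separate from the image example below):

rng = np.random.default_rng(0)
B = rng.standard_normal((6, 4))
U_b, sigma_b, VT_b = np.linalg.svd(B, full_matrices=False)

k = 2
B_k = U_b[:, :k] @ np.diag(sigma_b[:k]) @ VT_b[:k, :]  # best rank-2 approximation
err_direct = np.linalg.norm(B - B_k, 'fro')
err_theory = np.sqrt(np.sum(sigma_b[k:]**2))  # discarded singular values
print(err_direct, err_theory)  # agree up to rounding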

# ์ €๋žญํฌ ๊ทผ์‚ฌ ์˜ˆ์ œ: ์ด๋ฏธ์ง€ ์••์ถ•
from sklearn.datasets import load_sample_image
import matplotlib.pyplot as plt

# ์ƒ˜ํ”Œ ์ด๋ฏธ์ง€ ๋กœ๋“œ
china = load_sample_image("china.jpg")
# ๊ทธ๋ ˆ์ด์Šค์ผ€์ผ๋กœ ๋ณ€ํ™˜
china_gray = china.mean(axis=2)

# SVD
U, sigma, VT = np.linalg.svd(china_gray, full_matrices=False)

# ๋‹ค์–‘ํ•œ ๋žญํฌ๋กœ ๊ทผ์‚ฌ
ranks = [5, 20, 50, 100]
fig, axes = plt.subplots(2, 3, figsize=(15, 10))
axes = axes.ravel()

axes[0].imshow(china_gray, cmap='gray')
axes[0].set_title(f'Original (rank={np.linalg.matrix_rank(china_gray)})')
axes[0].axis('off')

for i, k in enumerate(ranks, 1):
    # ๋žญํฌ k ๊ทผ์‚ฌ
    A_k = U[:, :k] @ np.diag(sigma[:k]) @ VT[:k, :]

    axes[i].imshow(A_k, cmap='gray')

    # Compression ratio: storing U_k, sigma_k, V_k^T costs k(m + n + 1) numbers
    original_size = china_gray.size
    compressed_size = k * (U.shape[0] + VT.shape[1] + 1)
    compression_ratio = original_size / compressed_size

    axes[i].set_title(f'Rank {k} (compression: {compression_ratio:.1f}x)')
    axes[i].axis('off')

# ํŠน์ด๊ฐ’ ๋ถ„ํฌ
axes[-1].plot(sigma, 'b-')
axes[-1].set_xlabel('Index')
axes[-1].set_ylabel('Singular value')
axes[-1].set_title('Singular value spectrum')
axes[-1].set_yscale('log')
axes[-1].grid(True)

plt.tight_layout()
plt.show()

4. PCA: From SVD to Principal Component Analysis

4.1 Covariance Matrix

For a data matrix $X$ ($n$ samples × $d$ features), the covariance matrix is

$$C = \frac{1}{n-1}X^TX \quad \text{(assuming the data are centered)}$$

The covariance matrix is symmetric and positive semidefinite.

# Covariance matrix
np.random.seed(42)
X = np.random.randn(100, 3)

# ๋ฐ์ดํ„ฐ ์ค‘์‹ฌํ™”
X_centered = X - X.mean(axis=0)

# Covariance matrix (two ways)
cov_manual = (X_centered.T @ X_centered) / (len(X) - 1)
cov_numpy = np.cov(X.T)

print(f"Manual covariance:\n{cov_manual}")
print(f"NumPy covariance:\n{cov_numpy}")
print(f"Equal? {np.allclose(cov_manual, cov_numpy)}")

# ๋Œ€์นญ์„ฑ ํ™•์ธ
print(f"Is symmetric? {np.allclose(cov_manual, cov_manual.T)}")

4.2 PCA์˜ ์ˆ˜ํ•™์  ์œ ๋„

๋ชฉํ‘œ: ๋ฐ์ดํ„ฐ์˜ ๋ถ„์‚ฐ์„ ์ตœ๋Œ€ํ™”ํ•˜๋Š” ๋ฐฉํ–ฅ(์ฃผ์„ฑ๋ถ„)์„ ์ฐพ๊ธฐ

์ฒซ ๋ฒˆ์งธ ์ฃผ์„ฑ๋ถ„ $\mathbf{w}_1$์€ ๋‹ค์Œ ๋ฌธ์ œ์˜ ํ•ด:

$$\max_{\mathbf{w}} \mathbf{w}^T C \mathbf{w} \quad \text{subject to } \|\mathbf{w}\| = 1$$

ํ•ด: ๊ณต๋ถ„์‚ฐ ํ–‰๋ ฌ $C$์˜ ๊ฐ€์žฅ ํฐ ๊ณ ์œ ๊ฐ’์— ๋Œ€์‘ํ•˜๋Š” ๊ณ ์œ ๋ฒกํ„ฐ

$k$๋ฒˆ์งธ ์ฃผ์„ฑ๋ถ„์€ ์ด์ „ $k-1$๊ฐœ์— ์ง๊ตํ•˜๋ฉด์„œ ๋ถ„์‚ฐ์„ ์ตœ๋Œ€ํ™”ํ•˜๋Š” ๋ฐฉํ–ฅ์ž…๋‹ˆ๋‹ค.

# PCA ์ˆ˜๋™ ๊ตฌํ˜„
from sklearn.datasets import load_iris

iris = load_iris()
X = iris.data

# 1. ๋ฐ์ดํ„ฐ ์ค‘์‹ฌํ™”
X_centered = X - X.mean(axis=0)

# 2. Covariance matrix
cov = np.cov(X_centered.T)

# 3. Eigendecomposition
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# 4. Sort in descending order
idx = eigenvalues.argsort()[::-1]
eigenvalues = eigenvalues[idx]
eigenvectors = eigenvectors[:, idx]

print(f"Eigenvalues (variances): {eigenvalues}")
print(f"Principal components (eigenvectors):\n{eigenvectors}")

# 5. Transform (project onto the first two principal components)
k = 2
X_pca = X_centered @ eigenvectors[:, :k]

print(f"Transformed data shape: {X_pca.shape}")

4.3 PCA via SVD

The covariance matrix never has to be formed explicitly; the SVD of the centered data matrix is enough:

$$X = U\Sigma V^T \quad \Rightarrow \quad X^TX = V\Sigma^2 V^T$$

Therefore the columns of $V$ are the principal components, and $\sigma_i^2/(n-1)$ are the corresponding variances.

# SVD๋ฅผ ์ด์šฉํ•œ PCA
X_centered = X - X.mean(axis=0)

# SVD
U, sigma, VT = np.linalg.svd(X_centered, full_matrices=False)

# Principal components = columns of V
principal_components = VT.T
print(f"Principal components (from SVD):\n{principal_components}")

# ์„ค๋ช…๋œ ๋ถ„์‚ฐ
explained_variance = (sigma**2) / (len(X) - 1)
print(f"Explained variance: {explained_variance}")

# Explained variance ratio
explained_variance_ratio = explained_variance / explained_variance.sum()
print(f"Explained variance ratio: {explained_variance_ratio}")

# Transform
X_pca_svd = X_centered @ principal_components[:, :2]
print(f"Same as eigendecomposition? {np.allclose(np.abs(X_pca), np.abs(X_pca_svd))}")

4.4 PCA Visualization and Interpretation

from sklearn.decomposition import PCA as SklearnPCA

# scikit-learn PCA
pca = SklearnPCA(n_components=2)
X_sklearn = pca.fit_transform(X)

# ์‹œ๊ฐํ™”
fig, axes = plt.subplots(1, 2, figsize=(14, 6))

# ๋ณ€ํ™˜๋œ ๋ฐ์ดํ„ฐ
for i, target_name in enumerate(iris.target_names):
    mask = iris.target == i
    axes[0].scatter(X_sklearn[mask, 0], X_sklearn[mask, 1],
                   label=target_name, alpha=0.7)
axes[0].set_xlabel(f'PC1 ({pca.explained_variance_ratio_[0]:.2%})')
axes[0].set_ylabel(f'PC2 ({pca.explained_variance_ratio_[1]:.2%})')
axes[0].set_title('PCA of Iris Dataset')
axes[0].legend()
axes[0].grid(True)

# ์„ค๋ช…๋œ ๋ถ„์‚ฐ
axes[1].bar(range(1, len(pca.explained_variance_ratio_)+1),
           pca.explained_variance_ratio_)
axes[1].set_xlabel('Principal Component')
axes[1].set_ylabel('Explained Variance Ratio')
axes[1].set_title('Scree Plot')
axes[1].grid(True)

plt.tight_layout()
plt.show()

# Principal-component loadings (correlations with the original features)
loadings = pca.components_.T * np.sqrt(pca.explained_variance_)
print("\nFeature loadings:")
for i, feature in enumerate(iris.feature_names):
    print(f"{feature:20s}: PC1={loadings[i,0]:6.3f}, PC2={loadings[i,1]:6.3f}")

4.5 Dimensionality Reduction and Reconstruction

PCA compresses data into a lower-dimensional space and can approximately reconstruct the original from it.

# ์ฐจ์› ์ถ•์†Œ์™€ ์žฌ๊ตฌ์„ฑ
pca_full = SklearnPCA(n_components=4)
X_transformed = pca_full.fit_transform(X)

# Reconstruct with varying numbers of components
fig, axes = plt.subplots(2, 2, figsize=(12, 10))
axes = axes.ravel()

for idx, n_components in enumerate([1, 2, 3, 4]):
    # ์ฒ˜์Œ n๊ฐœ ์„ฑ๋ถ„๋งŒ ์‚ฌ์šฉ
    X_reduced = X_transformed[:, :n_components]

    # Reconstruct (zero-pad the dropped components)
    X_reconstructed = pca_full.inverse_transform(
        np.column_stack([X_reduced,
                        np.zeros((len(X), 4-n_components))])
    )

    # Reconstruction error
    reconstruction_error = np.mean((X - X_reconstructed)**2)
    explained_var = pca_full.explained_variance_ratio_[:n_components].sum()

    # Visualize only the first two features
    axes[idx].scatter(X[:, 0], X[:, 1], alpha=0.3, label='Original')
    axes[idx].scatter(X_reconstructed[:, 0], X_reconstructed[:, 1],
                     alpha=0.3, label='Reconstructed')
    axes[idx].set_xlabel('Feature 1')
    axes[idx].set_ylabel('Feature 2')
    axes[idx].set_title(f'{n_components} PCs (Var: {explained_var:.2%}, MSE: {reconstruction_error:.3f})')
    axes[idx].legend()
    axes[idx].grid(True)

plt.tight_layout()
plt.show()

5. Other Matrix Decompositions

5.1 LU Decomposition

A square matrix $A$ is factored into a lower triangular matrix $L$ and an upper triangular matrix $U$:

$$A = LU$$

Uses: solving linear systems $Ax = b$ and computing determinants (a determinant sketch follows the code below).

from scipy.linalg import lu

# LU decomposition
A = np.array([[2, 1, 1],
              [4, -6, 0],
              [-2, 7, 2]])

P, L, U = lu(A)

print(f"P (permutation matrix):\n{P}")
print(f"L (lower triangular):\n{L}")
print(f"U (upper triangular):\n{U}")

# Reconstruction
A_reconstructed = P @ L @ U
print(f"Reconstruction error: {np.linalg.norm(A - A_reconstructed)}")

# Solving a linear system with the LU factors
b = np.array([4, 2, 6])

# Rewrite Ax = b as P L U x = b  (scipy's lu returns A = P @ L @ U)
# 1. Solve Ly = P^T b (forward substitution)
# 2. Solve Ux = y (back substitution)

from scipy.linalg import solve_triangular

y = solve_triangular(L, P.T @ b, lower=True)  # P^(-1) = P^T for a permutation matrix
x = solve_triangular(U, y, lower=False)

print(f"Solution: {x}")
print(f"Verification: Ax = {A @ x}")

5.2 QR Decomposition

A matrix $A$ is factored into an orthogonal matrix $Q$ and an upper triangular matrix $R$:

$$A = QR$$

Uses: least-squares problems, Gram-Schmidt orthogonalization (a Gram-Schmidt sketch follows the code below).

# QR decomposition
A = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1]])

Q, R = np.linalg.qr(A)

print(f"Q (orthogonal):\n{Q}")
print(f"R (upper triangular):\n{R}")

# ์ง๊ต์„ฑ ํ™•์ธ
print(f"Q^T Q:\n{Q.T @ Q}")

# Reconstruction
A_reconstructed = Q @ R
print(f"Reconstruction error: {np.linalg.norm(A - A_reconstructed)}")

# ์ตœ์†Œ์ž์Šน ๋ฌธ์ œ ํ’€๊ธฐ: Ax = b (overdetermined)
A_over = np.array([[1, 1],
                   [1, 2],
                   [1, 3],
                   [1, 4]])
b = np.array([2, 3, 5, 6])

Q, R = np.linalg.qr(A_over)  # reduced mode by default: Q is 4x2, R is 2x2

# Least-squares solution: x = R^(-1) Q^T b
x = np.linalg.solve(R, Q.T @ b)
print(f"Least squares solution: {x}")

5.3 Cholesky Decomposition

A symmetric positive definite matrix $A$ is factored into a lower triangular matrix $L$ times its transpose:

$$A = LL^T$$

Uses: sampling from Gaussians, numerically stable computation (an SPD solver sketch follows the code below).

# Cholesky decomposition
A_pd = np.array([[4, 2, 2],
                 [2, 5, 1],
                 [2, 1, 6]])

# ์–‘์ •์น˜ ํ™•์ธ
eigenvalues = np.linalg.eigvalsh(A_pd)
print(f"Eigenvalues: {eigenvalues}")
print(f"Is positive definite? {np.all(eigenvalues > 0)}")

# Cholesky factor
L = np.linalg.cholesky(A_pd)
print(f"L:\n{L}")

# Reconstruction
A_reconstructed = L @ L.T
print(f"Reconstruction error: {np.linalg.norm(A_pd - A_reconstructed)}")

# Application: sampling from a multivariate Gaussian
# To draw from N(μ, Σ): compute μ + L @ z with z ~ N(0, I)
mu = np.array([1, 2, 3])
Sigma = A_pd
L = np.linalg.cholesky(Sigma)

# Draw standard-normal samples
n_samples = 1000
Z = np.random.randn(n_samples, 3)

# Transform
samples = mu + (L @ Z.T).T

# Verification
print(f"Sample mean: {samples.mean(axis=0)}")
print(f"True mean: {mu}")
print(f"Sample covariance:\n{np.cov(samples.T)}")
print(f"True covariance:\n{Sigma}")

5.4 Comparison of the Decompositions

| Decomposition      | Applies to        | Cost      | Typical uses                                     |
|--------------------|-------------------|-----------|--------------------------------------------------|
| Eigendecomposition | square matrices   | $O(n^3)$  | matrix powers, dynamical systems                 |
| SVD                | any matrix        | $O(mn^2)$ | low-rank approximation, PCA, recommender systems |
| LU                 | square matrices   | $O(n^3)$  | linear systems, determinants                     |
| QR                 | any matrix        | $O(mn^2)$ | least squares, orthogonalization                 |
| Cholesky           | positive definite | $O(n^3)$  | sampling, numerical stability                    |
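A rough timing sketch for the table (absolute numbers depend heavily on your BLAS/LAPACK build; only the relative ordering is meaningful):

# Rough wall-clock comparison of the decompositions on a 500x500 matrix
import time

n = 500
M = np.random.randn(n, n)
M_spd = M @ M.T + n * np.eye(n)  # shifted to guarantee positive definiteness

for name, fn in [("eig", lambda: np.linalg.eig(M)),
                 ("svd", lambda: np.linalg.svd(M)),
                 ("lu", lambda: lu(M)),           # scipy.linalg.lu, imported above
                 ("qr", lambda: np.linalg.qr(M)),
                 ("cholesky", lambda: np.linalg.cholesky(M_spd))]:
    t0 = time.perf_counter()
    fn()
    print(f"{name:9s}: {time.perf_counter() - t0:.3f} s")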

6. ML์—์„œ์˜ ํ–‰๋ ฌ ๋ถ„ํ•ด ์‘์šฉ

6.1 ์ถ”์ฒœ ์‹œ์Šคํ…œ: Matrix Factorization

์‚ฌ์šฉ์ž-์•„์ดํ…œ ํ‰์  ํ–‰๋ ฌ $R$์„ ์ €๋žญํฌ๋กœ ๊ทผ์‚ฌ:

$$R \approx UV^T$$

# A simple recommender system
from sklearn.decomposition import TruncatedSVD

# ์‚ฌ์šฉ์ž-์˜ํ™” ํ‰์  ํ–‰๋ ฌ (5 users ร— 6 movies)
R = np.array([
    [5, 3, 0, 1, 0, 0],
    [4, 0, 0, 1, 0, 0],
    [1, 1, 0, 5, 0, 0],
    [0, 0, 0, 4, 4, 5],
    [0, 0, 5, 0, 4, 4]
])

# 0์€ ๊ฒฐ์ธก๊ฐ’
mask = R > 0

# SVD (๋žญํฌ 2 ๊ทผ์‚ฌ)
svd = TruncatedSVD(n_components=2)
U = svd.fit_transform(R)
VT = svd.components_

# Reconstruction
R_approx = U @ VT

print("Original ratings:")
print(R)
print("\nApproximated ratings:")
print(R_approx)

# Predict the missing entries
print("\nPredicted ratings for missing values:")
for i in range(R.shape[0]):
    for j in range(R.shape[1]):
        if R[i, j] == 0:
            print(f"User {i}, Movie {j}: {R_approx[i, j]:.2f}")

6.2 Image Compression

Section 3.4 already used the SVD to compress a grayscale image; for color images, the same idea applies to each color channel independently, as the sketch below shows.

6.3 Spectral Clustering

Clustering with the eigenvectors of the graph Laplacian matrix:

from sklearn.cluster import SpectralClustering
from sklearn.datasets import make_moons

# Data that is separable only nonlinearly
X, y = make_moons(n_samples=200, noise=0.05, random_state=42)

# Spectral clustering
spectral = SpectralClustering(n_clusters=2, affinity='nearest_neighbors',
                               random_state=42)
labels = spectral.fit_predict(X)

# ์‹œ๊ฐํ™”
plt.figure(figsize=(10, 5))

plt.subplot(1, 2, 1)
plt.scatter(X[:, 0], X[:, 1], c=y, cmap='viridis')
plt.title('True labels')

plt.subplot(1, 2, 2)
plt.scatter(X[:, 0], X[:, 1], c=labels, cmap='viridis')
plt.title('Spectral clustering')

plt.tight_layout()
plt.show()
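The sklearn call hides the decomposition. Under the hood, spectral clustering builds an affinity graph, forms a graph Laplacian, and runs k-means on its smallest eigenvectors. A minimal sketch with the unnormalized Laplacian $L = D - W$ (sklearn's SpectralClustering actually uses a normalized variant, so results may differ slightly):

# Manual spectral embedding: k-NN graph -> Laplacian -> eigenvectors -> k-means
from sklearn.neighbors import kneighbors_graph
from sklearn.cluster import KMeans

W = kneighbors_graph(X, n_neighbors=10, include_self=False).toarray()
W = np.maximum(W, W.T)            # symmetrize the adjacency matrix
D = np.diag(W.sum(axis=1))
L_graph = D - W                   # unnormalized graph Laplacian

vals, vecs = np.linalg.eigh(L_graph)
embedding = vecs[:, :2]           # eigenvectors of the two smallest eigenvalues
labels_manual = KMeans(n_clusters=2, n_init=10, random_state=42).fit_predict(embedding)
print(f"Cluster sizes: {np.bincount(labels_manual)}")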

6.4 Feature Extraction: Kernel PCA

Combining the kernel trick with PCA gives nonlinear dimensionality reduction:

from sklearn.decomposition import KernelPCA

# ์›ํ˜• ๋ฐ์ดํ„ฐ
theta = np.linspace(0, 2*np.pi, 100)
r = 1 + 0.3 * np.random.randn(100)
X_circle = np.column_stack([r * np.cos(theta), r * np.sin(theta)])

# Linear PCA
pca_linear = SklearnPCA(n_components=1)
X_pca_linear = pca_linear.fit_transform(X_circle)

# Kernel PCA (RBF kernel)
kpca = KernelPCA(n_components=1, kernel='rbf', gamma=10)
X_kpca = kpca.fit_transform(X_circle)

# ์‹œ๊ฐํ™”
fig, axes = plt.subplots(1, 3, figsize=(15, 5))

axes[0].scatter(X_circle[:, 0], X_circle[:, 1], c=theta, cmap='viridis')
axes[0].set_title('Original data')
axes[0].set_aspect('equal')

axes[1].scatter(X_pca_linear, np.zeros_like(X_pca_linear),
               c=theta, cmap='viridis')
axes[1].set_title('Linear PCA')
axes[1].set_ylim(-0.5, 0.5)

axes[2].scatter(X_kpca, np.zeros_like(X_kpca), c=theta, cmap='viridis')
axes[2].set_title('Kernel PCA')
axes[2].set_ylim(-0.5, 0.5)

plt.tight_layout()
plt.show()

Exercises

Problem 1: Eigendecomposition

Compute the eigenvalues and eigenvectors of the following matrix by hand, then verify your answer with NumPy:

$$A = \begin{bmatrix} 5 & 2 \\ 2 & 2 \end{bmatrix}$$

Determine whether this matrix is positive definite, and compute $A^{10}$ using the eigendecomposition.

Problem 2: SVD and Low-Rank Approximation

Generate a $4 \times 3$ matrix and compute its SVD. Form the rank-1 and rank-2 approximations and compute the Frobenius-norm error of each.

Problem 3: Implementing PCA

For the iris dataset:

  1. Perform PCA via eigendecomposition of the covariance matrix.
  2. Perform PCA via SVD.
  3. Verify that the two methods give the same result.
  4. What is the minimum number of components whose cumulative explained variance reaches 95%?

Problem 4: Cholesky Decomposition and Sampling

Generate 1000 samples from the 2D Gaussian distribution with covariance $\Sigma = \begin{bmatrix} 4 & 2 \\ 2 & 3 \end{bmatrix}$ and mean $\mu = [0, 0]^T$. Use the Cholesky decomposition, and check that the empirical covariance of the samples is close to $\Sigma$.

Problem 5: Recommender Systems

Predict the missing entries of the following user-movie rating matrix with SVD:

$$R = \begin{bmatrix} 5 & ? & 4 & ? \\ ? & 3 & ? & 2 \\ 4 & ? & 5 & ? \\ ? & 2 & ? & 3 \end{bmatrix}$$

Use a rank-2 approximation and check that the predicted ratings fall in the range 1-5.

์ฐธ๊ณ  ์ž๋ฃŒ

๊ต์žฌ

  1. Strang, G. (2016). Introduction to Linear Algebra. Chapter 6: Eigenvalues and Eigenvectors.
  2. Trefethen, L. N., & Bau III, D. (1997). Numerical Linear Algebra. SIAM.
  3. SVD์™€ ์ˆ˜์น˜ ์•ˆ์ •์„ฑ์— ๋Œ€ํ•œ ๊นŠ์ด ์žˆ๋Š” ๋…ผ์˜
  4. Golub, G. H., & Van Loan, C. F. (2013). Matrix Computations (4th ed.). Johns Hopkins.
  5. ํ–‰๋ ฌ ๋ถ„ํ•ด ์•Œ๊ณ ๋ฆฌ์ฆ˜์˜ ๋ฐ”์ด๋ธ”

Papers

  1. Eckart, C., & Young, G. (1936). "The approximation of one matrix by another of lower rank." Psychometrika.
  2. Jolliffe, I. T. (2002). Principal Component Analysis (2nd ed.). Springer.

์˜จ๋ผ์ธ ์ž๋ฃŒ

  1. 3Blue1Brown - Eigenvectors and eigenvalues: https://www.youtube.com/watch?v=PFDu9oVAE-g
  2. Stanford CS229 - Linear Algebra Review: http://cs229.stanford.edu/section/cs229-linalg.pdf
  3. scikit-learn PCA Guide: https://scikit-learn.org/stable/modules/decomposition.html

์‹ค์Šต ๋„๊ตฌ

  1. NumPy Linear Algebra: https://numpy.org/doc/stable/reference/routines.linalg.html
  2. SciPy Linear Algebra: https://docs.scipy.org/doc/scipy/reference/linalg.html

๋‹ค์Œ ๋ ˆ์Šจ: 03. ํ–‰๋ ฌ ๋ฏธ์ ๋ถ„์—์„œ ์—ญ์ „ํŒŒ์˜ ์ˆ˜ํ•™์  ๊ธฐ๋ฐ˜์„ ๋ฐฐ์›๋‹ˆ๋‹ค.
