02. Matrix Decompositions¶
Learning Objectives¶
- Understand the concepts of eigenvalues and eigenvectors and compute them
- Explain the spectral theorem for symmetric matrices and the properties of positive definite matrices
- Understand the geometric meaning of the singular value decomposition (SVD) and implement it
- Explain low-rank approximation via SVD and the mathematical principles of PCA
- Understand the uses and computation of the LU, QR, and Cholesky decompositions
- Give concrete examples of how matrix decompositions are used in machine learning
1. Eigenvalues and Eigenvectors¶
1.1 Definition and Meaning¶
An eigenvector $\mathbf{v}$ of a matrix $A$ is a nonzero vector satisfying:
$$A\mathbf{v} = \lambda\mathbf{v}$$
where $\lambda$ is the corresponding eigenvalue.
Geometric interpretation: an eigenvector marks a special direction that the linear transformation $A$ does not rotate; vectors along it are only scaled by the factor $\lambda$.
import numpy as np
import matplotlib.pyplot as plt
# Eigenvalues and eigenvectors of a 2x2 matrix
A = np.array([[3, 1],
              [0, 2]])
eigenvalues, eigenvectors = np.linalg.eig(A)
print(f"Eigenvalues: {eigenvalues}")
print(f"Eigenvectors:\n{eigenvectors}")
# Verify: Av = λv for each eigenpair
for i in range(len(eigenvalues)):
    v = eigenvectors[:, i]
    lam = eigenvalues[i]
    Av = A @ v
    lam_v = lam * v
    print(f"\nEigenvector {i+1}:")
    print(f"Av = {Av}")
    print(f"λv = {lam_v}")
    print(f"Equal? {np.allclose(Av, lam_v)}")
1.2 The Characteristic Equation¶
Eigenvalues are the roots of the characteristic equation:
$$\det(A - \lambda I) = 0$$
This is a degree-$n$ polynomial in $\lambda$, so $A$ has $n$ eigenvalues counting multiplicity (some may be complex).
# Computing the characteristic equation by hand (2x2 example)
A = np.array([[4, 2],
              [1, 3]])
# det(A - λI) = 0
# (4-λ)(3-λ) - 2*1 = 0
# λ² - 7λ + 10 = 0
# (λ-5)(λ-2) = 0
eigenvalues = np.linalg.eigvals(A)
print(f"Eigenvalues: {eigenvalues}")  # [5, 2]
# Verify that each root satisfies the characteristic equation
for lam in eigenvalues:
    det = np.linalg.det(A - lam * np.eye(2))
    print(f"det(A - {lam}I) = {det:.10f}")  # ~0
1.3 Eigendecomposition¶
If an $n \times n$ matrix $A$ has $n$ linearly independent eigenvectors, then:
$$A = V\Lambda V^{-1}$$
where:
- $V$: the matrix whose columns are the eigenvectors
- $\Lambda$: the diagonal matrix with the eigenvalues on its diagonal
# Eigendecomposition
A = np.array([[3, 1],
              [0, 2]])
eigenvalues, V = np.linalg.eig(A)
Lambda = np.diag(eigenvalues)
print(f"V (eigenvectors):\n{V}")
print(f"Λ (eigenvalues):\n{Lambda}")
# Reconstruct: A = V Λ V^(-1)
A_reconstructed = V @ Lambda @ np.linalg.inv(V)
print(f"A reconstructed:\n{A_reconstructed}")
print(f"Original A:\n{A}")
print(f"Equal? {np.allclose(A, A_reconstructed)}")
1.4 Matrix Powers and Matrix Functions¶
Eigendecomposition makes matrix powers cheap:
$$A^k = V\Lambda^k V^{-1}$$
Matrix functions can be defined the same way:
$$f(A) = Vf(\Lambda)V^{-1}$$
# Matrix powers
A = np.array([[2, 1],
              [1, 2]])
eigenvalues, V = np.linalg.eig(A)
Lambda = np.diag(eigenvalues)
# Compute A^10 in two ways
# Method 1: direct multiplication
A_10_direct = np.linalg.matrix_power(A, 10)
# Method 2: via the eigendecomposition
Lambda_10 = np.diag(eigenvalues**10)
A_10_eigen = V @ Lambda_10 @ np.linalg.inv(V)
print(f"A^10 (direct):\n{A_10_direct}")
print(f"A^10 (eigendecomposition):\n{A_10_eigen.real}")
print(f"Equal? {np.allclose(A_10_direct, A_10_eigen.real)}")
# Matrix exponential
from scipy.linalg import expm
exp_A_scipy = expm(A)
exp_Lambda = np.diag(np.exp(eigenvalues))
exp_A_eigen = V @ exp_Lambda @ np.linalg.inv(V)
print(f"exp(A) match? {np.allclose(exp_A_scipy, exp_A_eigen.real)}")
2. Symmetric Matrices and the Spectral Theorem¶
2.1 The Spectral Theorem¶
For a real symmetric matrix $A = A^T$:
1. All eigenvalues are real.
2. Eigenvectors corresponding to distinct eigenvalues are orthogonal.
3. $A$ is orthogonally diagonalizable: $A = Q\Lambda Q^T$, where $Q$ is an orthogonal matrix.
# Eigendecomposition of a symmetric matrix
A_sym = np.array([[4, 2, 0],
                  [2, 3, 1],
                  [0, 1, 2]])
# Check symmetry
print(f"Is symmetric? {np.allclose(A_sym, A_sym.T)}")
eigenvalues, Q = np.linalg.eigh(A_sym)  # specialized routine for symmetric matrices
Lambda = np.diag(eigenvalues)
print(f"Eigenvalues: {eigenvalues}")
print(f"Q (orthogonal matrix):\n{Q}")
# Check orthogonality
print(f"Q^T Q = I?\n{Q.T @ Q}")
print(f"Is orthogonal? {np.allclose(Q.T @ Q, np.eye(3))}")
# Reconstruct: A = Q Λ Q^T
A_reconstructed = Q @ Lambda @ Q.T
print(f"Reconstruction error: {np.linalg.norm(A_sym - A_reconstructed)}")
2.2 Positive Definite Matrices¶
A symmetric matrix $A$ is positive definite if:
$$\mathbf{x}^T A \mathbf{x} > 0 \quad \forall \mathbf{x} \neq \mathbf{0}$$
Equivalent conditions (the sketch after the example below checks the last two):
- All eigenvalues are positive.
- All leading principal minors are positive.
- A Cholesky decomposition exists.
Positive definite matrices are very important in machine learning (covariance matrices, Hessians, etc.).
# Example of a positive definite matrix
A_pd = np.array([[2, -1, 0],
                 [-1, 2, -1],
                 [0, -1, 2]])
eigenvalues = np.linalg.eigvalsh(A_pd)
print(f"Eigenvalues: {eigenvalues}")
print(f"All positive? {np.all(eigenvalues > 0)}")
# Check x^T A x > 0 for a random x
x = np.random.randn(3)
quadratic_form = x.T @ A_pd @ x
print(f"x^T A x = {quadratic_form}")
print(f"Positive? {quadratic_form > 0}")
# Positive semidefinite example
A_psd = np.array([[1, 1],
                  [1, 1]])
eigenvalues_psd = np.linalg.eigvalsh(A_psd)
print(f"PSD eigenvalues: {eigenvalues_psd}")  # [0, 2]
print(f"All non-negative? {np.all(eigenvalues_psd >= 0)}")
2.3 Geometric Interpretation of Symmetric Matrices¶
A symmetric positive definite matrix maps the unit circle (in general, the unit sphere) to an ellipse (ellipsoid): the eigenvectors give the directions of the principal axes, and the eigenvalues give the corresponding semi-axis lengths.
# Visualizing the 2D ellipse
A = np.array([[3, 1],
              [1, 2]])
eigenvalues, eigenvectors = np.linalg.eigh(A)
# Map the unit circle through A
theta = np.linspace(0, 2*np.pi, 100)
unit_circle = np.array([np.cos(theta), np.sin(theta)])
ellipse = A @ unit_circle
# Plot
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 6))
# Unit circle
ax1.plot(unit_circle[0], unit_circle[1], 'b-', label='Unit circle')
ax1.set_aspect('equal')
ax1.grid(True)
ax1.legend()
ax1.set_title('Before transformation')
# Ellipse
ax2.plot(ellipse[0], ellipse[1], 'r-', label='Ellipse')
# Draw the principal axes: semi-axis i of A @ unit_circle has length λ_i
for i in range(2):
    v = eigenvectors[:, i] * eigenvalues[i]
    ax2.arrow(0, 0, v[0], v[1], head_width=0.2, head_length=0.2,
              fc=f'C{i}', ec=f'C{i}', label=f'λ{i+1} v{i+1}')
ax2.set_aspect('equal')
ax2.grid(True)
ax2.legend()
ax2.set_title('After transformation by A')
plt.tight_layout()
plt.show()
3. Singular Value Decomposition (SVD)¶
3.1 Definition of the SVD¶
Every $m \times n$ matrix $A$ can be factored as:
$$A = U\Sigma V^T$$
where:
- $U$: $m \times m$ orthogonal matrix (left singular vectors)
- $\Sigma$: $m \times n$ rectangular diagonal matrix (singular values, $\sigma_1 \geq \sigma_2 \geq \cdots \geq 0$)
- $V$: $n \times n$ orthogonal matrix (right singular vectors)
# Computing the SVD
A = np.array([[3, 1, 1],
              [-1, 3, 1]])
U, sigma, VT = np.linalg.svd(A, full_matrices=True)
print(f"A shape: {A.shape}")
print(f"U shape: {U.shape}")
print(f"sigma: {sigma}")  # 1D array of singular values
print(f"V^T shape: {VT.shape}")
# Rebuild the m×n Σ matrix
Sigma = np.zeros_like(A, dtype=float)
Sigma[:min(A.shape), :min(A.shape)] = np.diag(sigma)
print(f"Sigma shape: {Sigma.shape}")
print(f"Sigma:\n{Sigma}")
# Check the reconstruction
A_reconstructed = U @ Sigma @ VT
print(f"Reconstruction error: {np.linalg.norm(A - A_reconstructed)}")
3.2 Geometric Meaning of the SVD¶
The SVD splits every linear transformation into three steps:
1. $V^T$: rotation/reflection
2. $\Sigma$: scaling along the coordinate axes
3. $U$: rotation/reflection
V^T         Σ          U
Rotate  →  Scale  →  Rotate
# Geometric interpretation of the SVD
A = np.array([[3, 1],
              [1, 2]])
U, sigma, VT = np.linalg.svd(A)
# Unit circle
theta = np.linspace(0, 2*np.pi, 100)
circle = np.array([np.cos(theta), np.sin(theta)])
# Apply each stage
step1 = VT @ circle  # first rotation
Sigma_2d = np.diag(sigma)
step2 = Sigma_2d @ step1  # scaling
step3 = U @ step2  # second rotation
# Final result (applying A in one shot)
final = A @ circle
# Plot
fig, axes = plt.subplots(2, 3, figsize=(15, 10))
axes[0, 0].plot(circle[0], circle[1], 'b-')
axes[0, 0].set_title('1. Original circle')
axes[0, 0].set_aspect('equal')
axes[0, 0].grid(True)
axes[0, 1].plot(step1[0], step1[1], 'g-')
axes[0, 1].set_title('2. After V^T (rotation)')
axes[0, 1].set_aspect('equal')
axes[0, 1].grid(True)
axes[0, 2].plot(step2[0], step2[1], 'm-')
axes[0, 2].set_title('3. After Σ (scaling)')
axes[0, 2].set_aspect('equal')
axes[0, 2].grid(True)
axes[1, 0].plot(step3[0], step3[1], 'r-')
axes[1, 0].set_title('4. After U (rotation)')
axes[1, 0].set_aspect('equal')
axes[1, 0].grid(True)
axes[1, 1].plot(final[0], final[1], 'k-', linewidth=2)
axes[1, 1].set_title('5. Final: A @ circle')
axes[1, 1].set_aspect('equal')
axes[1, 1].grid(True)
# Verify: overlay the SVD result and the direct transform
axes[1, 2].plot(step3[0], step3[1], 'r-', alpha=0.5, label='SVD steps')
axes[1, 2].plot(final[0], final[1], 'k--', label='Direct')
axes[1, 2].set_title('Verification')
axes[1, 2].set_aspect('equal')
axes[1, 2].grid(True)
axes[1, 2].legend()
plt.tight_layout()
plt.show()
3.3 Relationship Between SVD and Eigendecomposition¶
The SVD can be obtained from the eigendecompositions of $A^TA$ and $AA^T$:
$$A^TA = V\Sigma^T\Sigma V^T = V\Sigma^2 V^T$$
$$AA^T = U\Sigma\Sigma^T U^T = U\Sigma^2 U^T$$
That is:
- $V$ contains the eigenvectors of $A^TA$.
- $U$ contains the eigenvectors of $AA^T$.
- $\sigma_i^2$ are the eigenvalues of $A^TA$ (equivalently, of $AA^T$).
# Relationship between SVD and eigendecomposition
A = np.array([[3, 1, 1],
              [-1, 3, 1]])
# Method 1: direct SVD
U_svd, sigma_svd, VT_svd = np.linalg.svd(A)
# Method 2: eigendecomposition of A^T A
ATA = A.T @ A
eigenvalues_ATA, V_eigen = np.linalg.eigh(ATA)
# Sort eigenvalues in descending order
idx = eigenvalues_ATA.argsort()[::-1]
eigenvalues_ATA = eigenvalues_ATA[idx]
V_eigen = V_eigen[:, idx]
sigma_from_eigen = np.sqrt(np.maximum(eigenvalues_ATA, 0))  # clip round-off negatives
print(f"Singular values (SVD): {sigma_svd}")
print(f"sqrt(eigenvalues of A^T A): {sigma_from_eigen}")
# Note: columns may differ by a sign, which is still a valid SVD
print(f"V from SVD:\n{VT_svd.T}")
print(f"V from eigendecomposition:\n{V_eigen}")
3.4 Low-Rank Approximation¶
Eckart–Young–Mirsky theorem: among all matrices of rank at most $k$, the best approximation to $A$ in the Frobenius norm is:
$$A_k = \sum_{i=1}^k \sigma_i \mathbf{u}_i \mathbf{v}_i^T = U_k \Sigma_k V_k^T$$
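Before the image example, a quick numerical check of the theorem on random data (a sketch): the Frobenius error of the rank-$k$ truncation should equal $\sqrt{\sum_{i>k} \sigma_i^2}$.
# Verify the Eckart-Young error formula on a random matrix
rng = np.random.default_rng(0)
M = rng.standard_normal((6, 4))
Um, sm, VTm = np.linalg.svd(M, full_matrices=False)
for k in [1, 2, 3]:
    M_k = Um[:, :k] @ np.diag(sm[:k]) @ VTm[:k, :]
    error = np.linalg.norm(M - M_k, 'fro')
    formula = np.sqrt(np.sum(sm[k:]**2))
    print(f"k={k}: truncation error = {error:.6f}, formula = {formula:.6f}")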
# Low-rank approximation example: image compression
from sklearn.datasets import load_sample_image
# Load a sample image
china = load_sample_image("china.jpg")
# Convert to grayscale
china_gray = china.mean(axis=2)
# SVD
U, sigma, VT = np.linalg.svd(china_gray, full_matrices=False)
# Approximate with several ranks
ranks = [5, 20, 50, 100]
fig, axes = plt.subplots(2, 3, figsize=(15, 10))
axes = axes.ravel()
axes[0].imshow(china_gray, cmap='gray')
axes[0].set_title(f'Original (rank={np.linalg.matrix_rank(china_gray)})')
axes[0].axis('off')
for i, k in enumerate(ranks, 1):
    # Rank-k approximation
    A_k = U[:, :k] @ np.diag(sigma[:k]) @ VT[:k, :]
    axes[i].imshow(A_k, cmap='gray')
    # Compression ratio: store k*(m + n + 1) numbers instead of m*n
    original_size = china_gray.size
    compressed_size = k * (U.shape[0] + VT.shape[1] + 1)
    compression_ratio = original_size / compressed_size
    axes[i].set_title(f'Rank {k} (compression: {compression_ratio:.1f}x)')
    axes[i].axis('off')
# Singular value spectrum
axes[-1].plot(sigma, 'b-')
axes[-1].set_xlabel('Index')
axes[-1].set_ylabel('Singular value')
axes[-1].set_title('Singular value spectrum')
axes[-1].set_yscale('log')
axes[-1].grid(True)
plt.tight_layout()
plt.show()
4. PCA: From SVD to Principal Component Analysis¶
4.1 The Covariance Matrix¶
For a data matrix $X$ ($n$ samples × $d$ features), the covariance matrix is:
$$C = \frac{1}{n-1}X^TX \quad \text{(assuming the data is centered)}$$
The covariance matrix is symmetric and positive semidefinite.
# Covariance matrix
np.random.seed(42)
X = np.random.randn(100, 3)
# Center the data
X_centered = X - X.mean(axis=0)
# Covariance matrix (two ways)
cov_manual = (X_centered.T @ X_centered) / (len(X) - 1)
cov_numpy = np.cov(X.T)
print(f"Manual covariance:\n{cov_manual}")
print(f"NumPy covariance:\n{cov_numpy}")
print(f"Equal? {np.allclose(cov_manual, cov_numpy)}")
# Check symmetry
print(f"Is symmetric? {np.allclose(cov_manual, cov_manual.T)}")
4.2 Mathematical Derivation of PCA¶
Goal: find the directions (principal components) along which the data has maximal variance.
The first principal component $\mathbf{w}_1$ solves:
$$\max_{\mathbf{w}} \mathbf{w}^T C \mathbf{w} \quad \text{subject to } \|\mathbf{w}\| = 1$$
Solution: the eigenvector corresponding to the largest eigenvalue of the covariance matrix $C$. (Introducing a Lagrange multiplier $\lambda$ for the constraint gives $C\mathbf{w} = \lambda\mathbf{w}$, so every stationary point is an eigenvector, and the attained variance $\mathbf{w}^T C \mathbf{w} = \lambda$ is its eigenvalue.)
The $k$-th principal component maximizes the variance subject to being orthogonal to the previous $k-1$ components.
# PCA by hand
from sklearn.datasets import load_iris
iris = load_iris()
X = iris.data
# 1. Center the data
X_centered = X - X.mean(axis=0)
# 2. Covariance matrix
cov = np.cov(X_centered.T)
# 3. Eigendecomposition
eigenvalues, eigenvectors = np.linalg.eigh(cov)
# 4. Sort in descending order
idx = eigenvalues.argsort()[::-1]
eigenvalues = eigenvalues[idx]
eigenvectors = eigenvectors[:, idx]
print(f"Eigenvalues (variances): {eigenvalues}")
print(f"Principal components (eigenvectors):\n{eigenvectors}")
# 5. Project onto the first two principal components
k = 2
X_pca = X_centered @ eigenvectors[:, :k]
print(f"Transformed data shape: {X_pca.shape}")
4.3 PCA via SVD¶
PCA can also be computed from the SVD of the centered data matrix, without forming the covariance matrix explicitly:
$$X = U\Sigma V^T \quad \Rightarrow \quad X^TX = V\Sigma^2 V^T$$
So the columns of $V$ are the principal components, and $\sigma_i^2/(n-1)$ are the corresponding variances.
# PCA via SVD
X_centered = X - X.mean(axis=0)
# SVD
U, sigma, VT = np.linalg.svd(X_centered, full_matrices=False)
# Principal components = columns of V
principal_components = VT.T
print(f"Principal components (from SVD):\n{principal_components}")
# Explained variance
explained_variance = (sigma**2) / (len(X) - 1)
print(f"Explained variance: {explained_variance}")
# Explained variance ratio
explained_variance_ratio = explained_variance / explained_variance.sum()
print(f"Explained variance ratio: {explained_variance_ratio}")
# Project (agrees with the eigendecomposition result up to sign)
X_pca_svd = X_centered @ principal_components[:, :2]
print(f"Same as eigendecomposition? {np.allclose(np.abs(X_pca), np.abs(X_pca_svd))}")
4.4 PCA Visualization and Interpretation¶
from sklearn.decomposition import PCA as SklearnPCA
# scikit-learn PCA
pca = SklearnPCA(n_components=2)
X_sklearn = pca.fit_transform(X)
# Plot
fig, axes = plt.subplots(1, 2, figsize=(14, 6))
# Transformed data, colored by class
for i, target_name in enumerate(iris.target_names):
    mask = iris.target == i
    axes[0].scatter(X_sklearn[mask, 0], X_sklearn[mask, 1],
                    label=target_name, alpha=0.7)
axes[0].set_xlabel(f'PC1 ({pca.explained_variance_ratio_[0]:.2%})')
axes[0].set_ylabel(f'PC2 ({pca.explained_variance_ratio_[1]:.2%})')
axes[0].set_title('PCA of Iris Dataset')
axes[0].legend()
axes[0].grid(True)
# Explained variance per component
axes[1].bar(range(1, len(pca.explained_variance_ratio_)+1),
pca.explained_variance_ratio_)
axes[1].set_xlabel('Principal Component')
axes[1].set_ylabel('Explained Variance Ratio')
axes[1].set_title('Scree Plot')
axes[1].grid(True)
plt.tight_layout()
plt.show()
# Principal component loadings (correlation with the original features)
loadings = pca.components_.T * np.sqrt(pca.explained_variance_)
print("\nFeature loadings:")
for i, feature in enumerate(iris.feature_names):
    print(f"{feature:20s}: PC1={loadings[i,0]:6.3f}, PC2={loadings[i,1]:6.3f}")
4.5 Dimensionality Reduction and Reconstruction¶
PCA can compress data to a lower dimension and then reconstruct an approximation of the original.
# Dimensionality reduction and reconstruction
pca_full = SklearnPCA(n_components=4)
X_transformed = pca_full.fit_transform(X)
# Reconstruct with different numbers of components
fig, axes = plt.subplots(2, 2, figsize=(12, 10))
axes = axes.ravel()
for idx, n_components in enumerate([1, 2, 3, 4]):
    # Keep only the first n components
    X_reduced = X_transformed[:, :n_components]
    # Reconstruct (zero-pad the dropped components)
    X_reconstructed = pca_full.inverse_transform(
        np.column_stack([X_reduced,
                         np.zeros((len(X), 4-n_components))])
    )
    # Reconstruction error
    reconstruction_error = np.mean((X - X_reconstructed)**2)
    explained_var = pca_full.explained_variance_ratio_[:n_components].sum()
    # Plot only the first two features
    axes[idx].scatter(X[:, 0], X[:, 1], alpha=0.3, label='Original')
    axes[idx].scatter(X_reconstructed[:, 0], X_reconstructed[:, 1],
                      alpha=0.3, label='Reconstructed')
    axes[idx].set_xlabel('Feature 1')
    axes[idx].set_ylabel('Feature 2')
    axes[idx].set_title(f'{n_components} PCs (Var: {explained_var:.2%}, MSE: {reconstruction_error:.3f})')
    axes[idx].legend()
    axes[idx].grid(True)
plt.tight_layout()
plt.show()
5. Other Matrix Decompositions¶
5.1 LU Decomposition¶
Factor a square matrix $A$ into a lower triangular matrix $L$ and an upper triangular matrix $U$ (in practice with a permutation matrix $P$ for pivoting):
$$A = LU$$
Uses: solving linear systems $Ax = b$, computing determinants.
from scipy.linalg import lu
# LU decomposition
A = np.array([[2, 1, 1],
              [4, -6, 0],
              [-2, 7, 2]])
P, L, U = lu(A)  # scipy returns factors with A = P @ L @ U
print(f"P (permutation matrix):\n{P}")
print(f"L (lower triangular):\n{L}")
print(f"U (upper triangular):\n{U}")
# Reconstruct
A_reconstructed = P @ L @ U
print(f"Reconstruction error: {np.linalg.norm(A - A_reconstructed)}")
# Solving a linear system with the LU factors
b = np.array([4, 2, 6])
# Since A = PLU, Ax = b becomes LUx = P^T b:
# 1. Solve Ly = P^T b (forward substitution)
# 2. Solve Ux = y (back substitution)
from scipy.linalg import solve_triangular
y = solve_triangular(L, P.T @ b, lower=True)
x = solve_triangular(U, y, lower=False)
print(f"Solution: {x}")
print(f"Verification: Ax = {A @ x}")
5.2 QR Decomposition¶
Factor a matrix $A$ into an orthogonal matrix $Q$ and an upper triangular matrix $R$:
$$A = QR$$
Uses: least squares problems, Gram–Schmidt orthogonalization.
# QR decomposition
A = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1]])
Q, R = np.linalg.qr(A)
print(f"Q (orthogonal):\n{Q}")
print(f"R (upper triangular):\n{R}")
# Check orthogonality
print(f"Q^T Q:\n{Q.T @ Q}")
# Reconstruct
A_reconstructed = Q @ R
print(f"Reconstruction error: {np.linalg.norm(A - A_reconstructed)}")
# Solving a least squares problem: Ax ≈ b (overdetermined)
A_over = np.array([[1, 1],
                   [1, 2],
                   [1, 3],
                   [1, 4]])
b = np.array([2, 3, 5, 6])
Q, R = np.linalg.qr(A_over)
# np.linalg.qr defaults to the reduced form (Q is 4x2, R is 2x2),
# so the slicing below is a no-op kept for clarity
R_square = R[:2, :]
Q_reduced = Q[:, :2]
# Solve R x = Q^T b
x = np.linalg.solve(R_square, Q_reduced.T @ b)
print(f"Least squares solution: {x}")
5.3 Cholesky Decomposition¶
Factor a symmetric positive definite matrix $A$ as the product of a lower triangular matrix $L$ and its transpose:
$$A = LL^T$$
Uses: Gaussian sampling, numerically stable computations.
# Cholesky decomposition
A_pd = np.array([[4, 2, 2],
                 [2, 5, 1],
                 [2, 1, 6]])
# Check positive definiteness
eigenvalues = np.linalg.eigvalsh(A_pd)
print(f"Eigenvalues: {eigenvalues}")
print(f"Is positive definite? {np.all(eigenvalues > 0)}")
# Cholesky decomposition
L = np.linalg.cholesky(A_pd)
print(f"L:\n{L}")
# Reconstruct
A_reconstructed = L @ L.T
print(f"Reconstruction error: {np.linalg.norm(A_pd - A_reconstructed)}")
# Application: sampling from a multivariate Gaussian
# To sample from N(μ, Σ): μ + L @ z, where z ~ N(0, I)
mu = np.array([1, 2, 3])
Sigma = A_pd
L = np.linalg.cholesky(Sigma)
# Sample from the standard normal
n_samples = 1000
Z = np.random.randn(n_samples, 3)
# Transform
samples = mu + (L @ Z.T).T
# Check sample statistics
print(f"Sample mean: {samples.mean(axis=0)}")
print(f"True mean: {mu}")
print(f"Sample covariance:\n{np.cov(samples.T)}")
print(f"True covariance:\n{Sigma}")
5.4 Comparison of Decompositions¶
| Decomposition | Applies to | Complexity | Main uses |
|---|---|---|---|
| Eigendecomposition | square matrices | $O(n^3)$ | matrix powers, dynamical systems |
| SVD | any matrix | $O(mn^2)$ | low-rank approximation, PCA, recommender systems |
| LU | square matrices | $O(n^3)$ | linear systems, determinants |
| QR | any matrix | $O(mn^2)$ | least squares, orthogonalization |
| Cholesky | positive definite matrices | $O(n^3)$ | sampling, numerical stability |
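To make the "main uses" column concrete, here is a minimal sketch that solves the same symmetric positive definite system $Ax = b$ through three of the factorizations; all routes should agree to numerical precision.
# Solving one SPD system via LU, Cholesky, and QR
from scipy.linalg import lu_factor, lu_solve, cho_factor, cho_solve
A = np.array([[4., 2., 2.],
              [2., 5., 1.],
              [2., 1., 6.]])
b = np.array([1., 2., 3.])
x_lu = lu_solve(lu_factor(A), b)      # LU with partial pivoting
x_chol = cho_solve(cho_factor(A), b)  # Cholesky (requires SPD)
Q, R = np.linalg.qr(A)
x_qr = np.linalg.solve(R, Q.T @ b)    # QR: solve Rx = Q^T b
print(np.allclose(x_lu, x_chol), np.allclose(x_lu, x_qr))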
6. Matrix Decompositions in Machine Learning¶
6.1 Recommender Systems: Matrix Factorization¶
Approximate the user–item rating matrix $R$ with a low-rank factorization:
$$R \approx UV^T$$
# A simple recommender system
from sklearn.decomposition import TruncatedSVD
# User-movie rating matrix (5 users × 6 movies)
R = np.array([
    [5, 3, 0, 1, 0, 0],
    [4, 0, 0, 1, 0, 0],
    [1, 1, 0, 5, 0, 0],
    [0, 0, 0, 4, 4, 5],
    [0, 0, 5, 0, 4, 4]
])
# Zeros denote missing ratings (this simple approach treats them as observed zeros)
mask = R > 0
# SVD (rank-2 approximation)
svd = TruncatedSVD(n_components=2)
U = svd.fit_transform(R)
VT = svd.components_
# Reconstruct
R_approx = U @ VT
print("Original ratings:")
print(R)
print("\nApproximated ratings:")
print(R_approx)
# Predict the missing ratings
print("\nPredicted ratings for missing values:")
for i in range(R.shape[0]):
    for j in range(R.shape[1]):
        if R[i, j] == 0:
            print(f"User {i}, Movie {j}: {R_approx[i, j]:.2f}")
6.2 Image Compression¶
We already saw SVD-based compression of a grayscale image. For color images, SVD can be applied to each color channel independently, as sketched below.
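A minimal sketch, reusing the same sample image as in section 3.4; the rank of 50 per channel is an arbitrary choice.
# Per-channel SVD compression of a color image
from sklearn.datasets import load_sample_image
china = load_sample_image("china.jpg").astype(float) / 255.0
k = 50  # rank per channel (arbitrary)
channels = []
for c in range(3):
    Uc, sc, VTc = np.linalg.svd(china[:, :, c], full_matrices=False)
    channels.append(Uc[:, :k] @ np.diag(sc[:k]) @ VTc[:k, :])
china_k = np.clip(np.stack(channels, axis=2), 0, 1)
plt.imshow(china_k)
plt.title(f'Rank-{k} per-channel approximation')
plt.axis('off')
plt.show()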
6.3 Spectral Clustering¶
Clustering based on the eigenvectors of the graph Laplacian matrix:
from sklearn.cluster import SpectralClustering
from sklearn.datasets import make_moons
# Data that is not linearly separable
X, y = make_moons(n_samples=200, noise=0.05, random_state=42)
# Spectral clustering
spectral = SpectralClustering(n_clusters=2, affinity='nearest_neighbors',
                              random_state=42)
labels = spectral.fit_predict(X)
# Plot
plt.figure(figsize=(10, 5))
plt.subplot(1, 2, 1)
plt.scatter(X[:, 0], X[:, 1], c=y, cmap='viridis')
plt.title('True labels')
plt.subplot(1, 2, 2)
plt.scatter(X[:, 0], X[:, 1], c=labels, cmap='viridis')
plt.title('Spectral clustering')
plt.tight_layout()
plt.show()
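To see the matrix decomposition at work, here is a sketch of roughly what spectral clustering does internally (SpectralClustering normalizes the Laplacian and differs in details): build a k-nearest-neighbor affinity matrix, form the graph Laplacian $L = D - W$, take the eigenvectors of the smallest eigenvalues as an embedding, and run k-means on that embedding.
# Manual spectral embedding via eigenvectors of the graph Laplacian
from sklearn.neighbors import kneighbors_graph
from sklearn.cluster import KMeans
W = kneighbors_graph(X, n_neighbors=10, include_self=False).toarray()
W = 0.5 * (W + W.T)  # symmetrize the k-NN graph
D = np.diag(W.sum(axis=1))  # degree matrix
L_graph = D - W  # unnormalized graph Laplacian
eigvals, eigvecs = np.linalg.eigh(L_graph)  # eigenvalues in ascending order
embedding = eigvecs[:, :2]  # eigenvectors of the two smallest eigenvalues
labels_manual = KMeans(n_clusters=2, n_init=10, random_state=42).fit_predict(embedding)
agreement = max(np.mean(labels_manual == labels), np.mean(labels_manual != labels))
print(f"Agreement with SpectralClustering (up to label swap): {agreement:.2%}")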
6.4 Feature Extraction: Kernel PCA¶
Combine the kernel trick with PCA for nonlinear dimensionality reduction:
from sklearn.decomposition import KernelPCA
# Circular data
theta = np.linspace(0, 2*np.pi, 100)
r = 1 + 0.3 * np.random.randn(100)
X_circle = np.column_stack([r * np.cos(theta), r * np.sin(theta)])
# Linear PCA
pca_linear = SklearnPCA(n_components=1)
X_pca_linear = pca_linear.fit_transform(X_circle)
# Kernel PCA (RBF kernel)
kpca = KernelPCA(n_components=1, kernel='rbf', gamma=10)
X_kpca = kpca.fit_transform(X_circle)
# Plot
fig, axes = plt.subplots(1, 3, figsize=(15, 5))
axes[0].scatter(X_circle[:, 0], X_circle[:, 1], c=theta, cmap='viridis')
axes[0].set_title('Original data')
axes[0].set_aspect('equal')
axes[1].scatter(X_pca_linear, np.zeros_like(X_pca_linear),
c=theta, cmap='viridis')
axes[1].set_title('Linear PCA')
axes[1].set_ylim(-0.5, 0.5)
axes[2].scatter(X_kpca, np.zeros_like(X_kpca), c=theta, cmap='viridis')
axes[2].set_title('Kernel PCA')
axes[2].set_ylim(-0.5, 0.5)
plt.tight_layout()
plt.show()
Exercises¶
Problem 1: Eigendecomposition¶
Compute the eigenvalues and eigenvectors of the following matrix by hand, then verify with NumPy:
$$A = \begin{bmatrix} 5 & 2 \\ 2 & 2 \end{bmatrix}$$
Determine whether this matrix is positive definite, and compute $A^{10}$ using its eigendecomposition.
Problem 2: SVD and Low-Rank Approximation¶
Generate a $4 \times 3$ matrix and compute its SVD. Form the rank-1 and rank-2 approximations and compute the Frobenius norm error of each.
Problem 3: Implementing PCA¶
For the Iris dataset:
1. Perform PCA via eigendecomposition of the covariance matrix.
2. Perform PCA via SVD.
3. Check that the two methods give the same result.
4. What is the minimum number of components whose cumulative explained variance reaches 95%?
Problem 4: Cholesky Decomposition and Sampling¶
Generate 1000 samples from a 2D Gaussian with covariance $\Sigma = \begin{bmatrix} 4 & 2 \\ 2 & 3 \end{bmatrix}$ and mean $\mu = [0, 0]^T$ using the Cholesky decomposition, and check that the empirical covariance of the samples is close to $\Sigma$.
Problem 5: Recommender System¶
Predict the missing values in the following user–movie rating matrix with SVD:
$$R = \begin{bmatrix} 5 & ? & 4 & ? \\ ? & 3 & ? & 2 \\ 4 & ? & 5 & ? \\ ? & 2 & ? & 3 \end{bmatrix}$$
Use a rank-2 approximation and check that the predicted ratings fall in the 1–5 range.
References¶
Textbooks¶
- Strang, G. (2016). Introduction to Linear Algebra. Chapter 6: Eigenvalues and Eigenvectors.
- Trefethen, L. N., & Bau III, D. (1997). Numerical Linear Algebra. SIAM.
  - In-depth treatment of the numerical stability of the SVD.
- Golub, G. H., & Van Loan, C. F. (2013). Matrix Computations (4th ed.). Johns Hopkins.
  - The standard reference for matrix decomposition algorithms.
Papers¶
- Eckart, C., & Young, G. (1936). "The approximation of one matrix by another of lower rank." Psychometrika.
- Jolliffe, I. T. (2002). Principal Component Analysis (2nd ed.). Springer.
Online Resources¶
- 3Blue1Brown - Eigenvectors and eigenvalues: https://www.youtube.com/watch?v=PFDu9oVAE-g
- Stanford CS229 - Linear Algebra Review: http://cs229.stanford.edu/section/cs229-linalg.pdf
- scikit-learn PCA Guide: https://scikit-learn.org/stable/modules/decomposition.html
Tools¶
- NumPy Linear Algebra: https://numpy.org/doc/stable/reference/routines.linalg.html
- SciPy Linear Algebra: https://docs.scipy.org/doc/scipy/reference/linalg.html
Next lesson: 03. Matrix Calculus, covering the mathematical foundations of backpropagation.