03_model_evaluation.ipynb

  1{
  2 "cells": [
  3  {
  4   "cell_type": "markdown",
  5   "metadata": {},
  6   "source": [
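    "# Model Evaluation\n",
    "\n",
    "This notebook covers the main metrics and methods for evaluating the performance of machine learning models.\n",
    "\n",
    "## Contents\n",
    "1. Classification metrics\n",
    "   - Confusion matrix\n",
    "   - Accuracy, precision, recall, F1-score\n",
    "   - Evaluation on a real dataset (Breast Cancer)\n",
    "   - ROC curve and AUC\n",
    "   - Precision-Recall curve\n",
    "2. Multi-class evaluation\n",
    "3. Regression metrics\n",
    "4. Learning curves\n",
    "5. Metric selection guide"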
 20   ]
 21  },
 22  {
 23   "cell_type": "code",
 24   "execution_count": null,
 25   "metadata": {},
 26   "outputs": [],
 27   "source": [
    "# Import the required libraries\n",
 29    "import numpy as np\n",
 30    "import matplotlib.pyplot as plt\n",
 31    "import seaborn as sns\n",
 32    "from sklearn.datasets import load_breast_cancer, load_iris, load_diabetes\n",
 33    "from sklearn.model_selection import train_test_split, learning_curve\n",
 34    "from sklearn.linear_model import LogisticRegression, LinearRegression\n",
 35    "from sklearn.metrics import (\n",
 36    "    confusion_matrix, ConfusionMatrixDisplay,\n",
 37    "    accuracy_score, precision_score, recall_score, f1_score,\n",
 38    "    classification_report,\n",
 39    "    roc_curve, roc_auc_score, auc,\n",
 40    "    precision_recall_curve, average_precision_score,\n",
 41    "    mean_absolute_error, mean_squared_error, r2_score\n",
 42    ")\n",
 43    "from sklearn.preprocessing import label_binarize\n",
 44    "\n",
    "# Plot settings (apply the seaborn style first, since it also sets font.family)\n",
    "sns.set_style('whitegrid')\n",
    "plt.rcParams['figure.figsize'] = (10, 6)\n",
    "plt.rcParams['font.family'] = 'AppleGothic'  # macOS font with Korean glyphs; use an installed equivalent on other systems\n",
    "plt.rcParams['axes.unicode_minus'] = False\n",
    "\n",
    "# Suppress warnings to keep the output readable\n",
    "import warnings\n",
    "warnings.filterwarnings('ignore')"
 54   ]
 55  },
 56  {
 57   "cell_type": "markdown",
 58   "metadata": {},
 59   "source": [
    "## 1. Classification Metrics\n",
    "\n",
    "### 1.1 Confusion Matrix"
 63   ]
 64  },
 65  {
 66   "cell_type": "code",
 67   "execution_count": null,
 68   "metadata": {},
 69   "outputs": [],
 70   "source": [
    "# Small toy example\n",
    "y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 1])\n",
    "y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0, 1, 1])\n",
    "\n",
    "# Compute the confusion matrix (rows: actual class, columns: predicted class)\n",
    "cm = confusion_matrix(y_true, y_pred)\n",
    "print(\"Confusion matrix:\")\n",
    "print(cm)\n",
    "print()\n",
    "\n",
    "# Unpack the four cells; for binary labels {0, 1}, ravel() yields TN, FP, FN, TP\n",
    "tn, fp, fn, tp = cm.ravel()\n",
    "print(f\"TN (True Negative): {tn}\")\n",
    "print(f\"FP (False Positive): {fp} - Type I error\")\n",
    "print(f\"FN (False Negative): {fn} - Type II error\")\n",
    "print(f\"TP (True Positive): {tp}\")"
 87   ]
 88  },
 89  {
 90   "cell_type": "code",
 91   "execution_count": null,
 92   "metadata": {},
 93   "outputs": [],
 94   "source": [
    "# Visualize the confusion matrix\n",
 96    "fig, ax = plt.subplots(figsize=(8, 6))\n",
 97    "disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=['Negative', 'Positive'])\n",
 98    "disp.plot(ax=ax, cmap='Blues', values_format='d')\n",
 99    "plt.title('Confusion Matrix', fontsize=14, pad=20)\n",
100    "plt.show()"
101   ]
102  },
103  {
104   "cell_type": "markdown",
105   "metadata": {},
106   "source": [
    "### 1.2 Accuracy, Precision, Recall, F1-Score"
108   ]
109  },
110  {
111   "cell_type": "code",
112   "execution_count": null,
113   "metadata": {},
114   "outputs": [],
115   "source": [
    "# Compute each metric\n",
    "accuracy = accuracy_score(y_true, y_pred)\n",
    "precision = precision_score(y_true, y_pred)\n",
    "recall = recall_score(y_true, y_pred)\n",
    "f1 = f1_score(y_true, y_pred)\n",
    "\n",
    "print(\"=== Classification Metrics ===\")\n",
    "print(f\"Accuracy: {accuracy:.4f}\")\n",
    "print(\"  - (TP + TN) / (TP + TN + FP + FN)\")\n",
    "print(\"  - Share of all predictions that are correct\\n\")\n",
    "\n",
    "print(f\"Precision: {precision:.4f}\")\n",
    "print(\"  - TP / (TP + FP)\")\n",
    "print(\"  - Share of predicted positives that are actually positive\\n\")\n",
    "\n",
    "print(f\"Recall (Sensitivity): {recall:.4f}\")\n",
    "print(\"  - TP / (TP + FN)\")\n",
    "print(\"  - Share of actual positives that are predicted positive\\n\")\n",
    "\n",
    "print(f\"F1-Score: {f1:.4f}\")\n",
    "print(\"  - 2 * (Precision * Recall) / (Precision + Recall)\")\n",
    "print(\"  - Harmonic mean of precision and recall\")"
138   ]
139  },
140  {
141   "cell_type": "code",
142   "execution_count": null,
143   "metadata": {},
144   "outputs": [],
145   "source": [
    "# Verify the results by computing the metrics by hand\n",
147    "accuracy_manual = (tp + tn) / (tp + tn + fp + fn)\n",
148    "precision_manual = tp / (tp + fp) if (tp + fp) > 0 else 0\n",
149    "recall_manual = tp / (tp + fn) if (tp + fn) > 0 else 0\n",
150    "f1_manual = 2 * precision_manual * recall_manual / (precision_manual + recall_manual) if (precision_manual + recall_manual) > 0 else 0\n",
151    "\n",
    "print(\"=== Manual Verification ===\")\n",
153    "print(f\"Accuracy:  {accuracy_manual:.4f}\")\n",
154    "print(f\"Precision: {precision_manual:.4f}\")\n",
155    "print(f\"Recall:    {recall_manual:.4f}\")\n",
156    "print(f\"F1-Score:  {f1_manual:.4f}\")"
157   ]
158  },
159  {
160   "cell_type": "markdown",
161   "metadata": {},
162   "source": [
    "### 1.3 Classification on a Real Dataset: Breast Cancer"
164   ]
165  },
166  {
167   "cell_type": "code",
168   "execution_count": null,
169   "metadata": {},
170   "outputs": [],
171   "source": [
    "# Load the breast cancer dataset\n",
173    "cancer = load_breast_cancer()\n",
174    "X_train, X_test, y_train, y_test = train_test_split(\n",
175    "    cancer.data, cancer.target, test_size=0.2, random_state=42\n",
176    ")\n",
177    "\n",
    "# Train a logistic regression model\n",
179    "model = LogisticRegression(max_iter=10000, random_state=42)\n",
180    "model.fit(X_train, y_train)\n",
181    "y_pred = model.predict(X_test)\n",
182    "\n",
183    "print(\"Breast Cancer Classification Results\")\n",
184    "print(\"=\"*50)\n",
185    "print(f\"Training samples: {len(X_train)}\")\n",
186    "print(f\"Test samples: {len(X_test)}\")\n",
187    "print(f\"Features: {cancer.feature_names[:5]}... (total {len(cancer.feature_names)})\")"
188   ]
189  },
190  {
191   "cell_type": "code",
192   "execution_count": null,
193   "metadata": {},
194   "outputs": [],
195   "source": [
    "# Visualize the confusion matrix\n",
197    "cm = confusion_matrix(y_test, y_pred)\n",
198    "fig, ax = plt.subplots(figsize=(8, 6))\n",
199    "disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=['Malignant', 'Benign'])\n",
200    "disp.plot(ax=ax, cmap='RdYlGn', values_format='d')\n",
201    "plt.title('Confusion Matrix - Breast Cancer Classification', fontsize=14, pad=20)\n",
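    "\n",
    "# Labels: 0 = malignant, 1 = benign, so the positive class here is benign\n",
    "tn, fp, fn, tp = cm.ravel()\n",
    "print(f\"\\nTrue Negatives (malignant predicted as malignant): {tn}\")\n",
    "print(f\"False Positives (malignant predicted as benign): {fp}\")\n",
    "print(f\"False Negatives (benign predicted as malignant): {fn}\")\n",
    "print(f\"True Positives (benign predicted as benign): {tp}\")"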
209   ]
210  },
211  {
212   "cell_type": "code",
213   "execution_count": null,
214   "metadata": {},
215   "outputs": [],
216   "source": [
    "# Classification report\n",
218    "report = classification_report(y_test, y_pred, target_names=['Malignant', 'Benign'])\n",
219    "print(\"\\n=== Classification Report ===\")\n",
220    "print(report)\n",
221    "\n",
    "# The same report is also available as a dictionary\n",
    "report_dict = classification_report(y_test, y_pred, target_names=['Malignant', 'Benign'], output_dict=True)\n",
    "print(f\"\\nF1-score for the Benign class: {report_dict['Benign']['f1-score']:.4f}\")\n",
    "print(f\"Recall for the Malignant class: {report_dict['Malignant']['recall']:.4f}\")"
226   ]
227  },
228  {
229   "cell_type": "markdown",
230   "metadata": {},
231   "source": [
    "### 1.4 ROC Curve and AUC"
233   ]
234  },
235  {
236   "cell_type": "code",
237   "execution_count": null,
238   "metadata": {},
239   "outputs": [],
240   "source": [
    "# Predicted probability of the positive class (label 1)\n",
    "y_proba = model.predict_proba(X_test)[:, 1]\n",
    "\n",
    "# Compute the ROC curve\n",
    "fpr, tpr, thresholds = roc_curve(y_test, y_proba)\n",
    "roc_auc = auc(fpr, tpr)\n",
    "\n",
    "# Plot the ROC curve\n",
249    "plt.figure(figsize=(10, 6))\n",
250    "plt.plot(fpr, tpr, 'b-', linewidth=2, label=f'ROC Curve (AUC = {roc_auc:.4f})')\n",
251    "plt.plot([0, 1], [0, 1], 'r--', linewidth=2, label='Random Classifier (AUC = 0.5)')\n",
252    "plt.xlabel('False Positive Rate (1 - Specificity)', fontsize=12)\n",
253    "plt.ylabel('True Positive Rate (Sensitivity/Recall)', fontsize=12)\n",
254    "plt.title('ROC Curve - Breast Cancer Classification', fontsize=14, pad=20)\n",
255    "plt.legend(loc='lower right', fontsize=11)\n",
256    "plt.grid(True, alpha=0.3)\n",
257    "plt.show()\n",
258    "\n",
259    "print(f\"AUC Score: {roc_auc:.4f}\")\n",
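    "print(f\"AUC Score (computed directly with roc_auc_score): {roc_auc_score(y_test, y_proba):.4f}\")\n",
    "print(\"\\nInterpreting AUC:\")\n",
    "print(\"  - 1.0: perfect ranking of positives above negatives\")\n",
    "print(\"  - 0.5: no better than random guessing\")\n",
    "print(\"  - 0.0: perfectly inverted ranking (flipping the predictions would give AUC = 1.0)\")"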
265   ]
266  },
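  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The thresholds returned by `roc_curve` can also be used to choose an operating point. The cell below is a minimal sketch, assuming Youden's J statistic (TPR - FPR) is an acceptable selection criterion. The labels, scores, and `_demo` variable names are hypothetical and are kept separate from the variables used elsewhere in this notebook."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from sklearn.metrics import roc_curve\n",
    "\n",
    "# Hypothetical toy labels and scores, used only for illustration\n",
    "y_true_demo = np.array([0, 0, 0, 0, 1, 0, 1, 1, 1, 1])\n",
    "y_score_demo = np.array([0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.70, 0.85, 0.95])\n",
    "\n",
    "fpr_demo, tpr_demo, thr_demo = roc_curve(y_true_demo, y_score_demo)\n",
    "\n",
    "# Youden's J = TPR - FPR; its maximum marks the ROC point farthest above the diagonal\n",
    "j_demo = tpr_demo - fpr_demo\n",
    "best_demo = int(np.argmax(j_demo))\n",
    "\n",
    "print(f\"Threshold with the largest J: {thr_demo[best_demo]:.2f}\")\n",
    "print(f\"TPR = {tpr_demo[best_demo]:.2f}, FPR = {fpr_demo[best_demo]:.2f}, J = {j_demo[best_demo]:.2f}\")"
   ]
  },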
267  {
268   "cell_type": "markdown",
269   "metadata": {},
270   "source": [
    "### 1.5 Precision-Recall Curve"
272   ]
273  },
274  {
275   "cell_type": "code",
276   "execution_count": null,
277   "metadata": {},
278   "outputs": [],
279   "source": [
    "# Compute the PR curve\n",
    "# Note: this rebinds `precision` and `recall` (scalars in section 1.2) to arrays\n",
    "precision, recall, pr_thresholds = precision_recall_curve(y_test, y_proba)\n",
    "ap = average_precision_score(y_test, y_proba)\n",
    "\n",
    "# Plot the PR curve\n",
285    "plt.figure(figsize=(10, 6))\n",
286    "plt.plot(recall, precision, 'b-', linewidth=2, label=f'PR Curve (AP = {ap:.4f})')\n",
287    "plt.xlabel('Recall', fontsize=12)\n",
288    "plt.ylabel('Precision', fontsize=12)\n",
289    "plt.title('Precision-Recall Curve', fontsize=14, pad=20)\n",
290    "plt.legend(loc='best', fontsize=11)\n",
291    "plt.grid(True, alpha=0.3)\n",
292    "plt.xlim([0.0, 1.0])\n",
293    "plt.ylim([0.0, 1.05])\n",
294    "plt.show()\n",
295    "\n",
296    "print(f\"Average Precision (AP): {ap:.4f}\")\n",
    "print(\"\\nROC vs. PR curves:\")\n",
    "print(\"  - ROC: summarizes ranking quality over both classes; can look optimistic under heavy class imbalance\")\n",
    "print(\"  - PR: focuses on the positive class; more informative when positives are rare\")"
300   ]
301  },
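  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The difference between the two curves tends to show up when positives are rare. The cell below is an illustrative sketch on hypothetical synthetic data (`make_classification` with roughly 5% positives), not a result from the datasets used in this notebook; the `_imb` names are made up for the example. ROC-AUC is read against a chance level of 0.5, whereas average precision is read against a chance level of roughly the positive rate."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.datasets import make_classification\n",
    "from sklearn.linear_model import LogisticRegression\n",
    "from sklearn.metrics import average_precision_score, roc_auc_score\n",
    "from sklearn.model_selection import train_test_split\n",
    "\n",
    "# Hypothetical synthetic data with roughly 5% positives\n",
    "X_imb, y_imb = make_classification(\n",
    "    n_samples=5000, n_features=20, weights=[0.95, 0.05], random_state=42\n",
    ")\n",
    "X_tr_imb, X_te_imb, y_tr_imb, y_te_imb = train_test_split(\n",
    "    X_imb, y_imb, test_size=0.3, stratify=y_imb, random_state=42\n",
    ")\n",
    "\n",
    "clf_imb = LogisticRegression(max_iter=1000).fit(X_tr_imb, y_tr_imb)\n",
    "proba_imb = clf_imb.predict_proba(X_te_imb)[:, 1]\n",
    "\n",
    "# Compare each score with its own chance level\n",
    "print(f\"Positive rate (approximate chance level for AP): {y_te_imb.mean():.4f}\")\n",
    "print(f\"ROC-AUC (chance level 0.5): {roc_auc_score(y_te_imb, proba_imb):.4f}\")\n",
    "print(f\"Average Precision: {average_precision_score(y_te_imb, proba_imb):.4f}\")"
   ]
  },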
302  {
303   "cell_type": "code",
304   "execution_count": null,
305   "metadata": {},
306   "outputs": [],
307   "source": [
    "# Compare the ROC and PR curves side by side\n",
309    "fig, axes = plt.subplots(1, 2, figsize=(16, 6))\n",
310    "\n",
311    "# ROC Curve\n",
312    "axes[0].plot(fpr, tpr, 'b-', linewidth=2, label=f'ROC (AUC = {roc_auc:.4f})')\n",
313    "axes[0].plot([0, 1], [0, 1], 'r--', linewidth=2)\n",
314    "axes[0].set_xlabel('False Positive Rate', fontsize=12)\n",
315    "axes[0].set_ylabel('True Positive Rate', fontsize=12)\n",
316    "axes[0].set_title('ROC Curve', fontsize=14)\n",
317    "axes[0].legend(loc='lower right', fontsize=11)\n",
318    "axes[0].grid(True, alpha=0.3)\n",
319    "\n",
320    "# PR Curve\n",
321    "axes[1].plot(recall, precision, 'g-', linewidth=2, label=f'PR (AP = {ap:.4f})')\n",
322    "axes[1].set_xlabel('Recall', fontsize=12)\n",
323    "axes[1].set_ylabel('Precision', fontsize=12)\n",
324    "axes[1].set_title('Precision-Recall Curve', fontsize=14)\n",
325    "axes[1].legend(loc='best', fontsize=11)\n",
326    "axes[1].grid(True, alpha=0.3)\n",
327    "\n",
328    "plt.tight_layout()\n",
329    "plt.show()"
330   ]
331  },
332  {
333   "cell_type": "markdown",
334   "metadata": {},
335   "source": [
    "## 2. Multi-class Evaluation"
337   ]
338  },
339  {
340   "cell_type": "code",
341   "execution_count": null,
342   "metadata": {},
343   "outputs": [],
344   "source": [
    "# Load the Iris dataset (3 classes)\n",
346    "iris = load_iris()\n",
347    "X_train_iris, X_test_iris, y_train_iris, y_test_iris = train_test_split(\n",
348    "    iris.data, iris.target, test_size=0.2, random_state=42\n",
349    ")\n",
350    "\n",
    "# Train the model\n",
352    "model_iris = LogisticRegression(max_iter=1000, random_state=42)\n",
353    "model_iris.fit(X_train_iris, y_train_iris)\n",
354    "y_pred_iris = model_iris.predict(X_test_iris)\n",
355    "\n",
356    "print(\"Iris Multi-class Classification\")\n",
357    "print(\"=\"*50)\n",
358    "print(f\"Classes: {iris.target_names}\")\n",
359    "print(f\"Features: {iris.feature_names}\")"
360   ]
361  },
362  {
363   "cell_type": "code",
364   "execution_count": null,
365   "metadata": {},
366   "outputs": [],
367   "source": [
    "# Multi-class confusion matrix\n",
369    "cm_iris = confusion_matrix(y_test_iris, y_pred_iris)\n",
370    "fig, ax = plt.subplots(figsize=(10, 8))\n",
371    "disp = ConfusionMatrixDisplay(confusion_matrix=cm_iris, display_labels=iris.target_names)\n",
372    "disp.plot(ax=ax, cmap='Blues', values_format='d')\n",
373    "plt.title('Multi-class Confusion Matrix - Iris Dataset', fontsize=14, pad=20)\n",
374    "plt.show()"
375   ]
376  },
377  {
378   "cell_type": "code",
379   "execution_count": null,
380   "metadata": {},
381   "outputs": [],
382   "source": [
    "# Multi-class metrics\n",
    "print(\"=== Multi-class Classification Metrics ===\")\n",
    "print(f\"Accuracy: {accuracy_score(y_test_iris, y_pred_iris):.4f}\\n\")\n",
    "\n",
    "# F1-score under different averaging schemes\n",
    "f1_macro = f1_score(y_test_iris, y_pred_iris, average='macro')\n",
    "f1_weighted = f1_score(y_test_iris, y_pred_iris, average='weighted')\n",
    "f1_micro = f1_score(y_test_iris, y_pred_iris, average='micro')\n",
    "\n",
    "print(f\"F1-Score (macro):    {f1_macro:.4f}  - unweighted mean of the per-class F1 scores\")\n",
    "print(f\"F1-Score (weighted): {f1_weighted:.4f}  - mean of the per-class F1 scores weighted by class support\")\n",
    "print(f\"F1-Score (micro):    {f1_micro:.4f}  - computed from TP, FP, and FN summed over all classes\")"
395   ]
396  },
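  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the averaging schemes concrete, the cell below recomputes them by hand from the per-class F1 scores and compares the result with `f1_score`. This is a minimal sketch: the labels are hypothetical toy values, and the `_mc` and `_manual` names are made up for the example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from sklearn.metrics import f1_score\n",
    "\n",
    "# Hypothetical 3-class labels with unequal class sizes (6, 3, and 1 samples)\n",
    "y_true_mc = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 2])\n",
    "y_pred_mc = np.array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2])\n",
    "\n",
    "per_class_f1 = f1_score(y_true_mc, y_pred_mc, average=None)  # one F1 value per class\n",
    "support_mc = np.bincount(y_true_mc)  # number of true samples per class\n",
    "\n",
    "macro_manual = per_class_f1.mean()\n",
    "weighted_manual = np.average(per_class_f1, weights=support_mc)\n",
    "# For single-label problems, micro-averaged F1 equals accuracy\n",
    "micro_manual = np.mean(y_true_mc == y_pred_mc)\n",
    "\n",
    "for name, manual in [(\"macro\", macro_manual), (\"weighted\", weighted_manual), (\"micro\", micro_manual)]:\n",
    "    library = f1_score(y_true_mc, y_pred_mc, average=name)\n",
    "    print(f\"{name:<9} manual = {manual:.4f}, f1_score = {library:.4f}\")"
   ]
  },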
397  {
398   "cell_type": "code",
399   "execution_count": null,
400   "metadata": {},
401   "outputs": [],
402   "source": [
    "# Classification report\n",
404    "report_iris = classification_report(y_test_iris, y_pred_iris, target_names=iris.target_names)\n",
405    "print(\"\\n=== Classification Report - Iris ===\")\n",
406    "print(report_iris)"
407   ]
408  },
409  {
410   "cell_type": "code",
411   "execution_count": null,
412   "metadata": {},
413   "outputs": [],
414   "source": [
    "# Multi-class ROC curves (one-vs-rest, one curve per class)\n",
416    "y_test_iris_bin = label_binarize(y_test_iris, classes=[0, 1, 2])\n",
417    "y_proba_iris = model_iris.predict_proba(X_test_iris)\n",
418    "\n",
419    "plt.figure(figsize=(10, 6))\n",
420    "colors = ['blue', 'red', 'green']\n",
421    "\n",
422    "for i, (color, name) in enumerate(zip(colors, iris.target_names)):\n",
423    "    fpr_i, tpr_i, _ = roc_curve(y_test_iris_bin[:, i], y_proba_iris[:, i])\n",
424    "    roc_auc_i = auc(fpr_i, tpr_i)\n",
425    "    plt.plot(fpr_i, tpr_i, color=color, linewidth=2,\n",
426    "             label=f'{name} (AUC = {roc_auc_i:.4f})')\n",
427    "\n",
428    "plt.plot([0, 1], [0, 1], 'k--', linewidth=2)\n",
429    "plt.xlabel('False Positive Rate', fontsize=12)\n",
430    "plt.ylabel('True Positive Rate', fontsize=12)\n",
431    "plt.title('Multi-class ROC Curves - Iris Dataset', fontsize=14, pad=20)\n",
432    "plt.legend(loc='lower right', fontsize=11)\n",
433    "plt.grid(True, alpha=0.3)\n",
434    "plt.show()"
435   ]
436  },
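  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The per-class curves can be condensed into a single number. The cell below is a minimal sketch, assuming a one-vs-rest macro or weighted average is the summary of interest. It refits the same kind of model under separate, hypothetical `_ovr` names so that the cell runs on its own."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.datasets import load_iris\n",
    "from sklearn.linear_model import LogisticRegression\n",
    "from sklearn.metrics import roc_auc_score\n",
    "from sklearn.model_selection import train_test_split\n",
    "\n",
    "X_ovr, y_ovr = load_iris(return_X_y=True)\n",
    "X_tr_ovr, X_te_ovr, y_tr_ovr, y_te_ovr = train_test_split(\n",
    "    X_ovr, y_ovr, test_size=0.2, random_state=42\n",
    ")\n",
    "\n",
    "clf_ovr = LogisticRegression(max_iter=1000).fit(X_tr_ovr, y_tr_ovr)\n",
    "proba_ovr = clf_ovr.predict_proba(X_te_ovr)  # shape (n_samples, n_classes)\n",
    "\n",
    "# One-vs-rest AUC, averaged over classes with equal or support-based weights\n",
    "auc_macro_ovr = roc_auc_score(y_te_ovr, proba_ovr, multi_class=\"ovr\", average=\"macro\")\n",
    "auc_weighted_ovr = roc_auc_score(y_te_ovr, proba_ovr, multi_class=\"ovr\", average=\"weighted\")\n",
    "\n",
    "print(f\"OvR ROC-AUC (macro):    {auc_macro_ovr:.4f}\")\n",
    "print(f\"OvR ROC-AUC (weighted): {auc_weighted_ovr:.4f}\")"
   ]
  },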
437  {
438   "cell_type": "markdown",
439   "metadata": {},
440   "source": [
    "## 3. Regression Metrics"
442   ]
443  },
444  {
445   "cell_type": "code",
446   "execution_count": null,
447   "metadata": {},
448   "outputs": [],
449   "source": [
    "# Small toy example\n",
    "y_true_reg = np.array([3.0, -0.5, 2.0, 7.0, 4.5])\n",
    "y_pred_reg = np.array([2.5, 0.0, 2.0, 8.0, 4.0])\n",
    "\n",
    "# Compute the regression metrics\n",
    "mae = mean_absolute_error(y_true_reg, y_pred_reg)\n",
    "mse = mean_squared_error(y_true_reg, y_pred_reg)\n",
    "rmse = np.sqrt(mse)\n",
    "r2 = r2_score(y_true_reg, y_pred_reg)\n",
    "\n",
    "print(\"=== Regression Metrics ===\")\n",
    "print(f\"MAE (Mean Absolute Error): {mae:.4f}\")\n",
    "print(f\"  - On average, predictions are off by {mae:.4f} from the actual values\\n\")\n",
    "\n",
    "print(f\"MSE (Mean Squared Error): {mse:.4f}\")\n",
    "print(\"  - Penalizes large errors more heavily\\n\")\n",
    "\n",
    "print(f\"RMSE (Root Mean Squared Error): {rmse:.4f}\")\n",
    "print(\"  - Expressed in the same units as the target\\n\")\n",
    "\n",
    "print(f\"R² (Coefficient of Determination): {r2:.4f}\")\n",
    "print(\"  - At most 1 (higher is better); negative when the model does worse than always predicting the mean\")\n",
    "print(f\"  - The model explains {r2*100:.1f}% of the variance\")"
473   ]
474  },
475  {
476   "cell_type": "code",
477   "execution_count": null,
478   "metadata": {},
479   "outputs": [],
480   "source": [
    "# Verify the results by computing the metrics by hand\n",
    "print(\"\\n=== Manual Verification ===\")\n",
483    "mae_manual = np.mean(np.abs(y_true_reg - y_pred_reg))\n",
484    "mse_manual = np.mean((y_true_reg - y_pred_reg)**2)\n",
485    "rmse_manual = np.sqrt(mse_manual)\n",
486    "r2_manual = 1 - np.sum((y_true_reg - y_pred_reg)**2) / np.sum((y_true_reg - np.mean(y_true_reg))**2)\n",
487    "\n",
488    "print(f\"MAE:  {mae_manual:.4f}\")\n",
489    "print(f\"MSE:  {mse_manual:.4f}\")\n",
490    "print(f\"RMSE: {rmse_manual:.4f}\")\n",
491    "print(f\"R²:   {r2_manual:.4f}\")"
492   ]
493  },
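  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Following the formula R² = 1 - SS_res / SS_tot, the score is not bounded below by 0. The cell below is a small sketch with hypothetical toy values (the `_r2` names are made up for the example): always predicting the mean gives R² = 0, and predictions that are worse than the mean give a negative R²."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from sklearn.metrics import r2_score\n",
    "\n",
    "# Hypothetical toy targets\n",
    "y_true_r2 = np.array([1.0, 2.0, 3.0])\n",
    "\n",
    "mean_pred_r2 = np.full_like(y_true_r2, y_true_r2.mean())  # always predict the mean\n",
    "reversed_pred_r2 = y_true_r2[::-1]  # predictions that run opposite to the targets\n",
    "\n",
    "# Mean predictor: SS_res equals SS_tot, so R² = 0\n",
    "print(f\"R² of the mean predictor:   {r2_score(y_true_r2, mean_pred_r2):.4f}\")\n",
    "# Reversed predictions: SS_res = 8 and SS_tot = 2, so R² = 1 - 8/2 = -3\n",
    "print(f\"R² of reversed predictions: {r2_score(y_true_r2, reversed_pred_r2):.4f}\")"
   ]
  },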
494  {
495   "cell_type": "code",
496   "execution_count": null,
497   "metadata": {},
498   "outputs": [],
499   "source": [
    "# Regression on a real dataset: Diabetes\n",
501    "diabetes = load_diabetes()\n",
502    "X_train_diab, X_test_diab, y_train_diab, y_test_diab = train_test_split(\n",
503    "    diabetes.data, diabetes.target, test_size=0.2, random_state=42\n",
504    ")\n",
505    "\n",
    "# Train a linear regression model\n",
507    "model_reg = LinearRegression()\n",
508    "model_reg.fit(X_train_diab, y_train_diab)\n",
509    "y_pred_diab = model_reg.predict(X_test_diab)\n",
510    "\n",
    "# Evaluate\n",
512    "mae_diab = mean_absolute_error(y_test_diab, y_pred_diab)\n",
513    "mse_diab = mean_squared_error(y_test_diab, y_pred_diab)\n",
514    "rmse_diab = np.sqrt(mse_diab)\n",
515    "r2_diab = r2_score(y_test_diab, y_pred_diab)\n",
516    "\n",
517    "print(\"Diabetes Regression Results\")\n",
518    "print(\"=\"*50)\n",
519    "print(f\"MAE:  {mae_diab:.4f}\")\n",
520    "print(f\"MSE:  {mse_diab:.4f}\")\n",
521    "print(f\"RMSE: {rmse_diab:.4f}\")\n",
522    "print(f\"R²:   {r2_diab:.4f}\")\n",
    "print(f\"\\nInterpretation: the model explains {r2_diab*100:.1f}% of the target variance.\")"
524   ]
525  },
526  {
527   "cell_type": "code",
528   "execution_count": null,
529   "metadata": {},
530   "outputs": [],
531   "source": [
    "# Plot actual vs. predicted values\n",
533    "plt.figure(figsize=(10, 6))\n",
534    "plt.scatter(y_test_diab, y_pred_diab, alpha=0.6, edgecolors='k', s=80)\n",
535    "plt.plot([y_test_diab.min(), y_test_diab.max()], \n",
536    "         [y_test_diab.min(), y_test_diab.max()], \n",
537    "         'r--', linewidth=2, label='Perfect Prediction')\n",
    "plt.xlabel('Actual', fontsize=12)\n",
    "plt.ylabel('Predicted', fontsize=12)\n",
    "plt.title(f'Actual vs. Predicted (R² = {r2_diab:.4f})', fontsize=14, pad=20)\n",
541    "plt.legend(fontsize=11)\n",
542    "plt.grid(True, alpha=0.3)\n",
543    "plt.show()"
544   ]
545  },
546  {
547   "cell_type": "code",
548   "execution_count": null,
549   "metadata": {},
550   "outputs": [],
551   "source": [
    "# Residual analysis\n",
553    "residuals = y_test_diab - y_pred_diab\n",
554    "\n",
555    "fig, axes = plt.subplots(1, 2, figsize=(16, 6))\n",
556    "\n",
    "# Residual plot\n",
    "axes[0].scatter(y_pred_diab, residuals, alpha=0.6, edgecolors='k', s=80)\n",
    "axes[0].axhline(y=0, color='r', linestyle='--', linewidth=2)\n",
    "axes[0].set_xlabel('Predicted', fontsize=12)\n",
    "axes[0].set_ylabel('Residuals', fontsize=12)\n",
    "axes[0].set_title('Residual Plot', fontsize=14)\n",
    "axes[0].grid(True, alpha=0.3)\n",
    "\n",
    "# Residual distribution\n",
    "axes[1].hist(residuals, bins=20, edgecolor='black', alpha=0.7)\n",
    "axes[1].set_xlabel('Residuals', fontsize=12)\n",
    "axes[1].set_ylabel('Frequency', fontsize=12)\n",
569    "axes[1].set_title('Residuals Distribution', fontsize=14)\n",
570    "axes[1].grid(True, alpha=0.3, axis='y')\n",
571    "\n",
572    "plt.tight_layout()\n",
573    "plt.show()\n",
574    "\n",
    "print(f\"Residual mean: {residuals.mean():.4f} (a value near 0 suggests little systematic bias)\")\n",
    "print(f\"Residual standard deviation: {residuals.std():.4f}\")"
577   ]
578  },
579  {
580   "cell_type": "markdown",
581   "metadata": {},
582   "source": [
    "## 4. Learning Curve"
584   ]
585  },
586  {
587   "cell_type": "code",
588   "execution_count": null,
589   "metadata": {},
590   "outputs": [],
591   "source": [
    "# Compute the learning curve\n",
593    "train_sizes, train_scores, val_scores = learning_curve(\n",
594    "    LogisticRegression(max_iter=10000, random_state=42),\n",
595    "    cancer.data, cancer.target,\n",
596    "    train_sizes=np.linspace(0.1, 1.0, 10),\n",
597    "    cv=5,\n",
598    "    scoring='accuracy',\n",
599    "    n_jobs=-1\n",
600    ")\n",
601    "\n",
    "# Mean and standard deviation across the CV folds\n",
603    "train_mean = train_scores.mean(axis=1)\n",
604    "train_std = train_scores.std(axis=1)\n",
605    "val_mean = val_scores.mean(axis=1)\n",
606    "val_std = val_scores.std(axis=1)\n",
607    "\n",
    "# Plot the learning curve\n",
609    "plt.figure(figsize=(10, 6))\n",
610    "plt.fill_between(train_sizes, train_mean - train_std, train_mean + train_std, \n",
611    "                 alpha=0.2, color='blue')\n",
612    "plt.fill_between(train_sizes, val_mean - val_std, val_mean + val_std, \n",
613    "                 alpha=0.2, color='orange')\n",
614    "plt.plot(train_sizes, train_mean, 'o-', color='blue', linewidth=2, \n",
615    "         label='Training Score')\n",
616    "plt.plot(train_sizes, val_mean, 'o-', color='orange', linewidth=2, \n",
617    "         label='Validation Score')\n",
618    "plt.xlabel('Training Set Size', fontsize=12)\n",
619    "plt.ylabel('Accuracy', fontsize=12)\n",
620    "plt.title('Learning Curve - Breast Cancer Classification', fontsize=14, pad=20)\n",
621    "plt.legend(loc='best', fontsize=11)\n",
622    "plt.grid(True, alpha=0.3)\n",
623    "plt.show()\n",
624    "\n",
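    "print(\"Interpreting the learning curve:\")\n",
    "print(\"  - Both curves are low → underfitting (try a more flexible model or better features)\")\n",
    "print(\"  - Training curve is high, validation curve is low → overfitting (add regularization or more data)\")\n",
    "print(\"  - Both curves converge at a high score → good fit\")"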
629   ]
630  },
631  {
632   "cell_type": "markdown",
633   "metadata": {},
634   "source": [
    "## 5. Metric Selection Guide"
636   ]
637  },
638  {
639   "cell_type": "code",
640   "execution_count": null,
641   "metadata": {},
642   "outputs": [],
643   "source": [
    "# Summary evaluation helpers\n",
    "def evaluate_classification(y_true, y_pred, y_proba=None):\n",
    "    \"\"\"Print a summary of classification metrics (weighted averages).\"\"\"\n",
    "    print(\"=== Classification Results ===\")\n",
    "    print(f\"Accuracy:  {accuracy_score(y_true, y_pred):.4f}\")\n",
    "    print(f\"Precision: {precision_score(y_true, y_pred, average='weighted'):.4f}\")\n",
    "    print(f\"Recall:    {recall_score(y_true, y_pred, average='weighted'):.4f}\")\n",
    "    print(f\"F1-Score:  {f1_score(y_true, y_pred, average='weighted'):.4f}\")\n",
    "    # ROC-AUC is reported only for binary problems with positive-class scores\n",
    "    if y_proba is not None and len(np.unique(y_true)) == 2:\n",
    "        print(f\"ROC-AUC:   {roc_auc_score(y_true, y_proba):.4f}\")\n",
    "\n",
    "def evaluate_regression(y_true, y_pred):\n",
    "    \"\"\"Print a summary of regression metrics.\"\"\"\n",
    "    print(\"=== Regression Results ===\")\n",
    "    print(f\"MAE:  {mean_absolute_error(y_true, y_pred):.4f}\")\n",
    "    print(f\"MSE:  {mean_squared_error(y_true, y_pred):.4f}\")\n",
    "    print(f\"RMSE: {np.sqrt(mean_squared_error(y_true, y_pred)):.4f}\")\n",
    "    print(f\"R²:   {r2_score(y_true, y_pred):.4f}\")\n",
    "\n",
    "# Try the helpers on the models trained above\n",
    "print(\"Breast Cancer model evaluation:\")\n",
    "evaluate_classification(y_test, y_pred, y_proba)\n",
    "\n",
    "print(\"\\nDiabetes regression model evaluation:\")\n",
    "evaluate_regression(y_test_diab, y_pred_diab)"
669   ]
670  },
671  {
672   "cell_type": "code",
673   "execution_count": null,
674   "metadata": {},
675   "outputs": [],
676   "source": [
    "# Summary table of evaluation metrics\n",
678    "import pandas as pd\n",
679    "\n",
680    "metrics_summary = pd.DataFrame({\n",
    "    'Metric': ['Accuracy', 'Precision', 'Recall', 'F1-Score', 'ROC-AUC', 'MAE', 'MSE', 'R²'],\n",
    "    'Task': ['Classification', 'Classification', 'Classification', 'Classification', 'Classification', 'Regression', 'Regression', 'Regression'],\n",
    "    'Range': ['0-1', '0-1', '0-1', '0-1', '0-1', '0-∞', '0-∞', '-∞-1'],\n",
    "    'Description': [\n",
    "        'Share of correct predictions',\n",
    "        'Actual positives among predicted positives',\n",
    "        'Predicted positives among actual positives',\n",
    "        'Harmonic mean of precision and recall',\n",
    "        'Ranking quality across all thresholds',\n",
    "        'Mean absolute error',\n",
    "        'Mean squared error',\n",
    "        'Proportion of variance explained'\n",
693    "    ]\n",
694    "})\n",
695    "\n",
    "print(\"\\n=== Evaluation Metrics Summary ===\")\n",
697    "print(metrics_summary.to_string(index=False))"
698   ]
699  },
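  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Accuracy alone can mislead when the classes are imbalanced. The cell below is a minimal sketch with hypothetical labels (95 negatives, 5 positives) and a made-up \"model\" that always predicts the negative class: accuracy looks high, while recall and F1 for the positive class are zero. The `_skew` names are chosen only for this example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from sklearn.metrics import accuracy_score, f1_score, recall_score\n",
    "\n",
    "# Hypothetical imbalanced labels: 95 negatives and 5 positives\n",
    "y_true_skew = np.array([0] * 95 + [1] * 5)\n",
    "y_pred_skew = np.zeros_like(y_true_skew)  # always predict the negative class\n",
    "\n",
    "# 95 of 100 predictions are correct, yet no positive is ever found\n",
    "print(f\"Accuracy: {accuracy_score(y_true_skew, y_pred_skew):.4f}\")\n",
    "print(f\"Recall:   {recall_score(y_true_skew, y_pred_skew):.4f}\")\n",
    "# zero_division=0 avoids a warning, since no positives are predicted\n",
    "print(f\"F1-Score: {f1_score(y_true_skew, y_pred_skew, zero_division=0):.4f}\")"
   ]
  },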
700  {
701   "cell_type": "markdown",
702   "metadata": {},
703   "source": [
    "## Summary\n",
    "\n",
    "### Choosing Metrics for Classification\n",
    "\n",
    "1. **Balanced data**: Accuracy, F1-score\n",
    "2. **Imbalanced data**: Precision, Recall, F1-score, PR-AUC\n",
    "   - Missed positives are costly: prioritize Recall (cancer diagnosis, fraud detection)\n",
    "   - False alarms are costly: prioritize Precision (spam filtering)\n",
    "3. **Ranking quality across thresholds**: ROC-AUC, PR-AUC\n",
    "4. **Multi-class**: Macro/Weighted/Micro F1\n",
    "\n",
    "### Choosing Metrics for Regression\n",
    "\n",
    "1. **Baseline**: MSE, RMSE, MAE\n",
    "2. **Outlier sensitivity**: MAE (robust), MSE (sensitive)\n",
    "3. **Proportion of variance explained**: R²\n",
    "4. **Model comparison**: R² (unitless; at most 1, and negative when the model does worse than predicting the mean)"
721   ]
722  }
723 ],
724 "metadata": {
725  "kernelspec": {
726   "display_name": "Python 3",
727   "language": "python",
728   "name": "python3"
729  },
730  "language_info": {
731   "codemirror_mode": {
732    "name": "ipython",
733    "version": 3
734   },
735   "file_extension": ".py",
736   "mimetype": "text/x-python",
737   "name": "python",
738   "nbconvert_exporter": "python",
739   "pygments_lexer": "ipython3",
740   "version": "3.8.0"
741  }
742 },
743 "nbformat": 4,
744 "nbformat_minor": 4
745}