# Machine Learning Examples

Runnable Jupyter Notebook examples, one for each of the 14 lessons in the Machine_Learning folder.

## Folder Structure

```
examples/
├── 01_linear_regression.ipynb      # Linear regression
├── 02_logistic_regression.ipynb    # Logistic regression
├── 03_model_evaluation.ipynb       # Model evaluation metrics
├── 04_cross_validation.ipynb       # Cross-validation
├── 05_preprocessing.ipynb          # Data preprocessing
├── 06_decision_tree.ipynb          # Decision trees
├── 07_random_forest.ipynb          # Random forests
├── 08_xgboost_lightgbm.ipynb       # XGBoost, LightGBM
├── 09_svm.ipynb                    # SVM (Support Vector Machine)
├── 10_knn_naive_bayes.ipynb        # k-NN, Naive Bayes
├── 11_clustering.ipynb             # K-Means, DBSCAN
├── 12_pca.ipynb                    # PCA, t-SNE dimensionality reduction
├── 13_pipeline.ipynb               # sklearn pipelines
├── 14_kaggle_project.ipynb         # Hands-on Kaggle project
├── datasets/                       # Example datasets
└── README.md
```

## How to Run

### Environment Setup

```bash
# Create a virtual environment (recommended)
python -m venv ml-env
source ml-env/bin/activate  # Windows: ml-env\Scripts\activate

# Install the required packages
pip install numpy pandas matplotlib seaborn scikit-learn jupyter

# XGBoost, LightGBM (for lesson 08)
pip install xgboost lightgbm
```

### Running Jupyter Notebook

```bash
cd Machine_Learning/examples
jupyter notebook

# Or use JupyterLab
jupyter lab
```

## Examples by Lesson

| Lesson | Topic | Key content |
|--------|-------|-------------|
| 01 | Linear regression | Simple/multiple regression, MSE, R² |
| 02 | Logistic regression | Binary/multiclass classification, ROC-AUC |
| 03 | Model evaluation | Accuracy, precision, recall, F1 |
| 04 | Cross-validation | K-Fold, Stratified, GridSearchCV |
| 05 | Preprocessing | Scaling, encoding, missing values |
| 06 | Decision trees | Tree visualization, preventing overfitting |
| 07 | Random forests | Bagging, OOB, feature importance |
| 08 | XGBoost/LightGBM | Gradient boosting, early stopping |
| 09 | SVM | Kernel trick, hyperplanes |
| 10 | k-NN/Naive Bayes | Distance-based and probability-based classification |
| 11 | Clustering | K-Means, DBSCAN, silhouette |
| 12 | Dimensionality reduction | PCA, t-SNE, explained variance |
| 13 | Pipelines | Pipeline, ColumnTransformer |
| 14 | Kaggle project | Titanic, feature engineering |

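To give a feel for how several of these topics fit together, here is a minimal sketch that combines scaling (lesson 05), a pipeline (lesson 13), logistic regression (lesson 02), and stratified K-Fold cross-validation (lesson 04) on the Iris data. It is an illustration written for this README, not code taken from the notebooks; the step names `scale` and `clf` and the chosen hyperparameters are assumptions.

```python
# Illustrative sketch only; not copied from the notebooks in this folder.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Scaling lives inside the pipeline, so it is re-fit on the training part
# of each fold and never sees the held-out part.
model = Pipeline([
    ("scale", StandardScaler()),                  # step name is arbitrary
    ("clf", LogisticRegression(max_iter=1000)),   # step name is arbitrary
])

# Stratified 5-fold cross-validation keeps class proportions in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")

print("fold accuracies:", scores.round(3))
print("mean accuracy:  ", round(scores.mean(), 3))
```

Putting the scaler inside the pipeline, rather than scaling the full dataset up front, is what keeps information from the validation folds out of training.
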
## Learning Order

1. **Basics**: 01 → 02 → 03 → 04 → 05
2. **Tree models**: 06 → 07 → 08
3. **Other algorithms**: 09 → 10
4. **Unsupervised learning**: 11 → 12
5. **Hands-on practice**: 13 → 14

## Datasets

Datasets used in the examples:

| Dataset | Source | Use |
|---------|--------|-----|
| Iris | sklearn | Classification (multiclass) |
| Wine | sklearn | Classification (multiclass) |
| California Housing | sklearn | Regression |
| Digits | sklearn | Classification (images) |
| Titanic | Kaggle | Classification (hands-on) |

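As a quick check that the scikit-learn datasets are available, the sketch below loads the three that ship inside the library and prints their shapes. California Housing is left as a comment because `fetch_california_housing` downloads the data on first use, and Titanic is omitted because it comes from Kaggle and its location is not stated here. The `loaders` and `shapes` names are made up for this example.

```python
# Illustrative sketch only; not copied from the notebooks in this folder.
from sklearn.datasets import load_digits, load_iris, load_wine

# These three datasets are bundled with scikit-learn (no download needed).
loaders = {"Iris": load_iris, "Wine": load_wine, "Digits": load_digits}

shapes = {}
for name, loader in loaders.items():
    X, y = loader(return_X_y=True)
    shapes[name] = X.shape
    print(f"{name:<8} samples={X.shape[0]:<5} features={X.shape[1]}")

# California Housing needs network access the first time it is called:
# from sklearn.datasets import fetch_california_housing
# X, y = fetch_california_housing(return_X_y=True)
```
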
## Required Packages

```
numpy>=1.21.0
pandas>=1.3.0
matplotlib>=3.4.0
seaborn>=0.11.0
scikit-learn>=1.0.0
jupyter>=1.0.0
xgboost>=1.5.0      # lesson 08
lightgbm>=3.3.0     # lesson 08
```

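One way to confirm that an environment meets these minimums is a small script built on the standard library's `importlib.metadata`. This is only a sketch: the helper names `as_tuple`, `at_least`, and `check` are hypothetical, and the version comparison is deliberately simplified (it looks only at leading numeric components, so pre-release tags are ignored). A stricter check could use `packaging.version` instead.

```python
# Illustrative sketch only; helper names are hypothetical.
from importlib import metadata

# Minimum versions copied from the list above.
REQUIRED = {
    "numpy": "1.21.0",
    "pandas": "1.3.0",
    "matplotlib": "3.4.0",
    "seaborn": "0.11.0",
    "scikit-learn": "1.0.0",
    "jupyter": "1.0.0",
    "xgboost": "1.5.0",    # lesson 08
    "lightgbm": "3.3.0",   # lesson 08
}


def as_tuple(version):
    """Turn '1.21.0' into (1, 21, 0), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def at_least(installed, minimum):
    """Simplified comparison: pad with zeros, then compare component-wise."""
    a, b = as_tuple(installed), as_tuple(minimum)
    width = max(len(a), len(b))
    a = a + (0,) * (width - len(a))
    b = b + (0,) * (width - len(b))
    return a >= b


def check(required):
    """Return a {package: status} report for the given minimum versions."""
    report = {}
    for name, minimum in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if at_least(installed, minimum):
            report[name] = f"ok ({installed})"
        else:
            report[name] = f"too old ({installed} < {minimum})"
    return report


if __name__ == "__main__":
    for name, status in check(REQUIRED).items():
        print(f"{name:<14}{status}")
```
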
## References

- [scikit-learn official documentation](https://scikit-learn.org/stable/)
- [Kaggle](https://www.kaggle.com/)
- [Machine Learning Mastery](https://machinelearningmastery.com/)