MLOps Overview

MLOps Overview

1. What is MLOps?

MLOps (Machine Learning Operations) is a practical methodology for automating and efficiently managing the development, deployment, and operation of machine learning models.

Problems with Traditional ML Projects:
- Models only work in notebooks
- Experiments cannot be reproduced
- Difficult to detect performance degradation after deployment
- Manual retraining required

Goals of MLOps:
- Automated ML pipelines
- Reproducible experiments
- Continuous model monitoring
- Automated retraining and deployment

1.1 Core Principles of MLOps

"""
Core Principles of MLOps
"""

# 1. Reproducibility
# - Must be able to get identical results with the same data and code
experiment_config = {
    "data_version": "v1.2.0",
    "code_version": "git:abc123",
    "random_seed": 42,
    "hyperparameters": {"lr": 0.001, "epochs": 100}
}

# 2. Automation
# - Minimize manual work, automate with pipelines
pipeline_stages = [
    "data_validation",
    "feature_engineering",
    "model_training",
    "model_evaluation",
    "model_deployment"
]

# 3. Monitoring
# - Continuously monitor model performance and data quality
monitoring_metrics = {
    "model_accuracy": 0.95,
    "prediction_latency_p99": "50ms",
    "data_drift_score": 0.02
}

# 4. Versioning
# - Version control for data, code, and models
versions = {
    "data": "s3://bucket/data/v2.0/",
    "model": "models/classifier_v3.2.1",
    "config": "configs/production.yaml"
}

2. DevOps vs MLOps

2.1 Traditional DevOps

Developer β†’ Write Code β†’ Build β†’ Test β†’ Deploy β†’ Monitor
                   ↑                        β”‚
                   └────────── Feedback β”€β”€β”€β”€β”˜

2.2 Additional Elements in MLOps

Data Scientist β†’ Data β†’ Feature Engineering β†’ Model Training β†’ Validation β†’ Deploy β†’ Monitor
                 ↑                                                          β”‚
                 β”‚            ← Detect Data Drift β†β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                 β”‚            ← Detect Model Performance Degradation β†β”€β”€β”€β”€β”˜
                 └─────────────── Retrain Trigger β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

2.3 Key Differences

"""
DevOps vs MLOps Differences
"""

# DevOps: Code-centric
devops_artifacts = {
    "source": "application_code",
    "build": "docker_image",
    "test": "unit_tests, integration_tests",
    "deploy": "kubernetes_manifests"
}

# MLOps: Data + Code + Model
mlops_artifacts = {
    "data": "training_data, validation_data",
    "features": "feature_definitions, feature_store",
    "code": "training_scripts, serving_code",
    "model": "model_weights, model_metadata",
    "experiments": "hyperparameters, metrics, artifacts"
}

# MLOps-specific Challenges
mlops_challenges = [
    "Data quality management",
    "Feature consistency maintenance",
    "Model version control",
    "A/B testing",
    "Drift detection",
    "Model interpretability"
]
Category DevOps MLOps
Main Artifacts Application code Data + Model + Code
Testing Unit/Integration tests + Data validation, Model validation
Deployment Unit Container/Service Model + Serving infrastructure
Monitoring System metrics + Model performance, Data drift
Rollback Deploy previous version Model version rollback + Data considerations

3. MLOps Maturity Levels

3.1 Google's MLOps Maturity Model

Level 0: Manual Process
β”œβ”€β”€ Experimentation in Jupyter notebooks
β”œβ”€β”€ Manual model deployment
β”œβ”€β”€ No monitoring
└── No retraining triggers

Level 1: ML Pipeline Automation
β”œβ”€β”€ Automated training pipeline
β”œβ”€β”€ Experiment tracking
β”œβ”€β”€ Model registry
└── Basic monitoring

Level 2: CI/CD Pipeline
β”œβ”€β”€ Automated testing (code + data + model)
β”œβ”€β”€ Continuous Training (CT)
β”œβ”€β”€ Automated redeployment
└── Complete monitoring and alerting

3.2 Characteristics by Maturity Level

"""
Implementation by MLOps Maturity Level
"""

# Level 0: Manual ML
class Level0_ManualML:
    """
    - Data scientists experiment in notebooks
    - Engineers manually deploy models
    - Infrequent deployment (quarterly/annual)
    """
    def train(self):
        # Run in notebook
        model = train_model(data)
        model.save("model.pkl")
        # Manually copy to server

# Level 1: ML Pipeline
class Level1_Pipeline:
    """
    - Automated training pipeline
    - Experiment tracking (MLflow)
    - Model registry
    """
    def train_pipeline(self):
        with mlflow.start_run():
            data = load_data()
            model = train_model(data)
            mlflow.log_metrics(evaluate(model))
            mlflow.sklearn.log_model(model, "model")

# Level 2: CI/CD/CT
class Level2_CICDCT:
    """
    - Automated testing on code changes
    - Automated retraining on data changes
    - Automated rollback on performance degradation
    """
    def continuous_training(self):
        if detect_drift() or new_data_available():
            trigger_training_pipeline()
        if model_performance_degraded():
            rollback_to_previous_version()

3.3 Maturity Checklist

# mlops_maturity_checklist.yaml
level_0:
  - Notebook-based experimentation
  - Manual model deployment
  - No documentation

level_1:
  - Automated training pipeline
  - Experiment tracking system (MLflow/W&B)
  - Model version control
  - Basic monitoring

level_2:
  - CI/CD for ML
  - Automated testing (data/model)
  - Continuous Training
  - Drift detection and alerting
  - A/B testing infrastructure
  - Feature store

4. MLOps Tool Ecosystem

4.1 Tools by Category

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                         MLOps Tool Ecosystem                         β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                                                                     β”‚
β”‚  [Experiment Tracking]  [Pipeline]           [Feature Store]       β”‚
β”‚  - MLflow               - Kubeflow           - Feast               β”‚
β”‚  - Weights & Biases     - Airflow            - Tecton              β”‚
β”‚  - Neptune              - Prefect            - Hopsworks           β”‚
β”‚  - Comet ML             - Dagster                                  β”‚
β”‚                                                                     β”‚
β”‚  [Model Registry]       [Serving]            [Monitoring]          β”‚
β”‚  - MLflow Registry      - TorchServe         - Evidently           β”‚
β”‚  - Vertex AI            - Triton             - Grafana             β”‚
β”‚  - SageMaker            - TFServing          - Prometheus          β”‚
β”‚  - Neptune              - BentoML            - WhyLabs             β”‚
β”‚                         - Seldon                                   β”‚
β”‚                                                                     β”‚
β”‚  [Data Versioning]      [Labeling]           [Infrastructure]      β”‚
β”‚  - DVC                  - Label Studio       - Kubernetes          β”‚
β”‚  - Delta Lake           - Labelbox           - Docker              β”‚
β”‚  - LakeFS               - Prodigy            - Terraform           β”‚
β”‚                                                                     β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

4.2 Major Tool Comparison

"""
MLOps Tool Comparison
"""

# Experiment tracking tools
experiment_tracking = {
    "MLflow": {
        "type": "Open source",
        "features": ["Experiment tracking", "Model registry", "Serving"],
        "deployment": "self-hosted / managed",
        "best_for": "Enterprise, customization"
    },
    "Weights & Biases": {
        "type": "Commercial (free tier available)",
        "features": ["Experiment tracking", "Hyperparameter tuning", "Artifacts"],
        "deployment": "SaaS / self-hosted",
        "best_for": "Deep learning, collaboration, visualization"
    },
    "Neptune": {
        "type": "Commercial (free tier available)",
        "features": ["Experiment tracking", "Metadata management"],
        "deployment": "SaaS",
        "best_for": "Large-scale experiment management"
    }
}

# Serving tools
serving_tools = {
    "TorchServe": {
        "supported": ["PyTorch"],
        "features": ["REST/gRPC", "Batch inference", "A/B testing"],
        "best_for": "PyTorch models"
    },
    "Triton": {
        "supported": ["PyTorch", "TensorFlow", "ONNX", "TensorRT"],
        "features": ["Multi-model", "GPU optimization", "Dynamic batching"],
        "best_for": "High-performance inference, multi-framework"
    },
    "TFServing": {
        "supported": ["TensorFlow"],
        "features": ["REST/gRPC", "Version management"],
        "best_for": "TensorFlow models"
    }
}

4.3 Cloud Managed Services

"""
Cloud MLOps Platforms
"""

cloud_platforms = {
    "AWS SageMaker": {
        "components": [
            "SageMaker Studio",      # Development environment
            "SageMaker Pipelines",   # Pipeline
            "Model Registry",         # Model management
            "SageMaker Endpoints",   # Serving
            "Model Monitor"          # Monitoring
        ]
    },
    "Google Vertex AI": {
        "components": [
            "Workbench",             # Development environment
            "Vertex Pipelines",      # Pipeline
            "Model Registry",         # Model management
            "Prediction",            # Serving
            "Model Monitoring"       # Monitoring
        ]
    },
    "Azure ML": {
        "components": [
            "Azure ML Studio",       # Development environment
            "Azure Pipelines",       # Pipeline
            "Model Registry",         # Model management
            "Managed Endpoints",     # Serving
            "Data Drift"             # Monitoring
        ]
    }
}

5. MLOps Architecture Patterns

5.1 Basic Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                        Basic MLOps Architecture                   β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                                                                  β”‚
β”‚   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”           β”‚
β”‚   β”‚  Data   │────▢│   Feature   │────▢│   Model     β”‚           β”‚
β”‚   β”‚  Lake   β”‚     β”‚   Store     β”‚     β”‚  Training   β”‚           β”‚
β”‚   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜           β”‚
β”‚                                               β”‚                  β”‚
β”‚                                               β–Ό                  β”‚
β”‚   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”          β”‚
β”‚   β”‚ Monitor │◀────│   Model     │◀────│    Model     β”‚          β”‚
β”‚   β”‚  /Alert β”‚     β”‚   Serving   β”‚     β”‚   Registry   β”‚          β”‚
β”‚   β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜          β”‚
β”‚        β”‚                                                         β”‚
β”‚        └────────── Retrain Trigger ─────────────────▢ Training  β”‚
β”‚                                                                  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

5.2 Code Example

"""
Basic MLOps Pipeline Structure
"""

from typing import Dict, Any

class MLOpsPipeline:
    """Basic MLOps Pipeline"""

    def __init__(self, config: Dict[str, Any]):
        self.config = config
        self.experiment_tracker = None
        self.model_registry = None
        self.feature_store = None

    def data_ingestion(self):
        """Data collection and validation"""
        raw_data = self.load_data(self.config["data_source"])
        validated_data = self.validate_data(raw_data)
        return validated_data

    def feature_engineering(self, data):
        """Feature engineering"""
        features = self.feature_store.get_features(
            entity_ids=data["entity_ids"],
            feature_list=self.config["features"]
        )
        return features

    def train(self, features):
        """Model training"""
        with self.experiment_tracker.start_run():
            model = self.train_model(features)
            metrics = self.evaluate(model)
            self.experiment_tracker.log_metrics(metrics)
            return model

    def register(self, model):
        """Model registration"""
        if self.passes_quality_gate(model):
            self.model_registry.register(
                model=model,
                stage="staging"
            )

    def deploy(self, model_version: str):
        """Model deployment"""
        model = self.model_registry.load(model_version)
        self.serving_endpoint.update(model)

    def monitor(self):
        """Model monitoring"""
        if self.detect_drift():
            self.trigger_retraining()

6. Getting Started

6.1 First MLOps Project

"""
Getting Started with MLOps: Experiment Tracking with MLflow
"""

import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# MLflow setup
mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("iris-classification")

# Prepare data
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

# Start MLflow experiment
with mlflow.start_run(run_name="random-forest-v1"):
    # Log hyperparameters
    params = {"n_estimators": 100, "max_depth": 5}
    mlflow.log_params(params)

    # Train model
    model = RandomForestClassifier(**params, random_state=42)
    model.fit(X_train, y_train)

    # Evaluate and log metrics
    y_pred = model.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    mlflow.log_metric("accuracy", accuracy)

    # Save model
    mlflow.sklearn.log_model(model, "model")

    print(f"Run ID: {mlflow.active_run().info.run_id}")
    print(f"Accuracy: {accuracy:.4f}")

6.2 Local Environment Setup

# Start MLflow server
mlflow server \
    --backend-store-uri sqlite:///mlflow.db \
    --default-artifact-root ./mlruns \
    --host 0.0.0.0 \
    --port 5000

# Check in browser
# http://localhost:5000

Exercises

Exercise 1: MLOps Maturity Assessment

Evaluate your team's current ML process and determine the maturity level.

Exercise 2: Tool Selection

Choose appropriate MLOps tools for the following situations: - Small team, using PyTorch, budget constraints - Large team, multi-framework, high performance requirements


Summary

Concept Description
MLOps Automation of ML model development, deployment, and operation
DevOps vs MLOps MLOps adds data and model management
Maturity Level 0 Manual process, notebook-based
Maturity Level 1 Automated pipeline, experiment tracking
Maturity Level 2 CI/CD/CT, continuous training
Core Tools MLflow, W&B, Kubeflow, Feast, Triton

References

to navigate between lessons