
MLOps & Deployment

Experiment Tracking · FastAPI · Docker · CI/CD · Cloud ML

Production Module · 2 Weeks · 5 Lessons · Prepflix AI Roadmap
Experiment Tracking: MLflow
import mlflow
import mlflow.sklearn
from xgboost import XGBClassifier
from sklearn.metrics import roc_auc_score, precision_score

mlflow.set_experiment("fraud-detection-v2")

with mlflow.start_run(run_name="xgboost-baseline"):
    # Log hyperparameters
    mlflow.log_params({"n_estimators": 200, "max_depth": 6, "lr": 0.1})

    model = XGBClassifier(n_estimators=200, max_depth=6, learning_rate=0.1)
    model.fit(X_train, y_train)

    # Log metrics
    mlflow.log_metrics({
        "auc": roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]),
        "precision": precision_score(y_test, model.predict(X_test)),
    })

    # Log the model artifact and register it
    mlflow.sklearn.log_model(model, "model", registered_model_name="FraudDetector")

    # Log a feature-importance plot (fig: a matplotlib Figure created beforehand)
    mlflow.log_figure(fig, "feature_importance.png")
MLflow vs W&B: MLflow is open source and can be self-hosted (a good fit for enterprises with data-residency requirements). W&B is cloud-hosted with polished dashboards and collaboration features, which suits research teams. Both support the same core workflow: log params, log metrics, log artifacts.
Model Serving: FastAPI
from fastapi import FastAPI
from pydantic import BaseModel
import joblib
import numpy as np

app = FastAPI(title="Fraud Detection API", version="1.0")

# Load model once at startup, not per request
model = joblib.load("model.pkl")

class PredictRequest(BaseModel):
    amount: float
    merchant_category: str
    hour_of_day: int
    is_international: bool

class PredictResponse(BaseModel):
    fraud_probability: float
    is_fraud: bool

@app.post("/predict", response_model=PredictResponse)
async def predict(req: PredictRequest):
    # NOTE: merchant_category would need encoding before the model could use it
    features = np.array([[req.amount, req.hour_of_day, int(req.is_international)]])
    prob = model.predict_proba(features)[0, 1]
    return PredictResponse(fraud_probability=float(prob), is_fraud=prob > 0.5)

@app.get("/health")
async def health():
    return {"status": "ok"}

# Run: uvicorn main:app --host 0.0.0.0 --port 8000 --workers 4
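The Pydantic models above are what give the API free input validation. A quick local sanity check of the request schema, no server required (payload values here are made up for illustration):

```python
# Verify the request schema accepts valid payloads and rejects bad types.
from pydantic import BaseModel, ValidationError

class PredictRequest(BaseModel):
    amount: float
    merchant_category: str
    hour_of_day: int
    is_international: bool

# A valid payload parses cleanly
req = PredictRequest(amount=120.5, merchant_category="electronics",
                     hour_of_day=2, is_international=True)
print(req)

# Bad types are rejected before they ever reach the model
try:
    PredictRequest(amount="not-a-number", merchant_category="x",
                   hour_of_day=2, is_international=True)
except ValidationError as e:
    print(f"rejected with {len(e.errors())} validation error(s)")
```

FastAPI returns these validation failures to clients automatically as 422 responses, so the model never sees malformed input.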
Docker for ML

Dockerfile for ML Service

# Slim base image keeps the final image small
FROM python:3.11-slim AS base

WORKDIR /app

# Install dependencies first so this layer is cached across code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy app code
COPY . .

# Non-root user for security
RUN useradd --create-home appuser
USER appuser

EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

Essential Docker Commands

# Build image
docker build -t fraud-api:v1.0 .

# Run container
docker run -d -p 8000:8000 --name fraud-api fraud-api:v1.0

# GPU support (ML training)
docker run --gpus all -it pytorch/pytorch:latest

# Docker Compose for multi-service setups
docker-compose up -d

# Push to a registry
docker tag fraud-api:v1.0 myregistry/fraud-api:v1.0
docker push myregistry/fraud-api:v1.0
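The `docker-compose up -d` command assumes a compose file in the project root. A minimal hypothetical `docker-compose.yml` for this API (the `redis` service is an illustrative addition, e.g. for a feature cache):

```yaml
services:
  fraud-api:
    build: .
    ports:
      - "8000:8000"
    restart: unless-stopped
  redis:
    image: redis:7-alpine
```

With this in place, `docker-compose up -d` builds and starts both containers on a shared network where the API can reach the cache at hostname `redis`.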
.dockerignore: Always exclude __pycache__, .git, *.pyc, data/, and model weights from the build context to keep builds fast and images small.
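A starting .dockerignore along those lines (adjust paths to your repo; if you exclude weights from the image, the service must mount or download them at startup instead):

```text
__pycache__/
*.pyc
.git/
.venv/
data/
*.pkl
```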
CI/CD for ML with GitHub Actions
# .github/workflows/ml-pipeline.yml
name: ML Pipeline

on:
  push:
    branches: [main]

jobs:
  test-and-deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Python
        uses: actions/setup-python@v5
        with: { python-version: '3.11' }
      - name: Install deps
        run: pip install -r requirements.txt
      - name: Run tests
        run: pytest tests/ -v --cov=src
      - name: Train model (if data changed)
        run: python train.py --output models/model.pkl
      - name: Evaluate model
        run: python evaluate.py --min-auc 0.85  # gate on performance!
      - name: Build Docker image
        run: docker build -t fraud-api:${{ github.sha }} .
      - name: Deploy to AWS ECS
        run: |
          aws ecs update-service --cluster prod --service fraud-api \
            --force-new-deployment
Model Gate: Always add a performance threshold check in CI (e.g., AUC ≥ 0.85). This prevents deploying a worse model after a code change accidentally breaks feature engineering.
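The gate works because a non-zero exit code fails the CI step. A hypothetical sketch of the core of such an `evaluate.py` (function names are illustrative; in a real script the AUC would come from scoring the fresh model on a held-out set and the threshold from an `--min-auc` argparse flag):

```python
# Sketch of a CI model gate: exit non-zero when the candidate model's
# AUC falls below the threshold, which fails the pipeline step and
# blocks the deploy job that follows it.
import sys

def passes_gate(auc: float, min_auc: float = 0.85) -> bool:
    """Return True when the model clears the performance threshold."""
    return auc >= min_auc

def gate_or_exit(auc: float, min_auc: float = 0.85) -> None:
    if not passes_gate(auc, min_auc):
        print(f"Model gate FAILED: auc={auc:.3f} < {min_auc}")
        sys.exit(1)  # non-zero exit fails the GitHub Actions step
    print(f"Model gate passed: auc={auc:.3f}")
```

Keeping the threshold in the workflow file (rather than hard-coded) makes the gate visible in code review whenever someone tries to lower it.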
Cloud ML Deployment
AWS SageMaker
  • Managed training (spot instances can cut costs by roughly 70%)
  • SageMaker Endpoints for real-time inference
  • Feature Store for ML features
  • Model Monitor for drift detection
  • Pipelines for training automation
GCP Vertex AI
  • AutoML for no-code model training
  • TPU access for large model training
  • Vertex Feature Store
  • Model Registry + Prediction endpoints
  • Pipelines with Kubeflow
Lightweight Alternatives
  • Modal: Serverless GPU inference, pay per ms
  • Hugging Face Spaces: Free model demos
  • Railway/Render: Easy Docker deployment
  • Replicate: Deploy any model with one command