Machine Learning – Flow Anomaly Detection

Runtime Inference: model_engine.py  |  Used by: ryu_project.py (after MUD pre-check)  |  Retraining: retrain_from_logs.py

1) Overview

The ML engine classifies flows as benign or malicious in real time. It is invoked only for flows that are not outright denied by the MUD baseline. Decisions are fused with MUD verdicts and a PageRank-based trust score (see Architecture) before programming OpenFlow rules.

2) Data & Labeling

Training data is produced from the controller's logs and the test harness:

2.1 Log schema (per row)

# flows_log.csv (union of multiple files allowed)
timestamp, device_id, src_ip, dst_ip, proto, src_port, dst_port, bytes, pkts, duration,
inter_arrival_mean, inter_arrival_std, conn_attempts_window, port_rarity,
mud_verdict, ml_label, ml_score, trust_score, final_decision, ground_truth_label
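A minimal loader for this schema might look like the following sketch; the type map, helper names, and sample row are illustrative, not taken from the project:

```python
import csv
import io

# Column order from the flows_log.csv schema above.
COLUMNS = [
    "timestamp", "device_id", "src_ip", "dst_ip", "proto", "src_port",
    "dst_port", "bytes", "pkts", "duration", "inter_arrival_mean",
    "inter_arrival_std", "conn_attempts_window", "port_rarity",
    "mud_verdict", "ml_label", "ml_score", "trust_score",
    "final_decision", "ground_truth_label",
]

# Fields cast to numbers; everything else stays a string.
NUMERIC = {
    "proto": int, "src_port": int, "dst_port": int, "bytes": int,
    "pkts": int, "conn_attempts_window": int, "duration": float,
    "inter_arrival_mean": float, "inter_arrival_std": float,
    "port_rarity": float, "ml_score": float, "trust_score": float,
}

def load_flow_rows(fileobj):
    """Yield one typed dict per log row."""
    for raw in csv.DictReader(fileobj, fieldnames=COLUMNS):
        yield {k: NUMERIC.get(k, str)(v) for k, v in raw.items()}

# Illustrative single row (values mirror the section 4 example payload).
sample = ("1700000000,cam-01,10.0.0.5,8.8.8.8,6,51524,443,9216,8,1.2,"
          "0.18,0.05,1,0.02,ALLOW,benign,0.08,0.83,ALLOW,benign\n")
rows = list(load_flow_rows(io.StringIO(sample)))
```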

3) Features

Features are computed in ryu_project.py before inference:

Preprocessing: type casts, missing-value imputation, and scaling where relevant; categorical encodings are kept inside the model pipeline.
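As an example of the derived features, the bpp (bytes per packet) and pps (packets per second) fields seen in the section 4 payload can be computed from the raw counters; the zero guards and fallback values here are assumptions, not the controller's documented behaviour:

```python
def derive_features(flow):
    """Add derived ratios: bytes-per-packet (bpp) and packets-per-second (pps).

    `flow` needs at least `bytes`, `pkts`, and `duration`; guards avoid
    division by zero for empty or zero-duration flows.
    """
    feats = dict(flow)
    feats["bpp"] = flow["bytes"] / flow["pkts"] if flow["pkts"] else 0.0
    feats["pps"] = flow["pkts"] / flow["duration"] if flow["duration"] else 0.0
    return feats
```

With the section 4 values (9216 bytes, 8 packets, 1.2 s) this yields bpp = 1152.0 and pps ≈ 6.7, matching the example payload.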

4) Online Inference API (model_engine.py)

from model_engine import classify_flow

features = {
    "proto": 6, "src_port": 51524, "dst_port": 443,
    "pkt_len": 1180, "bytes": 9216, "pkts": 8, "duration": 1.2,
    "bpp": 1152.0, "pps": 6.7,
    "inter_arrival_mean": 0.18, "inter_arrival_std": 0.05,
    "port_rarity": 0.02, "conn_attempts_window": 1
}

label, score = classify_flow(features)  # e.g., ("benign", 0.08)
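model_engine.py's internals are not shown here; the sketch below is one plausible shape for classify_flow, assuming a fixed feature order and the 0.70 score threshold reported in section 6. The `model.score(vector)` interface is hypothetical, injected so the sketch runs without the pickled model:

```python
THRESHOLD = 0.70  # score threshold reported in the evaluation section

# Fixed feature order matching the example payload above (assumed).
FEATURE_ORDER = [
    "proto", "src_port", "dst_port", "pkt_len", "bytes", "pkts", "duration",
    "bpp", "pps", "inter_arrival_mean", "inter_arrival_std",
    "port_rarity", "conn_attempts_window",
]

def classify_flow(features, model=None):
    """Return (label, malicious_score).

    `model` is any object exposing a hypothetical .score(vector) method;
    with no model a neutral 0.0 score keeps the sketch runnable.
    """
    vector = [features[name] for name in FEATURE_ORDER]
    score = model.score(vector) if model is not None else 0.0
    label = "malicious" if score >= THRESHOLD else "benign"
    return label, score
```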

4.1 Controller call site (ryu_project.py, conceptual)

if mud_verdict == "DENY":
    decision = "DROP"
else:
    label, score = classify_flow(features)
    # Fuse with trust score and MUD result
    decision = fuse(mud_verdict, label, score, trust_score)
# Program the switch in either case (DROP rules must be installed too)
program_switch(decision, flow_spec)

5) Training & Retraining (retrain_from_logs.py)

The model can be retrained from accumulated CSV logs (single file or glob). The script handles loading, feature engineering, train/val/test split, class imbalance, cross-validation, and persistence.

5.1 Usage

# Train from multiple logs and export model + report
python retrain_from_logs.py

5.2 What it does
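Per the description above, the script loads the CSV logs (single file or glob), engineers features, splits the data, handles class imbalance, cross-validates, and persists the model. A dependency-free sketch of the loading and splitting steps (function names and split ratios are assumptions, not the script's actual interface):

```python
import csv
import glob
import random

def load_logs(pattern):
    """Concatenate every CSV matching `pattern` (file or glob) into one row list."""
    rows = []
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="") as fh:
            rows.extend(csv.DictReader(fh))
    return rows

def split_rows(rows, val=0.15, test=0.15, seed=42):
    """Seeded shuffle, then train/val/test slices.

    Stratification and imbalance handling are deliberately omitted here;
    the real script addresses both.
    """
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test)
    n_val = int(len(rows) * val)
    return rows[n_test + n_val:], rows[n_test:n_test + n_val], rows[:n_test]
```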

5.3 Hot-swap in production

# point the runtime to the new model (no controller restart if you reload safely)
export ML_MODEL_PATH=models/rf_model.pkl
# or set in config JSON and trigger a reload endpoint (if exposed)
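A safe in-process reload can key off the model file's modification time, picking up the path from the ML_MODEL_PATH variable shown above. ModelHolder and the injected load_fn below are illustrative names, not the project's actual API:

```python
import os

class ModelHolder:
    """Reload the model whenever the file at ML_MODEL_PATH changes on disk.

    `load_fn` stands in for the real deserializer (e.g. a pickle/joblib
    load); it is injected so the sketch stays dependency-free.
    """

    def __init__(self, load_fn, path=None):
        self.load_fn = load_fn
        self.path = path or os.environ.get("ML_MODEL_PATH", "models/rf_model.pkl")
        self._mtime = None
        self._model = None

    def get(self):
        mtime = os.path.getmtime(self.path)
        if mtime != self._mtime:  # file replaced => swap in-process, no restart
            self._model, self._mtime = self.load_fn(self.path), mtime
        return self._model
```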

6) Evaluation & Thresholds

{
  "accuracy": 0.964,
  "precision": {"benign": 0.97, "malicious": 0.95},
  "recall":    {"benign": 0.96, "malicious": 0.97},
  "f1":        {"benign": 0.96, "malicious": 0.96},
  "auc": 0.987,
  "threshold": 0.70
}
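The per-class figures in this report can be recomputed from paired label lists; a minimal stdlib sketch of one class's precision/recall/F1 (the real script may use a library report instead):

```python
def binary_report(y_true, y_pred, positive="malicious"):
    """Precision/recall/F1 for `positive` from paired label lists."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}
```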

7) Drift Detection & Retrain Policy
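The drift criterion is not spelled out here; one minimal approach, stated purely as an assumption, is to compare the recent mean ML score in a sliding window against the training-time baseline and trigger retraining when it strays too far:

```python
from collections import deque

class ScoreDriftMonitor:
    """Flag drift when the recent mean ML score strays from the training baseline.

    `window` and `tol` are illustrative defaults; a real policy might instead
    track PSI, label rates, or the FP/FN trend from the evaluation page.
    """

    def __init__(self, baseline_mean, window=500, tol=0.15):
        self.baseline = baseline_mean
        self.scores = deque(maxlen=window)
        self.tol = tol

    def observe(self, score):
        """Return True once a full window's mean deviates by more than `tol`."""
        self.scores.append(score)
        if len(self.scores) < self.scores.maxlen:
            return False  # window not yet full
        return abs(sum(self.scores) / len(self.scores) - self.baseline) > self.tol
```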

8) Decision Fusion (MUD ∧ ML ∧ Trust)

# conceptual
if MUD == "DENY":
    decision = DROP
elif ML_score >= 0.9:
    decision = DROP
elif ML_score >= 0.7 and Trust < 0.2:
    decision = QUARANTINE
elif MUD == "ALLOW" and ML_label == "benign" and Trust > 0.4:
    decision = ALLOW
else:
    decision = RATE_LIMIT
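These conceptual rules translate directly to Python; this fuse() mirrors the branch order above, most severe rule first (the signature is assumed from the call site in section 4.1):

```python
def fuse(mud_verdict, ml_label, ml_score, trust_score):
    """Fuse MUD verdict, ML output, and trust score into one decision string."""
    if mud_verdict == "DENY":
        return "DROP"          # MUD denial always wins
    if ml_score >= 0.9:
        return "DROP"          # high-confidence malicious
    if ml_score >= 0.7 and trust_score < 0.2:
        return "QUARANTINE"    # suspicious flow from an untrusted device
    if mud_verdict == "ALLOW" and ml_label == "benign" and trust_score > 0.4:
        return "ALLOW"         # all three signals agree
    return "RATE_LIMIT"        # ambiguous: degrade rather than block
```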

9) Reproducibility

10) Performance & Safety

11) Quick Commands

# 1) Retrain
python retrain_from_logs.py

# 2) Run controller (uses new model if configured)
ryu-manager ryu_project.py --observe-links

# 3) Generate mixed traffic in the Mininet topology, then inspect UI pages:
#    - Demo: live events
#    - Results & Evaluation: metrics, confusion matrix, FP/FN trend

12) See Also