Operations: The eternal punishment
The afterlife, where your model’s sins haunt you in perpetuity. This is where you discover that “deploying the model” was just the tutorial level.
Monitoring: The dashboard of lies
You’ll track drift, bias, and performance metrics on a beautiful Grafana dashboard—which everyone ignores until a VP asks “Why is the model racist?” at a board meeting. Alerts will fire at 2 a.m. for “statistically significant drift,” which turns out to be a single outlier from a botched data pipeline. You’ll mute the alerts. This is how disasters happen.
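For flavor, here is a minimal sketch of the kind of "drift detection" that produces those 2 a.m. pages, assuming a two-sample Kolmogorov–Smirnov test comparing a reference distribution against the latest production window. The array names, sizes, and threshold are illustrative, not anyone's actual pipeline:

```python
# A minimal, illustrative drift check: compare the training-time distribution
# of one feature against the last production window with a two-sample KS test.
import numpy as np
from scipy.stats import ks_2samp

def drift_alert(reference: np.ndarray, live_window: np.ndarray,
                alpha: float = 0.05) -> bool:
    """Return True when the KS test calls the shift 'statistically significant'."""
    _statistic, p_value = ks_2samp(reference, live_window)
    return p_value < alpha

# One botched batch of sentinel values (99.0 where nulls used to be) is enough
# to cross the threshold and wake someone up.
reference = np.random.normal(0, 1, 10_000)
live_window = np.concatenate([np.random.normal(0, 1, 500), np.full(50, 99.0)])
print(drift_alert(reference, live_window))  # True: significant, operationally meaningless
```

Statistically significant, operationally meaningless: exactly the alert you end up muting.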
Retraining: The snake eating its tail
“Continual learning” sounds elegant. In practice, it means cron jobs that trigger retraining pipelines, which fail silently because the feature store schema changed. The model degrades, the business panics, and you’re left explaining why “AI” can’t handle a column rename. Meanwhile, the data engineers blame “lack of documentation,” as if anyone in ML has ever documented anything.
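The fix is not more "AI"; it is the schema check the cron job never had, so the column rename fails loudly before retraining instead of silently afterwards. A minimal sketch, assuming a pandas DataFrame of features and a hard-coded set of expected columns (both illustrative stand-ins, not a real feature store API):

```python
# A minimal, illustrative guard: verify the feature store still serves the
# columns the model was trained on, and abort retraining with a clear error
# if anything was renamed or dropped.
import pandas as pd

EXPECTED_FEATURES = {"customer_age", "account_tenure_days", "avg_order_value"}

def validate_features(df: pd.DataFrame) -> None:
    missing = EXPECTED_FEATURES - set(df.columns)
    unexpected = set(df.columns) - EXPECTED_FEATURES
    if missing:
        # One renamed column should stop the run here, not degrade the model
        # quietly in production three weeks later.
        raise ValueError(f"Feature store schema changed: missing {sorted(missing)}, "
                         f"unrecognised {sorted(unexpected)}")

features = pd.DataFrame(columns=["customer_age", "tenure_days", "avg_order_value"])
validate_features(features)  # raises: 'account_tenure_days' is now 'tenure_days'
```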
The security farce
You’ll “secure the ML platform” by rotating credentials quarterly (read: when forced). The data pipelines will leak PII because someone hardcoded an S3 bucket name. The compliance team will demand “explainability,” so you’ll tack on SHAP values post-hoc and pray no one asks follow-ups.
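The post-hoc "explainability" usually amounts to something like this: point SHAP's TreeExplainer at the already-trained model and hand compliance a per-feature importance number for the slide deck. A minimal sketch on a synthetic model and data, not anyone's real platform:

```python
# A minimal, illustrative post-hoc explanation: fit SHAP's TreeExplainer to a
# model that already shipped and summarise per-feature attributions.
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

X = np.random.rand(200, 4)
y = X[:, 0] * 2 + X[:, 1] + np.random.rand(200) * 0.1

model = RandomForestRegressor(n_estimators=50).fit(X, y)  # the already-shipped model
explainer = shap.TreeExplainer(model)                     # the after-the-fact explanation
shap_values = explainer.shap_values(X)                    # shape (n_samples, n_features)

# Mean absolute SHAP value per feature: the number that goes on the slide.
print(np.abs(shap_values).mean(axis=0))
```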
The final horror: legacy
One day, you’ll realise your “cutting-edge MLOps stack” is now legacy tech. The startup that built your feature store pivoted to NFTs. The new hires refuse to touch the “old” codebase. You’ll argue for a rewrite, but the business insists “if it’s not broken, don’t fix it”—right up until it breaks spectacularly.