Description
Teams struggle to define how to monitor their AI data and models in production. We share lessons learned about MLOps for Deep Learning and Machine Learning.
Summary
- Drawing on our work creating production visibility for teams across deep learning and machine learning use cases, we find that AI teams across verticals emphatically agree that their data and models must be monitored in production.
- “Ground truth” feedback may arrive seconds after the models run (e.g., a user clicking on a recommended ad) or only weeks later (e.g., a merchant notifying the fraud system about confirmed fraudulent transactions); see the join sketch after this list.
- Overall, anomalies in metrics computed from model outputs tell the team that something is happening (see the anomaly-detection sketch after this list).
- In our experience, such a monitoring strategy includes defining model performance metrics (e.g., precision, AUC/ROC, and others) using data available at the inference stage or later, establishing granular behavioral metrics of model outputs, tracking feature behavior individually and as a set, and collecting metadata that can help segment metric behavior (see the segmented-metrics sketch after this list).
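
As a minimal sketch of the delayed-feedback point above, the pandas snippet below joins predictions logged at inference time with ground-truth labels that arrive later. The table layout and column names (`request_id`, `score`, `labeled_at`) are hypothetical illustrations, not from the article.

```python
import pandas as pd

# Hypothetical predictions logged at inference time, keyed by a request/transaction ID.
predictions = pd.DataFrame(
    {
        "request_id": ["a1", "a2", "a3"],
        "predicted_at": pd.to_datetime(
            ["2024-01-01 10:00", "2024-01-01 10:05", "2024-01-01 10:10"], utc=True
        ),
        "score": [0.91, 0.12, 0.77],
    }
)

# Ground-truth labels that arrive later: seconds for ad clicks, weeks for fraud chargebacks.
labels = pd.DataFrame(
    {
        "request_id": ["a1", "a3"],
        "label": [1, 0],
        "labeled_at": pd.to_datetime(["2024-01-01 10:00:30", "2024-01-20 09:00"], utc=True),
    }
)

# Left join keeps predictions that are still awaiting feedback (label stays NaN).
joined = predictions.merge(labels, on="request_id", how="left")
joined["feedback_delay"] = joined["labeled_at"] - joined["predicted_at"]
print(joined[["request_id", "score", "label", "feedback_delay"]])
```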
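
A sketch of flagging anomalies in an output-based metric before any ground truth is available. The hourly aggregation, 72-hour trailing baseline, and three-sigma threshold are illustrative assumptions, not the article's method.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly mean of model output scores; no labels needed for this signal.
rng = np.random.default_rng(0)
scores = pd.Series(rng.normal(0.3, 0.02, size=168))  # one week of hourly values
scores.iloc[150:] += 0.15                             # simulated shift in model outputs

# Flag hours whose mean score deviates from the trailing baseline by more than 3 sigma.
baseline_mean = scores.rolling(window=72, min_periods=24).mean().shift(1)
baseline_std = scores.rolling(window=72, min_periods=24).std().shift(1)
z = (scores - baseline_mean) / baseline_std
anomalies = z.abs() > 3

print(f"anomalous hours: {anomalies.sum()} (first at index {anomalies.idxmax()})")
```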
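
Finally, a sketch of the metric set described in the last bullet, computed per metadata segment with pandas and scikit-learn: performance metrics (precision, ROC AUC), a behavioral metric of the outputs (positive rate, mean score), and a feature statistic, all grouped by a metadata column. The `country` and `amount` columns and the 0.5 decision threshold are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import precision_score, roc_auc_score

# Hypothetical batch of labeled inference records with a metadata column ("country")
# that lets us segment metric behavior.
rng = np.random.default_rng(1)
df = pd.DataFrame(
    {
        "score": rng.uniform(0, 1, 1000),
        "label": rng.integers(0, 2, 1000),
        "amount": rng.lognormal(3.0, 1.0, 1000),   # an input feature we also track
        "country": rng.choice(["US", "DE", "BR"], 1000),
    }
)
df["prediction"] = (df["score"] >= 0.5).astype(int)

def monitor(group: pd.DataFrame) -> pd.Series:
    """Performance, output-behavior, and feature metrics for one metadata segment."""
    return pd.Series(
        {
            "precision": precision_score(group["label"], group["prediction"], zero_division=0),
            "roc_auc": roc_auc_score(group["label"], group["score"]),
            "positive_rate": group["prediction"].mean(),  # behavioral metric of outputs
            "mean_score": group["score"].mean(),
            "amount_p50": group["amount"].median(),       # feature behavior
        }
    )

print(df.groupby("country").apply(monitor))
```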