Description
#CTO
Summary
- "Everyone wants to do the model work, not the data work": Data quality matters more in high-stakes AI because data problems carry through to downstream predictions in areas such as cancer detection, wildlife poaching, and loan allocation.
- Paradoxically, data is the most under-valued and de-glamorised aspect of AI.
- We define, identify, and present empirical evidence on Data Cascades (compounding events causing negative, downstream effects from data issues) triggered by conventional AI/ML practices that undervalue data quality.