TracIn — A Simple Method to Estimate Training Data Influence

By Google AI Blog
The quality of a machine learning (ML) model’s training data can have a significant impact on its performance.

Computing Top Influence Examples

We illustrate the utility of TracIn by first calculating the loss gradient vector for some training data and a test example for a specific classification (an image of a chameleon) and then leveraging a standard k-nearest neighbors library to retrieve the top proponents and opponents.

Figure: Most similar and dissimilar examples of embedding vectors from the penultimate layer.

Identifying Outliers with Self-Influence

Finally, we can also use TracIn to identify outliers that exhibit a high self-influence, i.e., the influence of a training point on its own prediction.
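The two uses described above, ranking training examples by their influence on a test example and scoring each training point's self-influence, can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the blog's implementation: it assumes the checkpoint-based form of TracIn (a sum over saved checkpoints of the learning rate times the dot product of the training and test loss gradients), uses a small NumPy logistic regression in place of an image classifier, and ranks scores with a brute-force sort where the text uses a k-nearest neighbors library. The names `loss_grads`, `tracin_scores`, and `self_influence` are hypothetical helpers introduced for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_grads(w, X, y):
    """Per-example gradient of the logistic loss w.r.t. w; shape (n, d)."""
    return (sigmoid(X @ w) - y)[:, None] * X

def tracin_scores(checkpoints, lrs, X_train, y_train, x_test, y_test):
    """Assumed checkpoint form: sum over checkpoints of lr * <grad_train, grad_test>."""
    scores = np.zeros(len(X_train))
    for w, lr in zip(checkpoints, lrs):
        g_test = loss_grads(w, x_test[None, :], np.array([y_test]))[0]
        scores += lr * (loss_grads(w, X_train, y_train) @ g_test)
    return scores

def self_influence(checkpoints, lrs, X_train, y_train):
    """Influence of each training point on its own loss (squared gradient norms)."""
    scores = np.zeros(len(X_train))
    for w, lr in zip(checkpoints, lrs):
        g = loss_grads(w, X_train, y_train)
        scores += lr * np.sum(g * g, axis=1)
    return scores

# Toy data (an assumption for illustration): label depends on the first feature.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = (X[:, 0] > 0).astype(float)
y[0] = 1.0 - y[0]  # flip one label so the set contains a likely outlier

# Full-batch gradient descent, saving a checkpoint every 10 steps.
w = np.zeros(3)
lr = 0.5
checkpoints, lrs = [], []
for step in range(30):
    if step % 10 == 0:
        checkpoints.append(w.copy())
        lrs.append(lr)
    w -= lr * loss_grads(w, X, y).mean(axis=0)

# Use training point 1 as a stand-in test example.
scores = tracin_scores(checkpoints, lrs, X, y, X[1], y[1])
proponents = np.argsort(-scores)[:3]  # largest positive influence
opponents = np.argsort(scores)[:3]    # most negative influence
self_inf = self_influence(checkpoints, lrs, X, y)
outlier_candidates = np.argsort(-self_inf)[:3]
```

Positive scores mark proponents (examples whose gradient steps would reduce the test loss) and negative scores mark opponents. With a large training set, the brute-force `argsort` would typically be replaced by a nearest-neighbor search over the stored gradient vectors, as the text describes. Points with high self-influence, such as the flipped label above, are candidates for inspection rather than confirmed outliers.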

 
