Description
Having columns of data with high cardinality can adversely affect the performance of your models. The idea of this article stemmed from my personal experience of employing tree based solutions in…
Summary
- How high cardinality affects your CART performance and interpretation Having columns of data with high cardinality can adversely affect the performance of your models.
- By extracting all the nodes where occupation was used for splitting the node, we can see at which points data was used for splitting from the table on the right.
- 0.5, 2.5, 3.5, 5.5, 8.5, 9.5, 10.5 Mapping it to the encoding table shows the decision boundaries that has been used throughout the tree to classify the data points.
- Try different methods of encoding.