Learning Decision Tree Models in Depth in 6 Lectures
Course Overview
- Tree-based methods
- Ensembling techniques
- Resources
Decision Tree Models
A decision tree is a series of yes/no questions about the input, asked in sequence.
The model is learned greedily: at each node, it recursively picks the split that maximizes the increase in "purity".
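To make the greedy step concrete, here is a minimal sketch of how one node could choose its question on a single numeric feature. It assumes Gini impurity as the purity measure and midpoints between distinct values as candidate thresholds. The names `gini` and `best_split` and the toy data are made up for illustration and do not come from the course material.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(values, labels):
    """Greedy search for a threshold on one numeric feature.

    Returns (threshold, gain) for the question "value <= threshold?"
    with the largest decrease in weighted Gini impurity,
    or (None, 0.0) if no split improves purity.
    """
    n = len(labels)
    parent = gini(labels)
    best_threshold, best_gain = None, 0.0
    # Candidate thresholds: midpoints between consecutive distinct values.
    distinct = sorted(set(values))
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        # Impurity of the children, weighted by how many samples each receives.
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        gain = parent - weighted
        if gain > best_gain:
            best_threshold, best_gain = threshold, gain
    return best_threshold, best_gain

# Toy data (hypothetical): small values are class "a", large values class "b".
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = ["a", "a", "a", "b", "b", "b"]
threshold, gain = best_split(values, labels)
```

A full tree learner would repeat this search over every feature, split the data on the best question, and recurse into each child node.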
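When to stop splitting:
- the maximum depth is reached
- the maximum number of leaves is reached
- a leaf would contain too few data points
- the node is pure (all of its samples belong to one class)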
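The stopping rules above can be sketched as a single check that a tree builder might call before trying to split a node. This is only an illustration: the function name `should_stop`, its parameters, and the default limits are assumptions, not part of the course material.

```python
def should_stop(labels, depth, n_leaves,
                max_depth=3, max_leaves=8, min_samples_leaf=2):
    """Return True if a node should become a leaf instead of being split.

    labels   -- class labels of the samples that reached this node
    depth    -- depth of this node (the root is at depth 0)
    n_leaves -- number of leaves currently in the tree
    The default limits are arbitrary illustrative values.
    """
    if depth >= max_depth:
        return True  # maximum depth reached
    if n_leaves >= max_leaves:
        return True  # maximum number of leaves reached
    if len(labels) < 2 * min_samples_leaf:
        return True  # any split would leave a child with too few points
    if len(set(labels)) <= 1:
        return True  # node is already pure
    return False
```

For example, `should_stop(["a", "b", "a", "b"], depth=0, n_leaves=1)` returns `False` because none of the limits is hit, while the same labels at `depth=3` return `True`.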
Impurity
The key step in building a decision tree is checking the impurity of the child nodes after each question. We want high purity: ideally, each child node contains only one class after the split.
Two common choices for measuring impurity
Entropy and Gini behave very similarly in practice. Entropy requires computing a logarithm, which is slightly more expensive, so the Gini index is the one most commonly recommended.
- Entropy
- Gini
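For reference, with p_k the fraction of samples in a node that belong to class k, the standard definitions are entropy = -sum_k p_k * log2(p_k) and Gini = 1 - sum_k p_k^2. Both are 0 for a pure node and largest when the classes are evenly mixed. The following is a small sketch of the two measures, with function names chosen here for illustration.

```python
import math
from collections import Counter

def class_probabilities(labels):
    """Fraction of samples belonging to each class present in `labels`."""
    n = len(labels)
    return [count / n for count in Counter(labels).values()]

def entropy(labels):
    """Entropy in bits: -sum(p_k * log2(p_k)) over the classes present."""
    return sum(-p * math.log2(p) for p in class_probabilities(labels))

def gini(labels):
    """Gini impurity: 1 - sum(p_k ** 2) over the classes present."""
    return 1.0 - sum(p * p for p in class_probabilities(labels))

# Hypothetical nodes: one pure, one evenly mixed between two classes.
pure = ["a", "a", "a", "a"]
mixed = ["a", "a", "b", "b"]
```

On these two nodes both measures agree on the ordering: the pure node scores 0 under either one, and the evenly mixed two-class node scores the maximum (1 bit of entropy, Gini 0.5).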