Togaware DATA MINING
Desktop Survival Guide
by Graham Williams

Algorithm

Decision tree induction algorithms are generally greedy algorithms: at each step they decide on the best question to ask of the data, and they never reconsider that choice or its alternatives later on. They are also divide-and-conquer algorithms, because each question partitions the database into smaller sets that are then handled independently. The general algorithm keeps partitioning the database until every set has the same value for the output variable.
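The greedy, divide-and-conquer procedure described above can be illustrated with a minimal sketch. It is not taken from the book: the function names (gini, best_split, build_tree, predict), the use of Gini impurity to score questions, the equality tests on categorical attributes, and the toy "play" dataset are all assumptions made for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (assumed scoring measure)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, target):
    """Greedy step: pick the one (attribute == value) question with the
    lowest weighted impurity for this subset; it is never revisited."""
    best = None
    for attr in [a for a in rows[0] if a != target]:
        for value in sorted({r[attr] for r in rows}):
            yes = [r for r in rows if r[attr] == value]
            no = [r for r in rows if r[attr] != value]
            if not yes or not no:
                continue  # this question does not partition the subset
            score = (len(yes) * gini([r[target] for r in yes])
                     + len(no) * gini([r[target] for r in no])) / len(rows)
            if best is None or score < best[0]:
                best = (score, attr, value, yes, no)
    return best

def build_tree(rows, target):
    """Divide and conquer: partition, then recurse on each smaller set."""
    labels = [r[target] for r in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1:
        return {"leaf": majority}  # all rows share one output value
    split = best_split(rows, target)
    if split is None:
        return {"leaf": majority}  # no question separates the rows
    _, attr, value, yes, no = split
    return {"attr": attr, "value": value,
            "yes": build_tree(yes, target),
            "no": build_tree(no, target)}

def predict(tree, row):
    """Follow the questions from the root down to a leaf."""
    while "leaf" not in tree:
        tree = tree["yes"] if row[tree["attr"]] == tree["value"] else tree["no"]
    return tree["leaf"]

# Hypothetical toy data, invented for this example.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
tree = build_tree(rows, "play")
```

Because this sketch stops only when a subset is pure (or cannot be split further), the resulting unpruned tree reproduces the output value of every training row whenever no two identical rows carry different labels.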

Note that pruning is a mechanism for reducing the variance of the resulting models. For large datasets, however, this reduction in variance is usually of little benefit, so unpruned trees may actually perform better.
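The text does not say which pruning method is meant. As one possible illustration, the sketch below uses reduced-error pruning, a technique chosen here as an assumption: working bottom-up, a subtree is replaced by a leaf whenever the leaf makes no more errors on held-out rows. The nested-dictionary tree layout, the "majority" field, the function names, and the toy data are all hypothetical.

```python
def predict(tree, row):
    """Follow the questions from the root down to a leaf."""
    while "leaf" not in tree:
        tree = tree["yes"] if row[tree["attr"]] == tree["value"] else tree["no"]
    return tree["leaf"]

def errors(tree, rows, target):
    """Number of rows the tree misclassifies."""
    return sum(predict(tree, r) != r[target] for r in rows)

def prune(tree, rows, target):
    """Reduced-error pruning (assumed method): prune the children first,
    then collapse this node to a leaf if that is no worse on `rows`."""
    if "leaf" in tree:
        return tree
    yes_rows = [r for r in rows if r[tree["attr"]] == tree["value"]]
    no_rows = [r for r in rows if r[tree["attr"]] != tree["value"]]
    pruned = dict(tree,
                  yes=prune(tree["yes"], yes_rows, target),
                  no=prune(tree["no"], no_rows, target))
    leaf = {"leaf": tree["majority"]}  # majority class stored at training time
    if errors(leaf, rows, target) <= errors(pruned, rows, target):
        return leaf
    return pruned

# Hypothetical unpruned tree; each internal node records its majority class.
unpruned = {
    "attr": "outlook", "value": "rainy", "majority": "no",
    "yes": {"leaf": "no"},
    "no": {"attr": "windy", "value": "no", "majority": "yes",
           "yes": {"leaf": "yes"},
           "no": {"leaf": "no"}},
}

# Hypothetical held-out rows, not used to build the tree.
validation = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
]
pruned = prune(unpruned, validation, "play")
```

In this toy case the question on "windy" is collapsed into a single leaf because it does not help on the held-out rows, while the root question on "outlook" is kept. With a large training set the deeper questions are more likely to be supported by the data, which is consistent with the remark that unpruned trees may do better there.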



Copyright © 2004-2006 [email protected]