DATA MINING
Desktop Survival Guide
by Graham Williams

Two Class Models

This chapter focuses on the common data mining task of binary (or two class) classification: distinguishing between two classes of entities. Examples include high risk and low risk insurance clients, productive and unproductive audits, responsive and non-responsive customers, and successful and unsuccessful security breaches.

Rattle provides a straightforward interface to the collection of model builders commonly used in data mining. For each, a basic set of the commonly used tuning parameters is exposed through the interface for fine tuning model performance. Where possible, Rattle presents good default values so that the user can build a model with little or no tuning. This may not always be the right approach, but it is certainly a good place to start.

The model builders provided by Rattle are: Decision Trees, Boosted Decision Trees, Random Forests, Support Vector Machines, and Logistic Regression.
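
To make the two class task concrete, here is a minimal sketch of one of the listed model types, logistic regression, fitted by plain gradient descent. It is written in Python using only the standard library and is not Rattle's implementation (Rattle is an R package). The feature names, the toy data, and the helper functions fit_logistic and predict are all invented for illustration.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(X, y, lr=0.1, epochs=3000):
    """Fit weights and a bias by batch gradient descent on the mean log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this observation.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, x, threshold=0.5):
    """Return class 1 if the estimated probability reaches the threshold, else 0."""
    p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
    return 1 if p >= threshold else 0

# Synthetic toy data, invented for illustration only.
# Each row is [claims_per_year, years_as_client].
X = [[0.0, 9.0], [1.0, 8.0], [0.0, 6.0], [1.0, 7.0],
     [4.0, 1.0], [5.0, 2.0], [3.0, 1.0], [4.0, 3.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]  # 0 = low risk, 1 = high risk

w, b = fit_logistic(X, y)
predictions = [predict(w, b, x) for x in X]
print(predictions)
```

The learning rate, the number of epochs, and the 0.5 threshold play the role of tuning parameters here: the defaults are enough for this small, cleanly separated example, but real data would usually call for a held-out test set and some adjustment.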


Brought to you by Togaware.