Decision Tree Options


Read more about Data Workbench’s End-of-life announcement.

The Decision Tree menu includes features to set the positive case, filters, leaf distribution options, the confusion matrix, and other advanced options.

Toolbar buttons

Go: Click to run the decision tree algorithm and display the visualization. This button is grayed out until inputs are defined.
Reset: Clears the inputs and the decision tree model and resets the process.
Save: Saves the Decision Tree. You can save it in different formats:
  • Predictive Model Markup Language (PMML), an XML-based file format used by applications to describe and exchange decision tree models.
  • Text, displaying simple columns and rows of true or false values, percentages, numbers of members, and input values.
  • A Dimension with branches corresponding to the predicted outcome elements.
Options: See the table below for the Options menu.
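For orientation, a PMML file for a decision tree generally follows the structure sketched below. This is an illustrative fragment only: the field names (`visits`, `converted`) are hypothetical, and Data Workbench's actual output may differ in detail.

```xml
<PMML version="4.4" xmlns="http://www.dmg.org/PMML-4_4">
  <Header description="Decision tree sketch for illustration"/>
  <DataDictionary numberOfFields="2">
    <DataField name="visits" optype="continuous" dataType="double"/>
    <DataField name="converted" optype="categorical" dataType="string">
      <Value value="true"/>
      <Value value="false"/>
    </DataField>
  </DataDictionary>
  <TreeModel modelName="tree" functionName="classification">
    <MiningSchema>
      <MiningField name="visits"/>
      <MiningField name="converted" usageType="target"/>
    </MiningSchema>
    <!-- Root node predicts "false" unless a child predicate matches. -->
    <Node score="false">
      <True/>
      <Node score="true">
        <SimplePredicate field="visits" operator="greaterThan" value="5"/>
      </Node>
    </Node>
  </TreeModel>
</PMML>
```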
Options menu

Set Positive Case: Defines the current workspace selection as the model's Positive Case. Clears the case if no selection exists.
Set Population Filter: Defines the current workspace selection as the model's Population Filter; the model population is drawn from visitors who satisfy this condition. The default is "Everyone."
Show Complex Filter Description: Displays descriptions of the defined filters. Click to view the filtering scripts for the Positive Case and Population Filter.
Hide Nodes: Hides nodes that contain only a small percentage of the population. This menu command displays only when the decision tree is displayed.
Confusion Matrix

Click Options > Confusion Matrix to view the Accuracy, Recall, Precision, and F-Score values. The closer these values are to 100 percent, the better.

The Confusion Matrix measures the accuracy of the model using four counts formed from combinations of the following values:

  • Actual Positive (AP)
  • Predicted Positive (PP)
  • Actual Negative (AN)
  • Predicted Negative (PN)

Combining the actual and predicted outcomes yields the True Positive (TP), False Positive (FP), True Negative (TN), and False Negative (FN) counts used in the formulas below.

Tip: These numbers are obtained by applying the resulting scoring model to the 20 percent of the data withheld for testing, for which the true answers are already known. If a record's score is greater than 50 percent, it is predicted as a positive case (one that matches the defined filter). Then, Accuracy = (TP + TN)/(TP + FP + TN + FN), Recall = TP / (TP + FN), and Precision = TP / (TP + FP).
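The formulas above can be sketched as a small function. The F-Score is not defined in this document; the version below assumes the standard F1 score (the harmonic mean of precision and recall).

```python
def confusion_metrics(tp, fp, tn, fn):
    """Compute Confusion Matrix scores from the four raw counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    # Assumption: F-Score is the standard F1 (harmonic mean of precision and recall).
    f_score = 2 * precision * recall / (precision + recall)
    return accuracy, recall, precision, f_score
```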

Display Legend: Allows you to toggle a legend key on and off in the Decision Tree. This menu command displays only when the decision tree is displayed.
Advanced: Click to open the Advanced menu for in-depth use of the Decision Tree. See the table below for menu options.
Advanced menu

Training Set Size: Controls the size of the training set used to build the model. Larger sets take longer to train; smaller sets take less time.

Input Normalization: Allows you to specify whether to use the Min-Max or the Z Score technique to normalize inputs into the model.
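These two normalization techniques are standard; as a rough sketch of what each does to a list of input values (the exact implementation inside Data Workbench is not documented here):

```python
def min_max(values):
    """Min-Max normalization: rescale values linearly into the [0, 1] range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Z Score normalization: center on the mean, scale by the standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [(v - mean) / std for v in values]
```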

SMOTE Over-Sampling Factor: When the Positive Case occurs rarely (in less than 10 percent of the training sample), SMOTE is used to provide additional samples. This option allows you to indicate how many more samples to create using SMOTE.
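In general terms, SMOTE creates synthetic minority samples by interpolating between a real minority sample and one of its near neighbors. The following is a simplified sketch of that idea, not Data Workbench's actual implementation:

```python
import random

def smote(minority, factor, k=2, seed=0):
    """Create `factor` synthetic samples per minority point by interpolating
    toward a randomly chosen one of its k nearest minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for x in minority:
        # k nearest neighbors by squared Euclidean distance, excluding x itself.
        neighbors = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        for _ in range(factor):
            nb = rng.choice(neighbors)
            gap = rng.random()  # random point on the segment between x and nb
            synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic
```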
Leaf Class Distribution Threshold: Allows you to set the class-distribution threshold a node must reach to be treated as a leaf during the tree building process. By default, all members of a node must be identical for it to be a leaf (prior to the pruning stage).
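The threshold can be read as the fraction of a node's members that must belong to the majority class before splitting stops. A minimal sketch of that check (the function name and signature are illustrative, not part of the product):

```python
def is_leaf(class_counts, threshold=1.0):
    """Return True when the node's majority class reaches `threshold` of its
    members. threshold=1.0 reproduces the default: all members identical."""
    total = sum(class_counts.values())
    return max(class_counts.values()) / total >= threshold
```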
