Introduction to Weka's Select Attributes Algorithms

The attribute evaluator and the search method are specified on the Select attributes tab.


  • Attribute selection usually searches the space of attribute subsets and evaluates each subset, which is achieved by combining an attribute subset evaluator with a search method.
  • A faster but less accurate approach evaluates attributes individually, sorts them, and discards those that fall below a specified cutoff point, which is achieved by combining a single-attribute evaluator with an attribute ranking method.

1, Attribute Subset Evaluator

An attribute subset evaluator takes a subset of attributes and returns a metric value that guides the search.

The CfsSubsetEval evaluator assesses the predictive power of each attribute and the redundancy among the attributes, and tends to select attributes that are highly correlated with the class attribute but have low correlation with each other. An option iteratively adds the attribute most correlated with the class, as long as the subset does not already contain an attribute that is more strongly correlated with the attribute in question.
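
The trade-off between class correlation and mutual redundancy can be illustrated with the merit heuristic commonly given for correlation-based feature selection. The sketch below is plain Python, not Weka's code; the function name `cfs_merit` and the correlation values are made up for illustration, and how the correlations themselves are measured is left out.

```python
from math import sqrt

def cfs_merit(class_corr, pair_corr):
    """Merit of a subset: k * mean(r_cf) / sqrt(k + k*(k-1) * mean(r_ff)).

    class_corr: correlation of each attribute in the subset with the class.
    pair_corr:  correlation of each pair of attributes in the subset.
    """
    k = len(class_corr)
    if k == 0:
        return 0.0
    mean_cf = sum(class_corr) / k
    mean_ff = sum(pair_corr) / len(pair_corr) if pair_corr else 0.0
    return k * mean_cf / sqrt(k + k * (k - 1) * mean_ff)

# Hypothetical values: both subsets hold two attributes that are equally
# correlated with the class (0.6), but differ in how redundant they are.
redundant = cfs_merit([0.6, 0.6], [0.9])      # attributes highly inter-correlated
complementary = cfs_merit([0.6, 0.6], [0.1])  # attributes nearly independent
print(redundant, complementary)
```

The subset whose attributes are less correlated with each other receives the higher merit, which is the behaviour described above.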

WrapperSubsetEval is a wrapper method that uses a classifier to evaluate attribute subsets, using cross-validation on each subset to estimate the accuracy of the learning scheme.
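
The wrapper idea can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: a 1-nearest-neighbour rule stands in for whatever classifier would be chosen in Weka, folds are assigned by index modulo the fold count (no shuffling or stratification), and the names `one_nn_predict` and `wrapper_score` as well as the toy data are hypothetical.

```python
def one_nn_predict(train_X, train_y, x):
    """Predict the label of x with a 1-nearest-neighbour rule (squared Euclidean distance)."""
    best = min(range(len(train_X)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))
    return train_y[best]

def wrapper_score(X, y, subset, folds=3):
    """Cross-validated accuracy of 1-NN using only the attribute indices in `subset`."""
    if not subset:
        return 0.0
    # Project every instance onto the candidate attribute subset.
    proj = [[row[j] for j in subset] for row in X]
    correct = 0
    for f in range(folds):
        test_idx = [i for i in range(len(X)) if i % folds == f]
        train_idx = [i for i in range(len(X)) if i % folds != f]
        train_X = [proj[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct += sum(one_nn_predict(train_X, train_y, proj[i]) == y[i]
                       for i in test_idx)
    return correct / len(X)

# Toy data: attribute 0 separates the two classes, attribute 1 is noise.
X = [[0.1, 5.0], [0.2, 1.0], [0.3, 9.0], [0.9, 5.1], [1.0, 1.2], [1.1, 8.8]]
y = [0, 0, 0, 1, 1, 1]
print(wrapper_score(X, y, [0]), wrapper_score(X, y, [1]))
```

Because a classifier is trained and tested once per fold for every candidate subset, this kind of evaluation is considerably more expensive than a correlation-based score.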

2, Single Attribute Evaluator

InfoGainAttributeEval evaluates an attribute by measuring its information gain with respect to the class. It first discretizes numeric attributes using an MDL-based (minimum description length) discretization method (it can also be set to binarize them instead). Attributes can also be evaluated by measuring their gain ratio with respect to the class.
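
As a rough illustration of the two measures, the following plain-Python sketch computes information gain and gain ratio for a nominal attribute. It assumes the attribute has already been discretized, does not reproduce the MDL-based discretization step, and uses hypothetical function names and toy data.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of nominal values."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def info_gain(values, labels):
    """H(class) - H(class | attribute) for a nominal attribute."""
    n = len(labels)
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(v, []).append(lab)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def gain_ratio(values, labels):
    """Information gain divided by the entropy of the attribute itself."""
    split_info = entropy(values)
    return info_gain(values, labels) / split_info if split_info > 0 else 0.0

# Toy data: `outlook` determines the class completely, `windy` tells nothing.
play = ["no", "no", "yes", "yes"]
outlook = ["sunny", "sunny", "rainy", "rainy"]
windy = ["true", "false", "true", "false"]
print(info_gain(outlook, play), info_gain(windy, play), gain_ratio(outlook, play))
```

Gain ratio normalizes the gain by the attribute's own entropy, which reduces the advantage that attributes with many distinct values would otherwise have.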

3, Search Method

The search method traverses the attribute space to find a good subset, measuring quality with the selected attribute subset evaluator.

BestFirst performs greedy hill climbing with backtracking. It can search forward from the empty attribute set or backward from the full set. It can also start from an intermediate point (specified by a list of attribute indices) and search in both directions, considering all possible additions and deletions of individual attributes.

GreedyStepwise greedily searches the space of attribute subsets. Like BestFirst, it can search forward or backward. However, it does not backtrack: it terminates as soon as adding or deleting the best remaining attribute causes the evaluation metric to decrease.
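
The forward variant of such a greedy search, without backtracking, might look like the sketch below. It is not Weka's implementation: `greedy_forward` and `toy_score` are hypothetical names, and the scoring function is a stand-in for a real subset evaluator.

```python
def greedy_forward(n_attributes, evaluate):
    """Forward greedy search: keep adding the best attribute until the score drops."""
    selected, best_score = [], evaluate([])
    while len(selected) < n_attributes:
        candidates = [a for a in range(n_attributes) if a not in selected]
        # Score every one-attribute extension and keep the best one.
        score, attr = max((evaluate(selected + [a]), a) for a in candidates)
        if score < best_score:  # no backtracking: stop at the first decrease
            break
        selected.append(attr)
        best_score = score
    return selected, best_score

def toy_score(subset):
    """Hypothetical evaluator: attributes 0 and 2 help, all others hurt."""
    useful = {0, 2}
    return sum(1.0 if a in useful else -0.5 for a in subset)

selected, score = greedy_forward(4, toy_score)
print(sorted(selected), score)
```

A backward search would start from the full set and remove one attribute per step under the same stopping rule.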

Ranker is not actually a method for searching for attribute subsets, but a method for ranking individual attributes. It sorts attributes by their individual evaluations, so it can only be used with a single-attribute evaluator, not with an attribute subset evaluator.
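
Ranking with a cutoff can be sketched as follows. The function `rank_attributes`, its parameter names, the attribute names, and the scores are all hypothetical; in practice the scores would come from a single-attribute evaluator.

```python
def rank_attributes(scores, threshold=None, num_to_select=None):
    """Sort attributes by score (best first), then apply optional cutoffs."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    if threshold is not None:
        # Discard attributes whose score falls below the cutoff point.
        ranked = [(name, s) for name, s in ranked if s >= threshold]
    if num_to_select is not None:
        ranked = ranked[:num_to_select]
    return ranked

# Hypothetical per-attribute scores.
scores = {"a1": 0.03, "a2": 0.25, "a3": 0.15, "a4": 0.05}
print(rank_attributes(scores, threshold=0.1))
```

Since each attribute is scored on its own, redundancy between attributes is not taken into account, which is one reason this approach is faster but less accurate than subset search.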