Ensemble Feature Selection

The ensemble feature selection (EFS) app

This app provides EFS, an ensemble that aggregates eight different feature selection methods for binary classification tasks.

After uploading a CSV, XLS or XLSX file, set the index of the dependent variable, the threshold for the maximum proportion of missing values, the correlation threshold (only used by the Pearson and Spearman correlations), and the number of runs. You can then choose among a variety of feature selection methods.

The EFS method normalizes each feature selection score to a range of 0 to 1/n, where n is the number of conducted feature selection methods. The normalized scores are summed up to the EFS score, which ranges from 0 to 1 (higher values indicate higher importance). The method provides a ranking of the features and selects those with the highest importance.
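
The aggregation described above can be illustrated with a minimal sketch. It assumes that each method's raw scores are non-negative and are scaled by dividing by that method's maximum. The text does not say how EFS normalizes internally, so the scaling rule, the function name efs_scores and the method names below are illustrative assumptions only.

```python
def efs_scores(method_scores):
    """Combine per-method scores into one ensemble score per feature.

    method_scores maps a method name to a list of non-negative raw
    scores, one per feature (same feature order for every method).
    """
    n = len(method_scores)  # number of conducted methods
    n_features = len(next(iter(method_scores.values())))
    totals = [0.0] * n_features
    for scores in method_scores.values():
        top = max(scores)
        for i, s in enumerate(scores):
            # Assumption: scale to [0, 1] by the method's maximum,
            # then to [0, 1/n] so that the sum over methods is at most 1.
            totals[i] += (s / top) / n if top > 0 else 0.0
    return totals


# Hypothetical raw scores from two methods for three features.
raw = {
    "method_a": [2.0, 1.0, 0.0],
    "method_b": [10.0, 10.0, 5.0],
}
ensemble = efs_scores(raw)
# Feature indices ordered from most to least important.
ranking = sorted(range(len(ensemble)), key=lambda i: ensemble[i], reverse=True)
print(ensemble, ranking)
```

With n methods, each method contributes at most 1/n per feature, so a feature ranked best by every method reaches a score of 1.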

For datasets with fewer than 25 features, a barplot of the EFS scores is provided.

Contact Us

Prof. Dr. Dominik Heider

Department of Mathematics & Computer Science

Philipps-Universität Marburg

Hans-Meerwein-Straße 6

D-35032 Marburg, Germany

phone: +49 6421 28 21579

email: dominik.heider@uni-marburg.de


Please cite:

Neumann U, Genze N, Heider D: EFS: An Ensemble Feature Selection Tool implemented as R-package and Web-Application. BioData Mining 2017, 10:21.

If you encounter any problems or bugs, please contact an administrator (ursula.neumann@staff.uni-marburg.de).


What should the input look like?

The input data should be a CSV, XLS or XLSX file with dots as decimal markers and NAs for missing values. The first row contains the variable names.
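
As a rough illustration of this format, the sketch below parses a small CSV sample and reports the proportion of missing values per variable. The comma separator, the sample values and the helper name read_efs_input are assumptions made for the example; the app's own parsing may differ.

```python
import csv
import io

# Hypothetical sample: header row, dots as decimal markers, NA for missing.
sample = "outcome,age,marker\n1,63.5,0.12\n0,NA,0.40\n1,58.0,NA\n"


def read_efs_input(handle):
    """Return (variable names, rows); NA cells become None."""
    reader = csv.reader(handle)
    header = next(reader)  # first row holds the variable names
    rows = []
    for record in reader:
        rows.append([None if cell == "NA" else float(cell) for cell in record])
    return header, rows


header, rows = read_efs_input(io.StringIO(sample))

# Proportion of missing values per variable, e.g. to compare against
# the missing-value threshold chosen in the app.
missing = {
    name: sum(row[i] is None for row in rows) / len(rows)
    for i, name in enumerate(header)
}
print(missing)
```

Checking a file this way before uploading shows which variables are likely to exceed the chosen missing-value threshold.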

What will I get as output?

The EFS method provides a ranking of the variables, which consists of their names and scores. If the number of variables does not exceed 25, a barplot is provided as well.

How can I save my results?

To save the results of the EFS app, click the 'Download CSV' button. You will receive a CSV file with all calculated importances for each feature selection method. EFS importances are the sums of columns.
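
The column sums can be recomputed from the downloaded file, as in the sketch below. It assumes a layout with one row per feature selection method and one column per feature, with the method name in the first column. The actual layout of the downloaded CSV is not described here, so the sample content and names are hypothetical and should be checked against your own file.

```python
import csv
import io

# Hypothetical download: rows are methods, columns are features.
downloaded = ",feat1,feat2\nmethodA,0.5,0.25\nmethodB,0.25,0.125\n"

reader = csv.reader(io.StringIO(downloaded))
features = next(reader)[1:]  # skip the empty corner cell
sums = [0.0] * len(features)
for record in reader:
    # record[0] is the method name; the rest are its importances.
    for i, cell in enumerate(record[1:]):
        sums[i] += float(cell)

efs = dict(zip(features, sums))
print(efs)
```

For a real download, replace the io.StringIO sample with an opened file handle.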