Sensitivity-like analysis for feature selection in genetic programming

Grant Dick; ACM

doi:10.1145/3071178.3071338

Conference proceeding

Proceedings of the Genetic and Evolutionary Computation Conference, pp.401-408

GECCO '17: Genetic and Evolutionary Computation Conference

ACM Conferences

01/07/2017

DOI: https://doi.org/10.1145/3071178.3071338

Abstract

Computing methodologies -- Machine learning -- Learning paradigms -- Supervised learning

Computing methodologies -- Machine learning -- Machine learning algorithms -- Feature selection

Computing methodologies -- Machine learning -- Machine learning approaches -- Bio-inspired approaches -- Genetic programming

Computing methodologies -- Machine learning -- Machine learning approaches -- Classification and regression trees

Computing methodologies -- Modeling and simulation -- Model development and analysis -- Model verification and validation

Feature selection is an important process within machine learning problems. Through pressures imposed on models during evolution, genetic programming performs basic feature selection, and so analysis of the evolved models can provide some insights into the utility of input features. Previous work has tended towards a presence model of feature selection, where the frequency of a feature appearing within evolved models is a metric for its utility. In this paper, we identify some drawbacks with using this approach, and instead propose the integration of importance measures for feature selection that measure the influence of a feature within a model. Using sensitivity-like analysis methods inspired by importance measures used in random forest regression, we demonstrate that genetic programming introduces many features into evolved models that have little impact on a given model's behaviour, and this can mask the true importance of salient features. The paper concludes by exploring bloat control methods and adaptive terminal selection methods to influence the identification of useful features within the search performed by genetic programming, with results suggesting that a combination of adaptive terminal selection and bloat control may help to improve generalisation performance.
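The contrast the abstract draws between a presence model and an importance measure can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a permutation-style importance (the mean increase in squared error when one feature column is shuffled), in the spirit of the random forest measures the abstract mentions. The `evolved_model` function, the synthetic data, and the hand-counted `presence` array are all hypothetical.

```python
import numpy as np

def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean increase in MSE when each feature column is shuffled in turn."""
    rng = np.random.default_rng(seed)
    base_mse = np.mean((model(X) - y) ** 2)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffle only column j, breaking its link to the target.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            increases.append(np.mean((model(X_perm) - y) ** 2) - base_mse)
        importances[j] = np.mean(increases)
    return importances

# Hypothetical evolved model: x2 appears twice in the expression,
# but the two occurrences cancel, so it has no effect on the output.
def evolved_model(X):
    return 3.0 * X[:, 0] + X[:, 1] + (X[:, 2] - X[:, 2])

# Synthetic data (an assumption for illustration only).
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(200, 3))
y = 3.0 * X[:, 0] + X[:, 1]

# Presence model: how often each feature occurs in the expression above.
presence = np.array([1, 1, 2])

# Importance model: how much the prediction error grows without the feature.
importance = permutation_importance(evolved_model, X, y)

print("presence counts:      ", presence)
print("permutation importance:", importance)
```

In this toy case the presence count ranks x2 highest, while the permutation-style measure assigns it zero importance, which is the kind of masking effect the abstract describes.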

Details

Record Identifier: 9926549635901891
Title: Sensitivity-like analysis for feature selection in genetic programming
Creators: Grant Dick; ACM
Publication Details: Proceedings of the Genetic and Evolutionary Computation Conference, pp.401-408
Conference: GECCO '17: Genetic and Evolutionary Computation Conference
Academic Unit: Information Science
Publisher: ACM
Date published; e-published: 01/07/2017
Language: English
Resource Type; Subtype: Conference proceeding
