Playing possum with knowledge discovery: inducing population density models from spatially referenced ecological data
Aldridge, Colin H
The author’s rough-set-based knowledge induction methodology (Aldridge 1998) and the C4.5 decision tree algorithm (Quinlan 1993) are applied to spatially referenced ecological data to develop models that make use of spatial relationships. The data set records the habitat characteristics and the population distribution of a species of Australian arboreal marsupial, Petauroides volans, commonly known as the greater glider possum. The study area is 1600 hectares of south-eastern Australian coastal ranges and tableland. The ability of the possums to glide between trees suggested that identifying spatial relationships between habitat variables might be important when predicting local population density. A variable-sized “context window” is used in the search for spatial relationships. Other researchers have already studied the greater glider data set, using several machine learning methods to generate both spatial and non-spatial models. The results of these earlier studies are used as benchmarks against which to compare the results of the present work. The models developed using the rough-set-based algorithms are found to be statistically indistinguishable, in terms of classification accuracy, from the best of the models reported in the literature.
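The abstract does not say how the “context window” is defined, so the following is only an illustrative sketch, not the author's method. It assumes that the habitat variable is stored as a rectangular grid of cells, that the window is a square of side 2 * radius + 1 centred on each cell (clipped at the grid edges), and that the window is summarised by its mean. The function name window_mean and the radius parameter are hypothetical.

```python
def window_mean(grid, radius):
    """Return, for every cell, the mean of a habitat variable over a
    square context window of the given radius (assumed window shape).

    grid   -- list of equal-length rows of numeric values
    radius -- 0 gives the cell itself; 1 gives a 3x3 window, and so on
    """
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(rows):
        row_out = []
        for c in range(cols):
            # Collect the cells inside the window, clipped at the grid edges.
            cells = [
                grid[i][j]
                for i in range(max(0, r - radius), min(rows, r + radius + 1))
                for j in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            row_out.append(sum(cells) / len(cells))
        out.append(row_out)
    return out


# Hypothetical 3x3 grid of a habitat variable (made-up values).
habitat = [[0, 1, 2],
           [3, 4, 5],
           [6, 7, 8]]

# Varying the radius yields one derived spatial attribute per window size,
# which could then be offered to a rule or tree learner alongside the
# non-spatial attributes.
features = {radius: window_mean(habitat, radius) for radius in (0, 1)}
```

Under these assumptions, a larger radius trades local detail for more neighbourhood context, which is the kind of choice a variable-sized window leaves open to the search.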
Conference: 15th Annual Colloquium of the Spatial Information Research Centre (SIRC 2003: Land, Place and Space), Dunedin, New Zealand
Keywords: rough set theory; C5.0; geographic knowledge; GIS; data mining
Research Type: Conference or Workshop Item (Paper)