Abstract
The bivariate kernel density-ratio estimator has gained popularity among epidemiologists as a flexible exploratory tool for examining spatial variation in the risk of disease. The estimator is the quotient of a “case” density estimate, describing the observed spatial locations of the disease under scrutiny, and a “control” density estimate, constructed from the observed coordinates of a sample from the uninfected, at-risk population within some bounded geographical region. The choice of bandwidth(s) for the case and control kernel estimates naturally plays a critical role in the interpretation of the resulting risk surface estimate. Recent work has indicated that a particular bandwidth for the densities, based on the well-known maximal smoothing principle for kernel estimators, has the potential to outperform other bandwidth selectors in terms of the proximity of the estimate to the true (in practice unknown) risk surface. However, calculating this bandwidth requires specification of a rather arbitrary scalar measure of spread. In this work, we explore the possibility of using triangle dimensions for calculation of the scale, motivated by similar efforts in linear regression variance estimation. The effects on the quality of the risk surface estimate, measured by the integrated squared error, are investigated empirically via simulation, and a real-world example illustrates the use of the methods in practice.