Authors: Todd Eavis and Nathalie Japkowicz
Abstract: Though discrimination-based learning techniques such as Multi-Layer Perceptrons (MLPs) often obtain impressive classification accuracy, they typically assume that the underlying training sets are optimally balanced (in terms of the number of positive and negative examples). Unfortunately, this is not always the case. In this paper, we look at a recognition-based approach whose accuracy in such environments is superior to that obtained via more conventional mechanisms. At the heart of the new technique is the incorporation of a recognition component into the conventional MLP mechanism through the use of a modified autoassociator. In short, rather than being associated with an output value of 1, each positive example is fully reconstructed at the output layer; rather than being associated with an output value of 0, each negative example has its inverse derived at the output layer. The result is an autoassociator able to recognize positive examples while discriminating against negative ones, by virtue of the fact that negative cases generate larger reconstruction errors than positive ones. A simple training technique is employed to exaggerate the impact of these negative examples so that reconstruction-error boundaries can be established more easily and reliably. Preliminary testing on a seismic data set has demonstrated that the new method consistently produces very low error rates in imbalanced settings and is as accurate as standard methods in balanced ones. Our approach thus suggests a simple but more robust alternative to commonly used classification mechanisms.
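To make the mechanism concrete, the following is a minimal sketch of the modified autoassociator described above, not the authors' implementation. It assumes inputs scaled to [0, 1] and interprets "inverse" as 1 - x; the layer sizes, learning rate, and threshold are illustrative, and the paper's technique for exaggerating the impact of negative examples is omitted here.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class ModifiedAutoassociator:
    """One-hidden-layer MLP trained to reconstruct positives and to
    output the inverse (1 - x) of negatives, as sketched in the abstract."""

    def __init__(self, n_in, n_hidden, lr=0.1):
        self.W1 = rng.normal(0.0, 0.1, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.1, (n_hidden, n_in))
        self.b2 = np.zeros(n_in)
        self.lr = lr

    def forward(self, x):
        h = sigmoid(x @ self.W1 + self.b1)   # hidden encoding
        y = sigmoid(h @ self.W2 + self.b2)   # reconstruction
        return h, y

    def train_step(self, x, positive):
        # Positives target their own reconstruction; negatives target
        # their inverse, pushing their reconstruction error upward.
        target = x if positive else 1.0 - x
        h, y = self.forward(x)
        # Backpropagation of squared error through the sigmoid layers.
        d_out = (y - target) * y * (1.0 - y)
        d_hid = (d_out @ self.W2.T) * h * (1.0 - h)
        self.W2 -= self.lr * np.outer(h, d_out)
        self.b2 -= self.lr * d_out
        self.W1 -= self.lr * np.outer(x, d_hid)
        self.b1 -= self.lr * d_hid

    def reconstruction_error(self, x):
        _, y = self.forward(x)
        return np.mean((y - x) ** 2)

def classify(net, x, threshold):
    # Negatives, having been trained toward their inverse, reconstruct
    # poorly, so a large error indicates a negative example.
    return "positive" if net.reconstruction_error(x) < threshold else "negative"

# Usage sketch on toy data (values are placeholders, not the seismic set):
net = ModifiedAutoassociator(n_in=8, n_hidden=4)
X_pos = rng.random((50, 8))   # many positive examples
X_neg = rng.random((5, 8))    # few negative examples (imbalanced setting)
for _ in range(200):
    for x in X_pos:
        net.train_step(x, positive=True)
    for x in X_neg:
        net.train_step(x, positive=False)
print(classify(net, X_pos[0], threshold=0.05))
```

Under these assumptions, classification reduces to a single reconstruction-error threshold; setting that boundary reliably is what the paper's negative-example training technique is meant to ease.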