Author: Holger Schwenk (Computer Science Department, University of Montreal, Canada)
Abstract: Autoencoders have been studied for many years from both practical and theoretical standpoints, and several applications exploiting their data-compression capabilities have been proposed in image and speech processing. In this paper we describe a classification architecture using several autoencoders. The basic idea is to use one autoencoder per class and to train it only with examples of the corresponding class. Classification of an unknown pattern is done by choosing the best-fitting model, i.e. the one with the minimal reconstruction error. This classifier is computationally very efficient since it uses a distributed representation of the models, in contrast to other model-based classifiers like k-NN or RBF networks that enumerate many typical references. Furthermore, we only need to calculate one distance measure per class to be recognized. This makes it feasible to use a complicated distance measure that incorporates a-priori knowledge about the classification task. In this paper we show that the so-called tangent distance can be used during learning and recognition to achieve transformation invariance of the models. Finally, we present a discriminant learning algorithm. This classifier has achieved state-of-the-art recognition rates on the NIST letter and handwritten digit databases. We also discuss the properties of this autoencoder-based classifier with respect to other classification architectures.
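The classification rule described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's actual networks: it stands in one linear autoencoder per class (fitted in closed form via SVD, the linear-autoencoder optimum) for the paper's trained autoencoders, uses plain squared reconstruction error rather than tangent distance, and all function names and the toy data are assumptions for illustration.

```python
import numpy as np

def fit_linear_autoencoder(X, k):
    """Fit a rank-k linear autoencoder to samples X of shape (n, d).

    The top-k right singular vectors of the centered data span the
    optimal linear bottleneck (the PCA subspace)."""
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:k]                    # mean and (k, d) encoding basis

def reconstruction_error(x, model):
    """Squared error between x and its encode/decode round trip."""
    mu, V = model
    z = V @ (x - mu)                     # encode into the k-dim bottleneck
    x_hat = mu + V.T @ z                 # decode back to input space
    return float(np.sum((x - x_hat) ** 2))

def classify(x, models):
    """One autoencoder per class: pick the class that reconstructs x best."""
    errors = [reconstruction_error(x, m) for m in models]
    return int(np.argmin(errors))

# Two toy classes concentrated near different 1-D subspaces of R^3.
rng = np.random.default_rng(0)
X0 = rng.normal(size=(200, 1)) @ np.array([[1.0, 0.0, 0.0]]) \
     + 0.05 * rng.normal(size=(200, 3))
X1 = rng.normal(size=(200, 1)) @ np.array([[0.0, 1.0, 0.0]]) \
     + 0.05 * rng.normal(size=(200, 3))

# Each model is trained only on examples of its own class.
models = [fit_linear_autoencoder(X, k=1) for X in (X0, X1)]
print(classify(np.array([2.0, 0.0, 0.0]), models))
print(classify(np.array([0.0, 2.0, 0.0]), models))
```

Note that only one reconstruction (one "distance" evaluation) is needed per class, which is what makes richer per-class distance measures, such as the tangent distance mentioned above, affordable at recognition time.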