HYPERSPECTRAL DATA CLASSIFICATION USING FACTOR GRAPHS
Keywords: Hyperspectral, Classification, Training, Reference Data
Abstract. Accurate classification of hyperspectral data remains a challenging task, and new classification methods continue to be developed to meet the demands of hyperspectral data applications. The objective of this paper is to develop a new method for hyperspectral data classification that preserves desirable model properties such as transferability, generalization, and probabilistic interpretation. Although factor graphs (undirected graphical models) are not widely employed in remote sensing, they possess important properties, such as the ability to represent complex systems, that make them well suited to modelling estimation and decision-making tasks.
In this paper we present a new method for hyperspectral data classification using factor graphs. A factor graph (a bipartite graph consisting of variable and factor vertices) allows a complex function to be factorized. In the proposed model this factorization yields variables, which store the input data; latent variables, which bridge the abstract class to the data; and factors, which define the prior probabilities of the spectral features and abstract classes, the mapping of the input data to a mixture of spectral features, and the bridging of that mixture to an abstract class. The latent variables play an important role by defining a two-level mapping from the input spectral features to a class. Configuring (learning) the model on training data yields the parameter set that maps the input data to a class.
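The two-level mapping described above (input symbols to a latent mixture component, and the latent component to a class) can be illustrated with a toy model. The sketch below is a simplification and not the paper's factor graph: it assumes a directed mixture with a single latent variable per pixel, computes the class posterior by exact enumeration rather than by mean-field inference, and uses invented parameter values. The names `class_posterior`, `prior_c`, `p_z_given_c` and `p_x_given_z` are hypothetical.

```python
def class_posterior(x, prior_c, p_z_given_c, p_x_given_z):
    """Return p(c | x) for discrete evidence x (one symbol per band).

    prior_c[c]           : prior probability of class c
    p_z_given_c[c][z]    : probability of latent component z given class c
    p_x_given_z[b][z][s] : probability of symbol s in band b given component z
    """
    scores = []
    for c, pc in enumerate(prior_c):
        total = 0.0
        # Marginalize the latent component z (exact sum, feasible for toy sizes).
        for z, pz in enumerate(p_z_given_c[c]):
            likelihood = 1.0
            for b, s in enumerate(x):
                likelihood *= p_x_given_z[b][z][s]
            total += pz * likelihood
        scores.append(pc * total)
    norm = sum(scores)
    return [s / norm for s in scores]


# Toy parameters (assumed, not taken from the paper):
# 2 classes, 2 latent components, 2 bands, alphabet of size 2.
prior_c = [0.5, 0.5]
p_z_given_c = [[0.9, 0.1], [0.1, 0.9]]
p_x_given_z = [
    [[0.8, 0.2], [0.2, 0.8]],  # band 0
    [[0.8, 0.2], [0.2, 0.8]],  # band 1
]

posterior = class_posterior([0, 0], prior_c, p_z_given_c, p_x_given_z)
predicted = max(range(len(posterior)), key=posterior.__getitem__)
```

For the evidence `[0, 0]` the posterior favours class 0, because that class prefers the latent component under which symbol 0 is likely in both bands.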
The classification algorithm is as follows. Each spectral band is pre-processed separately by unsupervised clustering so that it is defined on a finite domain (alphabet), which allows the data to be represented by a multinomial distribution. The represented hyperspectral data are used as input evidence (the evidence vector is selected pixelwise) in a configured factor graph, and inference is run to obtain the posterior probability. Variational inference (mean field) yields plausible results at low computational cost. The posterior probability is calculated for each class, and comparing these probabilities yields the classification. Because the factor graph operates on input data represented on an alphabet (that is, data converted to a multinomial distribution), the number of training samples can be relatively low.
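The per-band pre-processing step can be sketched as follows. The abstract does not name the clustering algorithm, so this example assumes a plain one-dimensional k-means (Lloyd's algorithm) per band; the function names `kmeans_1d`, `nearest` and `discretize`, the alphabet size `k`, and the pixel values are all illustrative assumptions.

```python
import random


def nearest(value, centres):
    """Index of the centre closest to value."""
    return min(range(len(centres)), key=lambda j: abs(value - centres[j]))


def kmeans_1d(values, k, iters=50, seed=0):
    """Lloyd's algorithm on scalar values; returns k centres in ascending order."""
    distinct = sorted(set(values))
    if k > len(distinct):
        raise ValueError("k exceeds the number of distinct values in the band")
    rng = random.Random(seed)
    centres = sorted(rng.sample(distinct, k))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            groups[nearest(v, centres)].append(v)
        # Empty clusters keep their previous centre.
        updated = sorted(
            sum(g) / len(g) if g else centres[j] for j, g in enumerate(groups)
        )
        if updated == centres:
            break
        centres = updated
    return centres


def discretize(pixels, k):
    """Map each band of each pixel to a symbol in {0, ..., k-1}.

    Every band is clustered independently, so symbol 0 always denotes the
    cluster with the lowest centre in that band.
    """
    n_bands = len(pixels[0])
    symbols = [[0] * n_bands for _ in pixels]
    for b in range(n_bands):
        band = [p[b] for p in pixels]
        centres = kmeans_1d(band, k)
        for i, v in enumerate(band):
            symbols[i][b] = nearest(v, centres)
    return symbols


# Four toy pixels with two bands each (values invented for illustration).
pixels = [[0.10, 5.0], [0.12, 5.1], [0.90, 1.0], [0.95, 1.2]]
symbols = discretize(pixels, k=2)
```

Each row of `symbols` is the pixelwise evidence vector on the alphabet; in the method described above, such a vector would be passed to the configured factor graph.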
Assessment on the Salinas hyperspectral benchmark data set showed competitive classification accuracy. With training data consisting of 20 randomly selected points per class, the method achieved an overall classification accuracy of 85.32% and a Kappa coefficient of 0.8358. Representing the input data on a finite domain avoids the curse of dimensionality, which allows large hyperspectral data sets with a moderately high number of bands to be used.
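For readers unfamiliar with the two reported measures, the sketch below shows how overall accuracy and Cohen's Kappa are conventionally computed from a confusion matrix. The matrix is a made-up two-class example, not the Salinas result, and the function name `overall_accuracy_and_kappa` is hypothetical.

```python
def overall_accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j] counts samples of reference class i assigned to class j.
    Kappa is undefined when the chance agreement equals 1.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Invented confusion matrix for illustration only.
confusion = [[45, 5], [10, 40]]
accuracy, kappa = overall_accuracy_and_kappa(confusion)
```

Kappa corrects the observed agreement for the agreement expected by chance, which is why it is lower than the overall accuracy whenever the chance agreement is non-zero and the classification is imperfect.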