From Dictionary of Visual Words to Subspaces: Locality-constrained Affine Subspace Coding
Peihua Li, Xiaoxiao Lu, Qilong Wang
School of Information and Communication Engineering, Dalian University of Technology
peihuali@dlut.edu.cn, shaw,qlwang@mail.dlut.edu.cn
Abstract: Locality-constrained linear coding (LLC) is a very
successful feature coding method in image classification. It
demonstrates the importance of the locality constraint, which
brings high efficiency and local smoothness of the codes.
However, in the LLC method the geometry of feature space
is described by an ensemble of representative points (visual
words) while discarding the geometric structure immediately
surrounding them. Such a dictionary only provides a
crude, piecewise constant approximation of the data manifold.
To address this problem, we propose a novel feature
coding method called locality-constrained affine subspace
coding (LASC). The data manifold in LASC is characterized
by an ensemble of subspaces attached to the representative
points (or affine subspaces), which can provide a piecewise
linear approximation of the manifold. Given an input descriptor,
we find its top-k neighboring subspaces, in which
the descriptor is linearly decomposed and weighted to form
the first-order LASC vector. Inspired by the success of
higher-order information in image classification, we propose
the second-order LASC vector based on the Fisher information
metric for further performance improvement. Experiments on
challenging benchmarks show that the LASC method is very
competitive.
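The first-order coding step described in the abstract (find the top-k neighboring subspaces, decompose the descriptor in each, weight the result) can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation. It assumes each affine subspace is a cluster mean plus a PCA basis, that proximity is the Euclidean distance to the subspace centre, and that weights come from a softmax over those distances. The function names and the `beta` parameter are hypothetical.

```python
import numpy as np

def fit_affine_subspaces(X, labels, n_clusters, p):
    """Fit one affine subspace (centre + p-dim orthonormal basis) per cluster.

    Assumption: the dictionary is built by PCA on each cluster of descriptors;
    every cluster must contain at least p points.
    """
    mus, bases = [], []
    for c in range(n_clusters):
        Xc = X[labels == c]
        mu = Xc.mean(axis=0)
        # Rows of Vt are the principal directions of the centred cluster.
        _, _, Vt = np.linalg.svd(Xc - mu, full_matrices=False)
        mus.append(mu)
        bases.append(Vt[:p].T)  # shape (D, p), orthonormal columns
    return np.stack(mus), np.stack(bases)

def lasc_first_order(y, mus, bases, k, beta=1.0):
    """Encode descriptor y over its k nearest affine subspaces.

    Assumptions: proximity is the squared Euclidean distance to each centre,
    and the weights are a softmax of the negated distances (beta is a
    hypothetical smoothing parameter).
    """
    M, D, p = bases.shape
    d2 = np.sum((mus - y) ** 2, axis=1)          # distance to every centre
    nn = np.argsort(d2)[:k]                      # indices of the top-k subspaces
    w = np.exp(-beta * (d2[nn] - d2[nn].min()))  # shifted for numerical stability
    w /= w.sum()
    code = np.zeros(M * p)                       # blocks of unselected subspaces stay zero
    for wi, i in zip(w, nn):
        # Linear decomposition of y in subspace i, scaled by its weight.
        code[i * p:(i + 1) * p] = wi * (bases[i].T @ (y - mus[i]))
    return code

# Toy usage: three well-separated clusters in R^4, 2-dim subspaces, k = 2.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 20)
X = rng.normal(size=(60, 4)) + 10.0 * labels[:, None]
mus, bases = fit_affine_subspaces(X, labels, n_clusters=3, p=2)
code = lasc_first_order(X[0], mus, bases, k=2)
```

Only the k selected blocks of the code vector are nonzero, which is where the efficiency of the locality constraint comes from in this sketch. An image-level representation would additionally require pooling the codes of all descriptors, which is not shown here.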
Figure 1. Locality-constrained affine subspace coding (LASC). The dictionary of LLC (a) is a set of representative points (visual words); the geometric structure immediately surrounding the words is discarded, so it provides only a crude, piecewise constant approximation of the manifold. In contrast, the dictionary of LASC (b) is an ensemble of low-dimensional linear subspaces attached to the representative points (affine subspaces), which provides a piecewise linear approximation of the data manifold. For an input feature, we find its top-k nearest subspaces and perform a linear decomposition of the feature in these subspaces, weighted by the proximity measures. Beyond the linear coding, we propose to leverage the second-order information of the descriptors based on the Fisher information metric.
Paper: Peihua Li, Xiaoxiao Lu, Qilong Wang. From Dictionary of Visual Words to Subspaces: Locality-constrained Affine Subspace Coding. Int. Conf. on Computer Vision and Pattern Recognition (CVPR), 2015.