Image Classification using Spatial Layouts derived from 3D Scene Geometry

Author : Noha Elfiky
DOI : http://wwj

Country : USA
Subject : Geometry

Keywords : Big Data Analytics, Machine Vision, Image Classification and Object Recognition Tasks, Bag of Words, Spatial Pyramids

Abstract

The Bag-of-Words (BoW) approach has been successfully applied to category-level image classification. To incorporate spatial image information into the BoW model, Spatial Pyramids (SPs) are used. However, spatial pyramids are rigid in nature and are based on pre-defined grid configurations. As a consequence, they often fail to coincide with the underlying spatial structure of images from different categories, which may negatively affect classification accuracy.
The aim of the paper is to use 3D scene geometry to steer the layout of spatial pyramids for category-level image classification (object recognition). The proposed approach builds an image representation by inferring the constituent geometrical parts of a scene. As a result, the representation retains descriptive spatial information and yields a structural description of the image.
Large-scale experiments on the Pascal VOC2007 and Caltech101 datasets show that SPs obtained by selective search outperform the standard SPs. Using 3D scene geometry to select the proper SP configuration provides an even larger improvement.
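
For readers unfamiliar with the baseline, the rigid, pre-defined grid layout of a standard spatial pyramid can be sketched as follows. This is only an illustrative sketch of the generic technique, not the method proposed in the paper: the function name, its parameters, and the choice of 1x1, 2x2 and 4x4 grids are assumptions made for the example, and the per-level weighting often used with spatial pyramids is omitted.

```python
import numpy as np

def spatial_pyramid_histogram(xs, ys, words, width, height,
                              vocab_size, levels=(1, 2, 4)):
    """Concatenate per-cell Bag-of-Words histograms over fixed grids.

    xs, ys     : keypoint coordinates in pixels
    words      : visual-word index assigned to each keypoint
    levels     : grid size per pyramid level (assumed 1x1, 2x2, 4x4)
    Hypothetical helper for illustration; no level weighting is applied.
    """
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    words = np.asarray(words, dtype=int)
    parts = []
    for g in levels:
        # Map each keypoint to a grid cell; clip so that points lying
        # on the far image border still fall inside the last cell.
        col = np.minimum((xs / width * g).astype(int), g - 1)
        row = np.minimum((ys / height * g).astype(int), g - 1)
        cell = row * g + col
        # Count visual words separately inside every cell.
        flat = cell * vocab_size + words
        parts.append(np.bincount(flat, minlength=g * g * vocab_size))
    return np.concatenate(parts).astype(float)
```

Because the cell boundaries depend only on the image size, every image is partitioned identically regardless of its content, which is the rigidity the abstract refers to.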
