Document Type
Article
Publication Date
6-2020
Abstract
Human visual perception shows good consistency for many multi-label image classification tasks under certain spatial transforms, such as scaling, rotation, flipping and translation. This has motivated the data augmentation strategy widely used in CNN classifier training: transformed images are included for training under the assumption that they share the class labels of their original images. In this paper, we further propose the assumption of perceptual consistency of visual attention regions for classification under such transforms, i.e., the attention region for a classification follows the same transform when the input image is spatially transformed. While the attention regions of CNN classifiers can be derived as attention heatmaps from the middle layers of the network, we find that their consistency under many transforms is not preserved. To address this problem, we propose a two-branch network that takes an original image and its transformed version as inputs, and we introduce a new attention consistency loss that measures the consistency of the attention heatmaps between the two branches. This new loss is then combined with the multi-label image classification loss for network training. Experiments on three datasets verify the superiority of the proposed network, which achieves new state-of-the-art classification performance.
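The training objective described above can be illustrated with a short sketch. The following PyTorch snippet is a minimal, hypothetical rendering of the idea, not the paper's exact implementation: it assumes a model that returns both logits and attention heatmaps, uses horizontal flipping as the example transform, and uses an L2 distance for the consistency term; the names `get_heatmaps`-style outputs, `lambda_ac`, and the flip choice are illustrative assumptions.

```python
# Minimal sketch of combining a multi-label classification loss with an
# attention consistency loss between an image and its flipped copy.
# Assumption: `model(x)` returns (logits, heatmaps), where heatmaps are
# per-class spatial attention maps of shape (N, C, H, W).
import torch
import torch.nn.functional as F

def total_loss(model, images, labels, lambda_ac=1.0):
    flipped = torch.flip(images, dims=[3])  # spatial transform: flip along width

    logits1, heat1 = model(images)
    logits2, heat2 = model(flipped)

    # Multi-label classification loss on both branches (binary cross-entropy).
    cls_loss = F.binary_cross_entropy_with_logits(logits1, labels) \
             + F.binary_cross_entropy_with_logits(logits2, labels)

    # Perceptual-consistency assumption: the heatmap of the flipped image
    # should match the flipped heatmap of the original image.
    ac_loss = F.mse_loss(torch.flip(heat1, dims=[3]), heat2)

    return cls_loss + lambda_ac * ac_loss
```

Under this formulation, the consistency term penalizes attention heatmaps that fail to follow the input transform, which is the behavior the abstract reports as not being preserved in standard CNN classifiers.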
Recommended Citation
H. Guo, K. Zheng, X. Fan, H. Yu and S. Wang, "Visual Attention Consistency Under Image Transforms for Multi-Label Image Classification," 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 2019, pp. 729-739, doi: 10.1109/CVPR.2019.00082.
First Page
729
Last Page
739
Publication Title
2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI
10.1109/CVPR.2019.00082
Comments
© 2019 IEEE - All rights reserved. Original published version available at https://www.doi.org/10.1109/CVPR.2019.00082