Distinguishing Unseen from Seen for Generalized Zero-shot Learning
Paper/Presentation Title | Distinguishing Unseen from Seen for Generalized Zero-shot Learning |
---|---|
Presentation Type | Paper |
Authors | Su, Hongzu; Li, Jingjing; Chen, Zhi; Zhu, Lei; and Lu, Ke |
Journal or Proceedings Title | Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) |
Journal Citation | pp. 7875-7884 |
Number of Pages | 10 |
Year | 2022 |
Publisher | IEEE (Institute of Electrical and Electronics Engineers) |
Place of Publication | United States |
ISBN | 9781665469463 |
Digital Object Identifier (DOI) | https://doi.org/10.1109/CVPR52688.2022.00773 |
Web Address (URL) of Paper | https://ieeexplore.ieee.org/document/9879149 |
Web Address (URL) of Conference Proceedings | https://ieeexplore.ieee.org/xpl/conhome/9878378/proceeding |
Conference/Event | 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) |
Event Details | Parent event: Computer Vision and Pattern Recognition (CVPR); Delivery: In person; Event dates: 18 to 24 Jun 2022; Event location: New Orleans, LA, United States |
Abstract | Generalized zero-shot learning (GZSL) aims to recognize samples whose categories may not have been seen at training time. Recognizing unseen classes as seen ones, or vice versa, often leads to poor performance in GZSL. Therefore, distinguishing the seen and unseen domains is a natural yet challenging solution for GZSL. In this paper, we present a novel method that leverages both visual and semantic modalities to distinguish seen from unseen categories. Specifically, our method deploys two variational autoencoders to generate latent representations for the visual and semantic modalities in a shared latent space, in which we align the latent representations of the two modalities by the Wasserstein distance and reconstruct each modality from the representation of the other. To learn a clearer boundary between seen and unseen classes, we propose a two-stage training strategy that takes advantage of seen and unseen semantic descriptions and searches for a threshold to separate seen from unseen visual samples. Finally, a seen expert and an unseen expert are used for the final classification. Extensive experiments on five widely used benchmarks verify that the proposed method significantly improves GZSL results. For instance, our method correctly recognizes more than 99% of samples when separating the domains and improves the final classification accuracy from 72.6% to 82.9% on AWA1. |
Contains Sensitive Content | Does not contain sensitive content |
ANZSRC Field of Research 2020 | 4602. Artificial intelligence |
Public Notes | The accessible file is the accepted version of the paper. Please refer to the URL for the published version. |
Byline Affiliations | University of Electronic Science and Technology of China, China; University of Queensland, Australia; Shandong Normal University, China |
https://research.usq.edu.au/item/zyx39/distinguishing-unseen-from-seen-for-generalized-zero-shot-learning
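The abstract describes a threshold search that separates seen from unseen visual samples before routing each sample to a seen or an unseen expert. The following is a minimal, hypothetical sketch of such a search, assuming each sample has already been reduced to a scalar "seen-ness" score; all names and the scoring step itself are illustrative, not the authors' implementation.

```python
def search_threshold(seen_scores, unseen_scores):
    """Pick the threshold on a scalar 'seen-ness' score that best separates
    seen-class from unseen-class samples (illustrative sketch only).

    Samples scoring >= threshold would be routed to the seen expert,
    the rest to the unseen expert.
    """
    candidates = sorted(set(seen_scores) | set(unseen_scores))
    n = len(seen_scores) + len(unseen_scores)
    best_thr, best_acc = None, -1.0
    for thr in candidates:
        # Count samples that would be routed to the correct expert.
        correct = (sum(s >= thr for s in seen_scores)
                   + sum(u < thr for u in unseen_scores))
        acc = correct / n
        if acc > best_acc:
            best_thr, best_acc = thr, acc
    return best_thr, best_acc

if __name__ == "__main__":
    # Toy scores: seen-class samples tend to score higher than unseen ones.
    seen = [0.9, 0.8, 0.75, 0.6]
    unseen = [0.4, 0.3, 0.55, 0.2]
    thr, acc = search_threshold(seen, unseen)
    print(thr, acc)  # a threshold of 0.6 separates these toy scores perfectly
```

In the paper's setting the separation is reported to be far stronger than a toy example suggests (over 99% of samples correctly routed), which is what lets the two specialized experts improve the final GZSL accuracy.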
Download files
Accepted Version
Su_Distinguishing_Unseen_From_Seen_for_Generalized_Zero-Shot_Learning_CVPR_2022_paper.pdf
File access level: Anyone