Mitigating Generation Shifts for Generalized Zero-Shot Learning
Paper/Presentation Title | Mitigating Generation Shifts for Generalized Zero-Shot Learning |
---|---|
Presentation Type | Paper |
Authors | Chen, Zhi, Luo, Yadan, Wang, Sen, Qiu, Ruihong, Li, Jingjing and Huang, Zi |
Journal or Proceedings Title | Proceedings of the 29th ACM International Conference on Multimedia (MM ’21) |
Journal Citation | pp. 844-852 |
Number of Pages | 9 |
Year | 2021 |
Publisher | Association for Computing Machinery (ACM) |
Place of Publication | United States |
ISBN | 9781450386517 |
Digital Object Identifier (DOI) | https://doi.org/10.1145/3474085.3475258 |
Web Address (URL) of Paper | https://dl.acm.org/doi/10.1145/3474085.3475258 |
Web Address (URL) of Conference Proceedings | https://dl.acm.org/doi/proceedings/10.1145/3474085 |
Conference/Event | 29th ACM International Conference on Multimedia (MM '21) |
Event Details | 29th ACM International Conference on Multimedia (MM '21); Parent: ACM International Conference on Multimedia; Delivery: Online; Event Date: 20 to 24 Oct 2021 |
Abstract | Generalized Zero-Shot Learning (GZSL) is the task of leveraging semantic information to recognize seen and unseen samples, where unseen classes are not observable during training. It is natural to derive generative models and hallucinate training samples for unseen classes based on the knowledge learned from the seen samples. However, most of these models suffer from generation shifts, where the synthesized samples may drift from the real distribution of unseen data. In this paper, we propose a novel generative flow framework that consists of multiple conditional affine coupling layers for learning unseen data generation. In particular, we identify three potential problems that trigger generation shifts, i.e., semantic inconsistency, variance collapse, and structure disorder, and address each in turn. First, to reinforce the correlations between the generated samples and their corresponding attributes, we explicitly embed the semantic information into the transformations in each coupling layer. Second, to recover the intrinsic variance of the real unseen features, we introduce a visual perturbation strategy to diversify the generated data and thereby help adjust the decision boundary of the classifiers. Third, a relative positioning strategy is proposed to revise the attribute embeddings, guiding them to fully preserve the inter-class geometric structure and further avoid structure disorder in the semantic space. Experimental results demonstrate that the proposed framework, GSMFlow, achieves state-of-the-art performance on GZSL. |
Keywords | Generalized zero-shot learning; conditional generative flows |
Contains Sensitive Content | Does not contain sensitive content |
ANZSRC Field of Research 2020 | 4602. Artificial intelligence |
Public Notes | © 2021 Association for Computing Machinery. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in MM '21: Proceedings of the 29th ACM International Conference on Multimedia, https://doi.org/10.1145/3474085.3475258. |
Byline Affiliations | University of Queensland; University of Electronic Science and Technology of China, China |
https://research.usq.edu.au/item/zyx21/mitigating-generation-shifts-for-generalized-zero-shot-learning