Decoupled Progressive Distillation for Sequential Prediction with Interaction Dynamics
Article
Hu, Kaixi, Li, Lin, Xie, Qing, Liu, Jianquan, Tao, Xiaohui and Xu, Guandong. 2024. "Decoupled Progressive Distillation for Sequential Prediction with Interaction Dynamics." ACM Transactions on Information Systems. 42 (3), pp. 1-35. https://doi.org/10.1145/3632403
Article Title | Decoupled Progressive Distillation for Sequential Prediction with Interaction Dynamics |
---|---|
ERA Journal ID | 36115 |
Article Category | Article |
Authors | Hu, Kaixi; Li, Lin; Xie, Qing; Liu, Jianquan; Tao, Xiaohui; Xu, Guandong |
Journal Title | ACM Transactions on Information Systems |
Journal Citation | 42 (3), pp. 1-35 |
Article Number | 72 |
Number of Pages | 35 |
Year | 2024 |
ISSN | 1046-8188; 1558-2868 |
Digital Object Identifier (DOI) | https://doi.org/10.1145/3632403 |
Web Address (URL) | https://dl.acm.org/doi/10.1145/3632403 |
Abstract | Sequential prediction has great value for resource allocation due to its capability to analyze intents for the next prediction. A fundamental challenge arises from real-world interaction dynamics, where similar sequences involving multiple intents may exhibit different next items. More importantly, the large volume of candidate items in sequential prediction may amplify such dynamics, making it hard for deep networks to capture comprehensive intents. This article presents a sequential prediction framework with Decoupled Progressive Distillation (DePoD), drawing on the progressive nature of human cognition. We redefine target and non-target item distillation according to their different effects in the decoupled formulation. This is achieved through two aspects. (1) Regarding how to learn, target item distillation with progressive difficulty increases the contribution of low-confidence samples in the later training phase while emphasizing high-confidence samples in the earlier phase; non-target item distillation starts from a small subset of non-target items whose size increases according to item frequency. (2) Regarding whom to learn from, a difference evaluator progressively selects, from a cohort of peers, an expert that provides informative knowledge among items. Extensive experiments on four public datasets show that DePoD outperforms state-of-the-art methods on accuracy-based metrics. (A minimal illustrative sketch of the progressive schedules follows this record.) |
Keywords | Sequential prediction; representation learning; interaction dynamics; knowledge distillation |
Contains Sensitive Content | Does not contain sensitive content |
ANZSRC Field of Research 2020 | 460206. Knowledge representation and reasoning |
Public Notes | Files associated with this item cannot be displayed due to copyright restrictions. |
Byline Affiliations | University of Technology Sydney; Wuhan University of Technology, China; NEC Corporation, Japan; School of Mathematics, Physics and Computing |
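
The two progressive schedules described in the abstract, difficulty-based weighting for target-item distillation and a growing non-target subset, can be made concrete with a short sketch. This is a minimal illustration under assumptions, not the paper's actual formulation: the function names, the linear schedules, and the use of teacher confidence on the target item as the difficulty signal are all hypothetical choices for exposition.

```python
import torch
import torch.nn.functional as F


def progressive_target_distill_loss(student_logits, teacher_logits, targets,
                                    progress, temperature=2.0):
    """Target-item distillation with a progressive difficulty schedule.

    `progress` in [0, 1] is the fraction of training completed. Early on,
    samples where the teacher is confident about the target item dominate
    the loss; later, the weighting shifts toward low-confidence (harder)
    samples. The linear interpolation below is an illustrative choice.
    """
    with torch.no_grad():
        teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
        # Teacher confidence assigned to the ground-truth (target) item.
        conf = teacher_probs.gather(1, targets.unsqueeze(1)).squeeze(1)
        # Shift weight from high-confidence to low-confidence samples.
        weights = (1.0 - progress) * conf + progress * (1.0 - conf)

    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    # Per-sample KL divergence between teacher and student distributions.
    kl = F.kl_div(log_student, teacher_probs, reduction="none").sum(dim=-1)
    return (weights * kl).mean() * temperature ** 2


def growing_nontarget_subset(item_counts, target, progress, k_min=10):
    """Indices of non-target items to distill over at this training step.

    Starts from the `k_min` most frequent non-target items and grows the
    subset linearly with `progress` until every non-target item is
    included. A hypothetical frequency-based schedule for illustration.
    """
    num_items = item_counts.numel()
    k = int(k_min + progress * (num_items - 1 - k_min))
    order = torch.argsort(item_counts, descending=True)  # frequent items first
    order = order[order != target]                       # exclude the target
    return order[:k]


# Toy usage with hypothetical shapes: batch of 8, 1000 candidate items.
student_logits = torch.randn(8, 1000)
teacher_logits = torch.randn(8, 1000)
targets = torch.randint(0, 1000, (8,))
loss = progressive_target_distill_loss(student_logits, teacher_logits,
                                       targets, progress=0.5)
subset = growing_nontarget_subset(torch.rand(1000), target=42, progress=0.1)
```

The weighting mirrors an easy-first curriculum: early in training the student imitates the teacher where the teacher is sure of the target item, and only later is it pushed toward the ambiguous sequences where interaction dynamics matter most.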
Permalink: https://research.usq.edu.au/item/z8658/decoupled-progressive-distillation-for-sequential-prediction-with-interaction-dynamics