YOLO-SA: An Efficient Object Detection Model Based on Self-attention Mechanism

Paper


Li, Ang, Song, Xiangyu, Sun, ShiJie, Zhang, Zhaoyang, Cai, Taotao and Song, Huansheng. 2024. "YOLO-SA: An Efficient Object Detection Model Based on Self-attention Mechanism." 7th International Joint Conference on Asia-Pacific Web and Web-Age Information Management (APWeb-WAIM 2023). Wuhan, China 06 - 08 Oct 2023 Singapore. Springer. https://doi.org/10.1007/978-981-97-2421-5_1
Paper/Presentation Title

YOLO-SA: An Efficient Object Detection Model Based on Self-attention Mechanism

Presentation Type: Paper
Authors: Li, Ang, Song, Xiangyu, Sun, ShiJie, Zhang, Zhaoyang, Cai, Taotao and Song, Huansheng
Journal or Proceedings Title: Proceedings of the 7th International Joint Conference on Asia-Pacific Web and Web-Age Information Management (APWeb-WAIM 2023)
Journal Citation: 14334, pp. 1-15
Number of Pages: 15
Year: 2024
Publisher: Springer
Place of Publication: Singapore
ISBN: 9789819724215; 9789819724208
Digital Object Identifier (DOI): https://doi.org/10.1007/978-981-97-2421-5_1
Web Address (URL) of Paper: https://link.springer.com/chapter/10.1007/978-981-97-2421-5_1
Web Address (URL) of Conference Proceedings: https://link.springer.com/book/10.1007/978-981-97-2421-5
Conference/Event: 7th International Joint Conference on Asia-Pacific Web and Web-Age Information Management (APWeb-WAIM 2023)
Event Details
7th International Joint Conference on Asia-Pacific Web and Web-Age Information Management (APWeb-WAIM 2023)
Parent
Joint International Conference on Asia-Pacific Web Conference (APWeb)/Web-Age Information Management (WAIM)
Delivery
In person
Event Date
06 to 08 Oct 2023
Event Location
Wuhan, China
Abstract

CNN-based object detectors are widely used for object detection, object classification, and related tasks. Traditional CNN modules usually adopt complex multi-branch designs, which reduce inference speed and memory utilization. Moreover, many works add an attention mechanism to the object detector to extract rich spatial features, but these mechanisms usually serve as add-on modules to convolution and do not fundamentally overcome the limitations of the convolution operation. Finally, traditional object detectors often use coupled detection heads, which can compromise model performance. To address these problems, we propose YOLO-SA, a new object detection model based on the popular YOLOv5 detector. We introduce a new reparameterized module, RepVGG, to replace the original DarkNet53 structure of the YOLOv5 model, which greatly reduces model complexity and improves detection accuracy. We introduce a self-attention module in the feature fusion part of the model; it is independent of the other convolutional layers and outperforms other mainstream attention modules. We replace the coupled detection head in YOLOv5 with an anchor-based decoupled detection head, which greatly improves convergence speed during training. Experiments show that YOLO-SA reaches a detection accuracy of 71.2% and 75.8% on the COCO2014 and VOC2012 datasets, respectively, outperforming the YOLOv5s baseline and other mainstream object detection models.
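
The reparameterization idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes a single channel and no batch normalization (RepVGG as published also folds batch-norm statistics into the kernels), and the function names are hypothetical, not taken from the paper. It shows how a training-time block with a 3x3 branch, a 1x1 branch, and an identity branch collapses into one 3x3 kernel for inference.

```python
def conv2d_same(image, kernel):
    """Single-channel 3x3 cross-correlation with zero padding of 1."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(3):
                for dj in range(3):
                    y, x = i + di - 1, j + dj - 1
                    if 0 <= y < h and 0 <= x < w:
                        acc += kernel[di][dj] * image[y][x]
            out[i][j] = acc
    return out


def multi_branch(image, k3, k1):
    """Training-time form: 3x3 branch + 1x1 branch (scalar k1) + identity."""
    y3 = conv2d_same(image, k3)
    h, w = len(image), len(image[0])
    return [[y3[i][j] + k1 * image[i][j] + image[i][j] for j in range(w)]
            for i in range(h)]


def fuse_branches(k3, k1):
    """Inference-time form: fold the 1x1 and identity branches into the 3x3 kernel.

    A 1x1 kernel and the identity both act only on the centre pixel, so they
    are added to the centre weight of the 3x3 kernel.
    """
    fused = [row[:] for row in k3]
    fused[1][1] += k1 + 1.0
    return fused


if __name__ == "__main__":
    image = [[1.0, 2.0, 0.0], [0.0, 3.0, 1.0], [4.0, 1.0, 2.0]]
    k3 = [[0.5, 0.0, -0.5], [1.0, 2.0, 1.0], [0.0, -1.0, 0.5]]
    k1 = 0.5
    branched = multi_branch(image, k3, k1)
    fused = conv2d_same(image, fuse_branches(k3, k1))
    print(branched == fused)
```

The fused kernel produces the same output as the three-branch block while needing only one convolution per layer at inference time.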

Keywords: Object detection; CNN architecture; Attention mechanism; Decoupled detection head
Contains Sensitive Content: Does not contain sensitive content
ANZSRC Field of Research 2020: 460299. Artificial intelligence not elsewhere classified
Public Notes

Files associated with this item cannot be displayed due to copyright restrictions.

Series: Lecture Notes in Computer Science
Byline Affiliations: Chang'an University, China; Swinburne University of Technology; Macquarie University
Permalink: https://research.usq.edu.au/item/z9v20/yolo-sa-an-efficient-object-detection-model-based-on-self-attention-mechanism

