HAT-D: Lightweight Embedding-Space Adversarial Training with a Compact Denoiser for Robust Sentiment Analysis

Wonder Kudzo Ekpe

DOI: https://www.doi.org/10.59256/ijsreat.20260602022

Original Article

South Africa

Published Online: March-April 2026

Pages: 153-164
