Transformers in Cybersecurity: Advancing Threat Detection and Response through Machine Learning Architectures

Budi Hartono; Fujiama Diapoldo Silalahi; Moh Muthohir

doi:10.51903/jtie.v3i3.211

Authors

Budi Hartono, Universitas Sains dan Teknologi Komputer, Semarang, 50192
Fujiama Diapoldo Silalahi, Universitas Sains dan Teknologi Komputer, Semarang, 50192
Moh Muthohir, Universitas Sains dan Teknologi Komputer, Semarang, 50192

Keywords:

Cybersecurity, Transformer Models, Threat Detection, Machine Learning Architectures

Abstract

The increasing sophistication of cyber threats has outpaced the capabilities of traditional detection and response systems, necessitating the adoption of advanced machine learning architectures. This study investigates the application of Transformer-based models in cybersecurity, focusing on their ability to enhance threat detection and response. Leveraging publicly available datasets, including CICIDS 2017 and UNSW-NB15, the research employs a systematic methodology encompassing data preprocessing, model optimization, and comparative performance evaluation. The Transformer model, tailored for cybersecurity, integrates self-attention mechanisms and positional encoding to capture complex dependencies in network traffic data. The experimental results reveal that the proposed model achieves an accuracy of 97.8%, outperforming conventional methods such as Random Forest (92.3%) and deep learning approaches like CNN (94.1%) and LSTM (95.6%). Additionally, the Transformer demonstrates high detection rates across diverse attack types, with rates exceeding 98% for Denial of Service and Brute Force attacks. Attention heatmaps provide valuable insights into feature importance, enhancing the interpretability of the model’s decisions. Scalability tests confirm the model’s ability to handle large datasets efficiently, positioning it as a robust solution for dynamic cybersecurity environments. This research contributes to the field by demonstrating the feasibility and advantages of employing Transformer architectures for complex threat detection tasks. The findings have significant implications for developing scalable, interpretable, and adaptive cybersecurity systems. Future studies should explore lightweight Transformer variants and evaluate the model in operational environments to address practical deployment challenges.
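
The two mechanisms named in the abstract, positional encoding and self-attention, can be illustrated with a minimal sketch. It is not the authors' model: it assumes the standard sinusoidal encoding and single-head scaled dot-product attention with identity projections (Q = K = V), and the three "flow records" are hypothetical toy values rather than CICIDS 2017 or UNSW-NB15 data. The returned weight matrix is the kind of quantity an attention heatmap would visualize.

```python
import math


def positional_encoding(seq_len, d_model):
    """Standard sinusoidal positional encoding; d_model is assumed to be even."""
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)      # even feature index
            pe[pos][i + 1] = math.cos(angle)  # odd feature index
    return pe


def softmax(row):
    """Numerically stable softmax over one row of attention scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x (no learned weights)."""
    d = len(x[0])
    scores = [
        [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        for q in x
    ]
    weights = [softmax(r) for r in scores]
    out = [
        [sum(w * v[j] for w, v in zip(w_row, x)) for j in range(d)]
        for w_row in weights
    ]
    return out, weights


# Hypothetical toy input: 3 flow records with 4 pre-scaled features each.
flows = [
    [0.1, 0.9, 0.0, 0.3],
    [0.8, 0.2, 0.5, 0.1],
    [0.1, 0.8, 0.1, 0.4],
]
pe = positional_encoding(len(flows), len(flows[0]))
x = [[f + p for f, p in zip(row, pe_row)] for row, pe_row in zip(flows, pe)]
output, attn = self_attention(x)

# Each row of attn shows how strongly one record attends to every record.
for row in attn:
    print([round(w, 3) for w in row])
```

A trained model would add learned query, key, and value projections, multiple heads, and feed-forward layers; the sketch only shows how the position signal is added to the input and how each row of attention weights forms a distribution over the sequence.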
