Transformers in Cybersecurity: Advancing Threat Detection and Response through Machine Learning Architectures
DOI:
https://doi.org/10.51903/jtie.v3i3.211
Keywords:
Cybersecurity, Transformer Models, Threat Detection, Machine Learning Architectures
Abstract
The increasing sophistication of cyber threats has outpaced the capabilities of traditional detection and response systems, necessitating the adoption of advanced machine learning architectures. This study investigates the application of Transformer-based models in cybersecurity, focusing on their ability to enhance threat detection and response. Leveraging publicly available datasets, including CICIDS 2017 and UNSW-NB15, the research employs a systematic methodology encompassing data preprocessing, model optimization, and comparative performance evaluation. The Transformer model, tailored for cybersecurity, integrates self-attention mechanisms and positional encoding to capture complex dependencies in network traffic data. The experimental results reveal that the proposed model achieves an accuracy of 97.8%, outperforming conventional methods such as Random Forest (92.3%) and deep learning approaches like CNN (94.1%) and LSTM (95.6%). Additionally, the Transformer demonstrates high detection rates across diverse attack types, with rates exceeding 98% for Denial of Service and Brute Force attacks. Attention heatmaps provide valuable insights into feature importance, enhancing the interpretability of the model’s decisions. Scalability tests confirm the model’s ability to handle large datasets efficiently, positioning it as a robust solution for dynamic cybersecurity environments. This research contributes to the field by demonstrating the feasibility and advantages of employing Transformer architectures for complex threat detection tasks. The findings have significant implications for developing scalable, interpretable, and adaptive cybersecurity systems. Future studies should explore lightweight Transformer variants and evaluate the model in operational environments to address practical deployment challenges.
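The two mechanisms the abstract names, positional encoding and self-attention, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the standard sinusoidal encoding and a single attention head with identity query/key/value projections, and the `flows` values are hypothetical toy numbers rather than CICIDS 2017 or UNSW-NB15 records. The `weights` matrix it returns is the kind of quantity an attention heatmap would visualize.

```python
import math

def positional_encoding(seq_len, d_model):
    """Standard sinusoidal encoding: sine on even feature indices, cosine on odd ones."""
    pe = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            angle = pos / (10000 ** (2 * (i // 2) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        pe.append(row)
    return pe

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Scaled dot-product self-attention.

    Assumption: Q, K and V are the inputs themselves (identity projections);
    a trained model would apply learned weight matrices first.
    """
    d = len(x[0])
    weights = []
    for q in x:
        # Similarity of this position to every position, scaled by sqrt(d)
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        weights.append(softmax(scores))
    # Each output row is the attention-weighted average of all input rows
    output = [
        [sum(w * v[j] for w, v in zip(row, x)) for j in range(d)]
        for row in weights
    ]
    return output, weights

# Hypothetical toy input: 3 consecutive flows, 4 normalized features each
flows = [
    [0.1, 0.9, 0.3, 0.5],
    [0.2, 0.8, 0.4, 0.4],
    [0.9, 0.1, 0.7, 0.2],
]
pe = positional_encoding(len(flows), len(flows[0]))
encoded = [[f + p for f, p in zip(frow, prow)] for frow, prow in zip(flows, pe)]
output, weights = self_attention(encoded)

for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` sums to 1 and shows how strongly one flow attends to the others in the sequence. A full model would stack several such layers with learned projections, multiple heads, and a classification head on top.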
License
Copyright (c) 2024 Journal of Technology Informatics and Engineering
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.