10th International Conference on Natural and Engineering Sciences, 20-21 December 2025, pp. 3-4 (Abstract)
Cyberattacks have become one of the most critical problems in information
security as digitalization accelerates. Malware, in particular, can evade
traditional detection methods through advanced concealment, packing, and
obfuscation techniques, posing serious security threats. This has increased
the need for deep learning-based methods, which offer stronger representation
learning than classical machine learning approaches.
This study proposes a platform-independent, AI-based malware detection system
that identifies whether files and applications running on different operating
systems contain malware. The study is based on the BIG 2015 (Binary
Intelligence Group) malware dataset published by Microsoft; in addition, mobile
traffic data generated on portable devices is included in the analysis.
File content, metadata, and assembly code are combined with runtime system
behavior in a hybrid detection mechanism that uses both static and dynamic
analysis.
In particular, the frequency of instructions in the assembly code is extracted
as a feature, and these features are classified using Transformer-based deep
learning models. The experiments used BERT-based models, trained both with and
without k-fold cross-validation. The results showed accuracy above 99% on the
training, validation, and independent test data. Despite the relatively small
dataset, the model did not overfit and generalized well.
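The feature extraction and cross-validation steps described above can be
sketched roughly as follows. This is a minimal illustration, not the authors'
pipeline: the opcode vocabulary, the IDA-style listing format of the sample,
and the function names opcode_frequencies and k_fold_indices are all
assumptions made for the example. It counts how often each instruction
mnemonic appears in a disassembly listing, normalizes the counts to relative
frequencies, and generates train/validation index splits for k folds.

```python
from collections import Counter

# Hypothetical, heavily reduced opcode vocabulary (an assumption for this sketch).
OPCODES = ("mov", "push", "pop", "call", "jmp", "add", "xor", "retn")

# Assumed IDA-style listing: section:address, raw bytes, mnemonic, operands.
SAMPLE = """\
.text:00401000 55          push ebp
.text:00401001 8B EC       mov ebp, esp
.text:00401003 8B 45 08    mov eax, [ebp+8]
.text:00401006 33 C0       xor eax, eax ; clear before call
.text:00401008 5D          pop ebp
.text:00401009 C3          retn
"""


def opcode_frequencies(asm_text, vocabulary=OPCODES):
    """Return the relative frequency of each vocabulary opcode in a listing."""
    vocab = set(vocabulary)
    counts = Counter()
    for line in asm_text.splitlines():
        # Drop trailing comments so words inside them are not counted.
        code = line.split(";", 1)[0].lower()
        # Count only the first token on the line that is a known mnemonic.
        for token in code.split():
            if token in vocab:
                counts[token] += 1
                break
    total = sum(counts.values())
    return {op: (counts[op] / total if total else 0.0) for op in vocabulary}


def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, folds[i]


if __name__ == "__main__":
    print(opcode_frequencies(SAMPLE))
    for train_idx, val_idx in k_fold_indices(6, 3):
        print(train_idx, val_idx)
```

How such frequency vectors are encoded as input for a BERT-based model is not
specified in the abstract, so that step is left out of the sketch.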
In conclusion, this study demonstrates that Transformer-based language models
offer high accuracy, strong generalization, and platform-independent analysis
in malware detection, showing that advanced deep learning approaches are an
effective solution for modern cybersecurity systems.