Murat UÇAR, Emine UÇAR

DERİN OTOMATİK KODLAYICI TABANLI ÖZELLİK ÇIKARIMI İLE ANDROİD KÖTÜCÜL YAZILIM UYGULAMALARININ TESPİTİ

THE DETECTION OF ANDROID MALWARE APPLICATIONS WITH DEEP AUTOENCODER BASED FEATURE EXTRACTION

Smartphones have become an indispensable part of daily life, and the Android operating system has the highest usage rate among these devices. Its advanced features allow users to store personal information such as photos, health data, identity information, and bank details. Because of this widespread use and these advanced features, Android is also the operating system most frequently targeted by malware developers. In this study, to improve the detection of Android malware applications, features were first extracted using a deep autoencoder architecture. In the next step, the applications were classified with the Random Forest (RF), K-Nearest Neighbor (K-NN), and Decision Tree (DT) algorithms. Experimental results showed that feature extraction using deep autoencoders and principal component analysis increased the accuracy of the models. Among the classifiers, Random Forest achieved the best accuracy, at 94.40%.

Keywords:

Security, malware detection, machine learning, autoencoder
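
The two-stage pipeline outlined in the abstract (autoencoder-based feature extraction followed by a classical classifier) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several simplifications are assumed: the data are synthetic stand-ins for application feature vectors, the autoencoder has a single hidden layer rather than a deep architecture, only a K-NN classifier is shown (Random Forest and Decision Tree are omitted), and all function names and hyperparameters are hypothetical.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is repeatable

def make_data(n_per_class=40, dim=6):
    """Synthetic stand-in for app feature vectors (1 = malware, 0 = benign)."""
    data = []
    for label in (0, 1):
        center = 1.0 if label else -1.0
        for _ in range(n_per_class):
            data.append(([center + random.gauss(0, 0.5) for _ in range(dim)], label))
    random.shuffle(data)
    return [x for x, _ in data], [y for _, y in data]

def init_weights(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def encode(x, W_enc):
    # Bottleneck code: tanh of a linear projection (no bias, for brevity).
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W_enc]

def decode(h, W_dec):
    return [sum(w * hi for w, hi in zip(row, h)) for row in W_dec]

def reconstruction_mse(X, W_enc, W_dec):
    total = 0.0
    for x in X:
        r = decode(encode(x, W_enc), W_dec)
        total += sum((ri - xi) ** 2 for ri, xi in zip(r, x)) / len(x)
    return total / len(X)

def train_autoencoder(X, W_enc, W_dec, epochs=50, lr=0.01):
    """Plain SGD on squared reconstruction error; updates weights in place."""
    dim, hidden = len(W_dec), len(W_enc)
    for _ in range(epochs):
        for x in X:
            h = encode(x, W_enc)
            err = [ri - xi for ri, xi in zip(decode(h, W_dec), x)]
            # Backpropagate through the decoder and the tanh nonlinearity.
            grad_h = [sum(err[i] * W_dec[i][j] for i in range(dim)) * (1 - h[j] ** 2)
                      for j in range(hidden)]
            for i in range(dim):
                for j in range(hidden):
                    W_dec[i][j] -= lr * err[i] * h[j]
            for j in range(hidden):
                for k in range(dim):
                    W_enc[j][k] -= lr * grad_h[j] * x[k]

def knn_predict(query, points, labels, k=3):
    """Majority vote among the k nearest training codes (squared Euclidean distance)."""
    def sq_dist(p):
        return sum((a - b) ** 2 for a, b in zip(query, p))
    nearest = sorted(range(len(points)), key=lambda i: sq_dist(points[i]))[:k]
    return 1 if 2 * sum(labels[i] for i in nearest) > k else 0

# Stage 0: data and a simple train/test split.
X, y = make_data()
X_train, y_train, X_test, y_test = X[:60], y[:60], X[60:], y[60:]

# Stage 1: learn a 2-dimensional code with the autoencoder.
W_enc, W_dec = init_weights(2, 6), init_weights(6, 2)
loss_before = reconstruction_mse(X_train, W_enc, W_dec)
train_autoencoder(X_train, W_enc, W_dec)
loss_after = reconstruction_mse(X_train, W_enc, W_dec)

# Stage 2: classify in the learned feature space.
codes_train = [encode(x, W_enc) for x in X_train]
codes_test = [encode(x, W_enc) for x in X_test]
predictions = [knn_predict(c, codes_train, y_train) for c in codes_test]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print(f"reconstruction MSE: {loss_before:.3f} -> {loss_after:.3f}, "
      f"k-NN accuracy: {accuracy:.2f}")
```

The classifier sees only the encoder output, so the decoder serves purely as a training signal. Results on this toy data say nothing about the accuracy reported in the study.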
