Attention-Based Automated Feature Extraction for Malware Analysis

Sensors (Basel). 2020 May 20;20(10):2893. doi: 10.3390/s20102893.

Abstract

Every day, hundreds of thousands of malicious files are created to exploit zero-day vulnerabilities. Existing pattern-based antivirus solutions face difficulties in coping with such a large number of new malicious files. To solve this problem, artificial intelligence (AI)-based malicious file detection methods have been proposed. However, even if we can detect malicious files with high accuracy using deep learning, it is difficult to identify why files are malicious. In this study, we propose a malicious file feature extraction method based on attention mechanism. First, by adapting the attention mechanism, we can identify application program interface (API) system calls that are more important than others for determining whether a file is malicious. Second, we confirm that this approach yields an accuracy that is approximately 12% and 5% higher than a conventional AI-based detection model using convolutional neural networks and skip-connected long short-term memory-based detection model, respectively.

Keywords: attention; deep learning; malware analysis.