Two-layer detection framework with a high accuracy and efficiency for a malware family over the TLS protocol

Rongfeng Zheng; Jiayong Liu; Liang Liu; Shan Liao; Kai Li; Jihong Wei; Li Li; Zhiyi Tian

doi:10.1371/journal.pone.0232696

PLoS One. 2020 May 6;15(5):e0232696. doi: 10.1371/journal.pone.0232696. eCollection 2020.

Authors

Rongfeng Zheng¹, Jiayong Liu², Liang Liu², Shan Liao², Kai Li², Jihong Wei², Li Li², Zhiyi Tian²

Affiliations

¹ College of Electronics and Information Engineering, Sichuan University, Chengdu, China.
² College of Cybersecurity, Sichuan University, Chengdu, China.

Abstract

The transport layer security (TLS) protocol is widely adopted by apps as well as by malware. With the geometric growth of TLS traffic, accurate and efficient detection of malicious TLS flows is becoming imperative. However, current studies focus on either detection accuracy or detection efficiency, and few take both into account. In this paper, we propose a two-layer detection framework composed of a filtering model (FM) and a malware family classification model (MFCM). In the first layer, a new set of TLS handshake features is used to train the FM, which is designed to filter out the majority of benign TLS flows. In the second layer, both TLS handshake features and statistical features are used to construct the MFCM, which identifies malware families. Comprehensive experiments substantiate the high accuracy and efficiency of the proposed two-layer framework. When the threshold of the FM is set to 0.01, the FM filters out 96.32% of benign TLS flows while discarding few malicious TLS flows. Moreover, a multiclassifier is selected to construct the MFCM because it performs better than a set of binary classifiers under the same feature set. In addition, when the ratio of benign to malicious TLS flows is set to 10:1, the two-layer framework detects 188% faster than the single-layer framework, while the average detection accuracy reaches 99.45%.
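The control flow of the two-layer framework described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`two_layer_detect`, `toy_filter_score`, `toy_classify_family`), the feature keys, and the family labels are all hypothetical, and the sketch assumes that the FM outputs a malicious-probability score and that flows scoring below the threshold are discarded as benign; the abstract gives the threshold value (0.01) but not these details.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical representation: one TLS flow as a mapping of feature name to value.
Flow = Dict[str, float]


def two_layer_detect(
    flows: Sequence[Flow],
    filter_score: Callable[[Flow], float],
    classify_family: Callable[[Flow], str],
    threshold: float = 0.01,
) -> List[str]:
    """Label each flow, running the costlier classifier only on unfiltered flows."""
    labels: List[str] = []
    for flow in flows:
        # Layer 1 (FM): a cheap score from handshake features, assumed here
        # to be an estimate of the probability that the flow is malicious.
        if filter_score(flow) < threshold:
            labels.append("benign")
            continue
        # Layer 2 (MFCM): a multiclass model over handshake plus statistical
        # features assigns a malware family label.
        labels.append(classify_family(flow))
    return labels


def toy_filter_score(flow: Flow) -> float:
    # Stand-in for a trained FM; the "self_signed" feature is an assumption.
    return 0.9 if flow.get("self_signed", 0.0) else 0.001


def toy_classify_family(flow: Flow) -> str:
    # Stand-in for a trained MFCM; the feature and labels are assumptions.
    return "family_A" if flow.get("mean_pkt_len", 0.0) > 500 else "family_B"


if __name__ == "__main__":
    sample = [
        {"self_signed": 0.0, "mean_pkt_len": 300.0},
        {"self_signed": 1.0, "mean_pkt_len": 800.0},
    ]
    print(two_layer_detect(sample, toy_filter_score, toy_classify_family))
```

A low threshold such as 0.01 makes the first layer conservative: a flow is dropped only when the filter is nearly certain it is benign, which is consistent with the abstract's report that few malicious flows are discarded. The efficiency gain comes from skipping the second layer for most benign traffic.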

MeSH terms

Algorithms
Computer Security*
Data Collection
Mobile Applications
Software

Grants and funding

The author(s) received no specific funding for this work.