MQTTset, a New Dataset for Machine Learning Techniques on MQTT

Sensors (Basel). 2020 Nov 18;20(22):6578. doi: 10.3390/s20226578.

Abstract

IoT networks are increasingly popular nowadays to monitor critical environments of different nature, significantly increasing the amount of data exchanged. Due to the huge number of connected IoT devices, security of such networks and devices is therefore a critical issue. Detection systems assume a crucial role in the cyber-security field: based on innovative algorithms such as machine learning, they are able to identify or predict cyber-attacks, hence to protect the underlying system. Nevertheless, specific datasets are required to train detection models. In this work we present MQTTset, a dataset focused on the MQTT protocol, widely adopted in IoT networks. We present the creation of the dataset, also validating it through the definition of a hypothetical detection system, by combining the legitimate dataset with cyber-attacks against the MQTT network. Obtained results demonstrate how MQTTset can be used to train machine learning models to implement detection systems able to protect IoT contexts.

Keywords: Internet of Things; MQTT; artificial intelligence; dataset; detection system; machine learning.