Sigfox IoT Dataset

>> Download dataset (~1M)

The Sigfox IoT Dataset is a sample dataset of communication activity recorded from a real Internet-of-Things (IoT) network deployed by Sigfox. It can be used for anomaly detection in communication networks and related tasks.


Contents

  1. Description
  2. How to use and cite this dataset
  3. License
  4. Acknowledgements
  5. References

1. Description

Monitoring the activity in communication networks has become a popular area of research, with particular attention paid to detection tasks such as spotting events or anomalies. Interest in such tasks stems from applications in security, maintenance, monitoring, etc. In IoT networks especially, automatically detecting entities with anomalous activity is crucial: these networks can be so large that regular monitoring by technicians or experts is infeasible.

Here we provide a sample dataset of communication activity recorded from a real Internet-of-Things (IoT) network deployed by Sigfox in Toulouse, France. It can be used for anomaly detection in communication networks.

The recorded activity concerns a small number of base stations (BS, i.e. antennas) over a period of 5 months (January – May 2017). The data are organized in .csv files named in the format ‘2017-##-##’. For February to May there is a separate file with the activity of each day, while for January there is a single file covering all days (we used that month as a training period). In each file, the columns correspond to BSs and the rows to communication events. Each event is created by a broadcasting device that sends a message, which is then received by a number of nearby BSs.
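The file layout described above can be read with a few lines of Python. The following is only a minimal sketch under stated assumptions, not the authors' code: it assumes the files are comma-delimited, have no header row, and store 1 in a cell when the BS of that column received the event of that row (0 otherwise). None of these details are confirmed here, so check them against the actual files. The function names are hypothetical.

```python
import csv
import io
from pathlib import Path


def read_events(source, has_header=False):
    """Parse one activity file into a list of rows, one row per event.

    ASSUMPTION: comma-delimited numeric cells, 1 = BS received the message.
    """
    reader = csv.reader(source)
    if has_header:
        next(reader, None)  # skip a header row if the files turn out to have one
    return [[int(float(cell)) for cell in row] for row in reader if row]


def load_file(path, has_header=False):
    """Convenience wrapper that opens a .csv file from disk."""
    with Path(path).open(newline="") as handle:
        return read_events(handle, has_header)


def messages_per_station(events):
    """Sum each column, i.e. count the events received by each BS."""
    if not events:
        return []
    return [sum(column) for column in zip(*events)]


# Tiny in-memory stand-in for a daily file: 3 events (rows), 4 BSs (columns).
sample = io.StringIO("1,0,1,0\n0,0,1,1\n1,1,1,0\n")
events = read_events(sample)
counts = messages_per_station(events)
print(counts)
```

With a real file, `load_file` would be called with its path in place of the in-memory sample; the per-BS counts give a quick sanity check of how active each station was in that file.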

The first column of each file corresponds to the abnormal BS, i.e. a BS whose activity has been characterized by experts as anomalous (the anomaly was found between ‘2017-03-01’ and ‘2017-05-30’).
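To work with the abnormal BS separately, its column can be split off from the rest, and each daily file can be tagged by whether its date falls in the period given above. This is a sketch under assumptions: it supposes that daily file names (without the .csv extension) are ISO dates in year-month-day order, and it treats both end dates as inclusive. The helper names are hypothetical.

```python
from datetime import date

# Period in which the anomaly was found, as stated in the description.
# ASSUMPTION: both endpoints are inclusive.
ANOMALY_START = date(2017, 3, 1)
ANOMALY_END = date(2017, 5, 30)


def split_abnormal(events):
    """Separate the first column (abnormal BS) from the remaining BS columns."""
    abnormal = [row[0] for row in events]
    others = [row[1:] for row in events]
    return abnormal, others


def in_anomaly_window(file_stem):
    """Return True if a daily file named like '2017-03-15' lies in the period.

    ASSUMPTION: the name is an ISO date (year-month-day).
    """
    day = date.fromisoformat(file_stem)
    return ANOMALY_START <= day <= ANOMALY_END


# Two toy events over three BSs; the first column plays the abnormal BS.
abnormal, others = split_abnormal([[1, 0, 1], [0, 1, 1]])
print(abnormal, others)
print(in_anomaly_window("2017-02-28"), in_anomaly_window("2017-03-01"))
```

Note that a date inside the period only means the anomaly was found somewhere in that range; it does not mark every day in the range as anomalous.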

Additional details for the dataset are provided in [1].

The data were recorded by Sigfox engineers and prepared for general use by B. Le Bars and A. Kalogeratos.

[Figure: approximate positions of the included Sigfox antennas on the map of Toulouse, France.]

2. How to use and cite this dataset

This dataset has been used in the experiments of the following paper:

Batiste Le Bars and Argyris Kalogeratos, “A Probabilistic Framework to Node-level Anomaly Detection in Communication Networks”, IEEE International Conference on Computer Communications 2019 (INFOCOM), 29 Apr – 2 May 2019, Paris, France. [pdf][bib]

Although all 38 BSs were used in the experiments of the above paper, the original version states, due to a typo, that only 34 BSs were used. This has been corrected in the recompiled version available at the provided link.

The dataset is publicly available for research purposes under the terms of its license (see Sec. 3), provided that its source is properly acknowledged by directly citing the above paper. Optionally, this web page can be cited, too.

Copyright (C) 2018-2019, Batiste Le Bars and Argyris Kalogeratos. License details can be found in Sec. 3.

3. License

Copyright (C) 2018-2019, Batiste Le Bars and Argyris Kalogeratos. The Sigfox IoT Dataset is a free dataset: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

The Sigfox IoT Dataset is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this dataset in the file LICENSE.txt. If not, see <https://www.gnu.org/licenses/>.

Brief overview of the GNU GPL:

  • Provides copyright protection: True
  • Can be used in commercial applications: True
  • Bug fixes / extensions, if distributed, must be released under the same license: True
  • Provides an explicit patent license: True (as of version 3)
  • Can be used in proprietary (closed source) applications: False
  • Is a viral license: True

4. Acknowledgements

Special thanks to Olivier Isson, who provided us with the raw activity recording and indicated the BSs’ operation anomalies therein. We would also like to thank Sigfox for supporting this research and allowing this dataset to be made publicly available.

5. References

[1] Batiste Le Bars and Argyris Kalogeratos, “A Probabilistic Framework to Node-level Anomaly Detection in Communication Networks”, IEEE International Conference on Computer Communications (INFOCOM), 29 Apr – 2 May 2019, Paris, France. [pdf][bib]

[2] The Sigfox IoT Dataset. Available online at: https://kalogeratos.com/psite/material/the-sigfox-iot-dataset/.