## Sigfox IoT Dataset ##

The *Sigfox IoT Dataset* is a sample dataset with the communication activity recorded from a real Internet-of-Things (IoT) network deployed by Sigfox (<https://www.sigfox.com/>). It can be used for anomaly detection in communication networks and other related tasks.

----------

### Contents ###

1. Description
2. How to use and cite this dataset
3. License
4. Acknowledgements
5. References

----------

### 1. Description ###

Monitoring the activity in communication networks has become a popular area of research, and particular attention has been paid to detection tasks such as spotting events or anomalies. The interest in such tasks relates to applications in security, maintenance, monitoring, etc. In IoT networks especially, the automatic detection of entities that present anomalous activity is crucial: these networks can be vast, which makes regular monitoring by technicians or experts infeasible.

Here we provide a sample dataset with the communication activity recorded from a real Internet-of-Things (IoT) network deployed by Sigfox. It can be used for anomaly detection in communication networks.

The recorded activity concerns 38 base stations (BSs, i.e. antennas) over a period of 5 months (January to May 2017). The data are organized in separate .csv files named in the format '2017-##-##'. For the months February to May there is one file with the activity of each day, while for January there is a single file covering all days (we used that month as a training period). In each file, the columns correspond to BSs and the rows to communication events. Each such event is created by a broadcasting device that sends a message, which is then received by a number of nearby BSs.
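The file organization described above suggests a simple split into a training period (January) and a set of daily files (February to May). The following is a minimal sketch of that split, not part of the dataset itself. It assumes that file names look like '2017-MM-DD' followed by a `.csv` extension; the function name `split_by_period` and the example names in the list (including the name used for the January file) are hypothetical.

```python
import re

# Assumed file-name pattern: '2017-MM-DD', optionally followed by '.csv'.
_NAME = re.compile(r"^2017-(\d{2})-(\d{2})(?:\.csv)?$")


def split_by_period(filenames):
    """Split file names into January (training) files and February-May (daily) files."""
    train, test = [], []
    for name in sorted(filenames):
        match = _NAME.match(name)
        if match is None:
            continue  # skip anything that does not follow the pattern, e.g. the map image
        month = int(match.group(1))
        (train if month == 1 else test).append(name)
    return train, test


# Hypothetical directory listing, for illustration only.
names = [
    "2017-02-01.csv",
    "2017-01-01.csv",
    "sigfox-data-map-Toulouse.png",
    "2017-05-30.csv",
]
train_files, test_files = split_by_period(names)
print(train_files)
print(test_files)
```

In practice the list of names would come from the dataset folder, for example via `os.listdir`.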

The first column of each file corresponds to the abnormal BS, i.e. a BS whose activity has been characterized by experts as anomalous (the anomaly was found between '2017-03-01' and '2017-05-30').
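As a first look at the data, one may compare how often each BS receives a message, with the abnormal BS in the first column. The sketch below is only an illustration under stated assumptions: it assumes comma-separated files in which each cell is 1 if the BS of that column received the event of that row and 0 otherwise, and it leaves the presence of a header row as an option. The function name `reception_rates` and the tiny sample are made up for this example.

```python
import csv
import io


def reception_rates(csv_file, has_header=False):
    """Return, for each column (BS), the fraction of events (rows) it received.

    Assumes every cell is numeric, 1 for "received" and 0 otherwise.
    """
    reader = csv.reader(csv_file)
    if has_header:
        next(reader, None)  # drop the header row, if the file has one
    totals, n_events = [], 0
    for row in reader:
        if not row:
            continue  # ignore blank lines
        if not totals:
            totals = [0.0] * len(row)
        for j, cell in enumerate(row):
            totals[j] += float(cell)
        n_events += 1
    return [t / n_events for t in totals] if n_events else []


# Made-up sample: 4 events (rows) x 3 BSs (columns); column 0 plays the abnormal BS.
sample = io.StringIO("1,0,1\n0,0,1\n1,1,1\n0,0,1\n")
rates = reception_rates(sample)
print(rates)
```

With a real file, the same function can be called on an open file handle, e.g. `with open(path, newline="") as f: reception_rates(f)`, where `path` points to one of the daily .csv files.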

Additional details for the dataset are provided in [1].

The data were recorded by Sigfox engineers and prepared for general use by B. Le Bars and A. Kalogeratos.

The approximate positions of the Sigfox antennas are shown on the map of Toulouse, France, in the included image "sigfox-data-map-Toulouse.png".


### 2. How to use and cite this dataset ###

This dataset has been used in the experiments of the following paper: 

Batiste Le Bars and Argyris Kalogeratos, “**A Probabilistic Framework to Node-level Anomaly Detection in Communication Networks**”, IEEE International Conference on Computer Communications 2019 (INFOCOM), 29 Apr – 2 May 2019, Paris, France. Available at: <http://kalogeratos.com/psite/files/MyPapers/Node-level-Anomaly-Detection-INFOCOM19.pdf>.

All 38 BSs were used in the experiments of the above paper. Due to a typo, however, the original paper states that only 34 BSs were used. This has been corrected in the recompiled version available at the link above.

The dataset is publicly available for research purposes under the terms of its license (see Sec. 3), provided that the source is properly acknowledged by directly citing the above paper (.bib file: <http://kalogeratos.com/psite/files/MyPapers/Node-level-Anomaly-Detection-INFOCOM19.bib>).

Optionally, this web page (<http://kalogeratos.com/psite/nad2019>) can also be mentioned.

Copyright (C) 2018-2019, Batiste Le Bars and Argyris Kalogeratos. License details can be found in Sec. 3.


### 3. License ###

Copyright (C) 2018-2019, Batiste Le Bars and Argyris Kalogeratos. The *Sigfox IoT Dataset* is a free dataset: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
     
The *Sigfox IoT Dataset* is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
     
You should have received a copy of the GNU General Public License along with this dataset in the file LICENSE.txt. If not, see <http://www.gnu.org/licenses/>.
    
Brief overview of the GNU GPL:

- Provides copyright protection: **True**
- Can be used in commercial applications: **True**
- Bug fixes / extensions, when distributed, must be released under the same license: **True**
- Provides an explicit patent license: **True** (GPLv3, Sec. 11)
- Can be used in proprietary (closed source) applications: **False**
- Is a viral license: **True**

Other resources for the license:

- A quick guide to GPLv3: <https://www.gnu.org/licenses/quick-guide-gplv3.html>.
- Full text of GPLv3: <https://www.gnu.org/licenses/gpl-3.0.html>.


### 4. Acknowledgements ###

Special thanks to Olivier Isson, who provided us with the raw activity recordings and indicated the BSs' operation anomalies therein. We would also like to thank Sigfox for supporting this research and allowing this dataset to become publicly available.


### 5. References ###

[1] Batiste Le Bars and Argyris Kalogeratos, “*A Probabilistic Framework to Node-level Anomaly Detection in Communication Networks*”, IEEE International Conference on Computer Communications (INFOCOM), 29 Apr – 2 May 2019, Paris, France. Available at: <http://kalogeratos.com/psite/files/MyPapers/Node-level-Anomaly-Detection-INFOCOM19.pdf>. Use to cite: <http://kalogeratos.com/psite/files/MyPapers/Node-level-Anomaly-Detection-INFOCOM19.bib>.

[2] *The Sigfox IoT Dataset*. Available online at: <http://kalogeratos.com/psite/material/the-sigfox-iot-dataset/>.