This repo provides a dataset for IP usage scenario prediction and the benchmark code described in the paper:
Identifying IP Usage Scenarios: Problems, Data, and Benchmarks
Fan Zhou, Weifeng Zhang, Yong Wang, Ting Zhong, Goce Trajcevski, and Ashfaq Khokhar.
Accepted by IEEE Network.
The datasets are compressed into dataset.zip; see documentation.xlsx for more details. Before running anything, unzip the file to "./data".
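The unzip step can also be done from Python. The following is a minimal sketch, assuming dataset.zip sits in the current directory and that its contents should land directly under "./data"; the helper name extract_dataset is hypothetical and not part of this repo. If the archive already contains a top-level data folder, adjust the target directory accordingly.

```python
import zipfile
from pathlib import Path


def extract_dataset(zip_path="dataset.zip", out_dir="./data"):
    """Extract the archive into out_dir and return the sorted member names."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)  # create ./data if it is missing
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out)
        return sorted(archive.namelist())


# Only extract when the archive is actually present (assumed location).
if Path("dataset.zip").exists():
    print(extract_dataset())
else:
    print("dataset.zip not found in the current directory")
```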
Our experiments were conducted on Ubuntu 20.04 with a single NVIDIA 1070Ti GPU, 32GB RAM, and an Intel i7 8700K CPU.
torch = '1.3.1',
numpy = '1.19.1',
sklearn = '0.23.1',
pandas = '1.0.5'
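To compare a local environment against the versions listed above, a small check like the following may help. It is only a sketch: the helper check_versions is hypothetical, it relies on each package exposing a __version__ attribute, and it ignores local build suffixes such as "+cu100". Packages that cannot be imported are reported as missing rather than raising an error.

```python
import importlib

# Versions listed in this README, keyed by import name.
EXPECTED = {
    "torch": "1.3.1",
    "numpy": "1.19.1",
    "sklearn": "0.23.1",
    "pandas": "1.0.5",
}


def check_versions(expected):
    """Return {name: (installed_version_or_None, matches_expected)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            module = importlib.import_module(name)
        except Exception:  # missing or broken installation
            report[name] = (None, False)
            continue
        installed = getattr(module, "__version__", None)
        # Drop a local build suffix such as "+cu100" before comparing.
        matches = installed is not None and installed.split("+")[0] == wanted
        report[name] = (installed, matches)
    return report


for name, (installed, matches) in check_versions(EXPECTED).items():
    status = "ok" if matches else "differs"
    print(f"{name}: expected {EXPECTED[name]}, found {installed} ({status})")
```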
Here we take the Beijing dataset as an example to demonstrate the usage.
Before running benchmarks, you should convert the string data to numerical data:
python cate2num.py
This produces beijing_cate2id.
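For readers unfamiliar with this kind of conversion, the sketch below illustrates the general idea of mapping string categories to integer ids. It is not the code in cate2num.py: the function names, the column names ("isp", "scenario"), and the sample values are all hypothetical, and the real columns are described in documentation.xlsx.

```python
def build_cate2id(values):
    """Map each distinct string to an integer id, in order of first appearance."""
    cate2id = {}
    for value in values:
        if value not in cate2id:
            cate2id[value] = len(cate2id)
    return cate2id


def encode_rows(rows, categorical_columns):
    """Replace string values in the given columns with their integer ids."""
    mappings = {
        col: build_cate2id(row[col] for row in rows)
        for col in categorical_columns
    }
    encoded = [
        {col: (mappings[col][val] if col in mappings else val)
         for col, val in row.items()}
        for row in rows
    ]
    return encoded, mappings


# Hypothetical rows, used only to illustrate the encoding.
rows = [
    {"isp": "telecom", "scenario": "home"},
    {"isp": "unicom", "scenario": "enterprise"},
    {"isp": "telecom", "scenario": "enterprise"},
]
encoded, mappings = encode_rows(rows, ["isp", "scenario"])
print(mappings["isp"])  # {'telecom': 0, 'unicom': 1}
print(encoded[2])       # {'isp': 0, 'scenario': 1}
```

Keeping the mapping alongside the encoded data makes it possible to translate ids back to the original strings when inspecting model outputs.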
For DT and SVM, run IP_ML.py. For D&CN and AutoInt, run IP_DL.py. For NODE, run the following commands:
cd node_scenario
python node_scenario.py --dataset "beijing"
# valid values for --dataset: "beijing", "shanghai", "sichuan", "illinois"
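As an illustration of how a --dataset option restricted to these four values can be validated, here is a minimal argparse sketch. It is an assumption about a reasonable setup, not the actual argument handling in node_scenario.py; the names DATASETS and build_parser are hypothetical.

```python
import argparse

# The four dataset names listed in this README.
DATASETS = ["beijing", "shanghai", "sichuan", "illinois"]


def build_parser():
    """Build a parser that accepts only the known dataset names."""
    parser = argparse.ArgumentParser(description="Select a dataset to run on.")
    parser.add_argument(
        "--dataset",
        choices=DATASETS,
        required=True,
        help="which regional dataset to use",
    )
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--dataset", "shanghai"])
print(args.dataset)  # shanghai
```

With choices set, argparse rejects any other value and exits with a usage message, so a typo in the dataset name is caught before any data is loaded.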
If you find our paper and code useful for your research, please consider citing us:
@ARTICLE{9829369,
author={Zhou, Fan and Zhang, Weifeng and Wang, Yong and Zhong, Ting and Trajcevski, Goce and Khokhar, Ashfaq},
journal={IEEE Network},
title={Identifying IP Usage Scenarios: Problems, Data, and Benchmarks},
year={2022},
volume={36},
number={3},
pages={152-158},
doi={10.1109/MNET.012.2100293}}
We would like to thank DeepCTR for sharing their code and SHAP for data analysis.
For any questions (as well as requests for the PDF version), please open an issue or send an email to: weifzh At outlook Dot com