dataset #1

Open
cailv opened this issue Sep 24, 2023 · 3 comments

Comments

@cailv
cailv commented Sep 24, 2023

Could you share datasets that cover the different types of attacks? I will cite your paper.

@happy-little-zhang
Owner

Hello, friends! I appreciate your interest in my work. I have also experienced the challenges of attack simulations. I hope the following description will be helpful to you.

We did not describe the test dataset in detail for two reasons. First, the synthetic abnormal data used in the evaluation is not fixed: we intentionally introduce randomness when simulating the different types of attacks, so each researcher who replicates our work will obtain a unique set of anomalies. Second, the synthetic abnormal data is very large, reaching hundreds of gigabytes, which makes it impractical to include in our test platform.
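To illustrate why a randomized generator yields a different anomalous dataset on each run, here is a minimal sketch. It is not the repository's generator: the `Sample` layout, the function name `inject_random_samples`, and the simple "insert extra samples" attack are all assumptions made for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical record: one logged sample plus its ground-truth label.
struct Sample {
    double timestamp;  // time of the sample
    int value;         // payload (placeholder for the real fields)
    bool is_attack;    // true for injected, synthetic samples
};

// Returns a copy of `trace` with `count` synthetic samples inserted at
// randomly chosen positions. The seed drives every random choice, so a
// different seed generally gives a different dataset, while the same seed
// reproduces the same one.
std::vector<Sample> inject_random_samples(const std::vector<Sample>& trace,
                                          std::size_t count,
                                          int attack_value,
                                          std::uint32_t seed) {
    std::vector<Sample> out = trace;
    if (out.empty()) return out;  // nothing to anchor an injection to
    std::mt19937 rng(seed);
    for (std::size_t i = 0; i < count; ++i) {
        std::uniform_int_distribution<std::size_t> pick(0, out.size() - 1);
        const std::size_t pos = pick(rng);
        // Reuse the neighbour's timestamp so the trace stays time-ordered.
        const Sample injected{out[pos].timestamp, attack_value, true};
        out.insert(out.begin() + static_cast<std::ptrdiff_t>(pos), injected);
    }
    return out;
}
```

Fixing the seed in a sketch like this is one way to make a generated dataset reproducible. Whether the repository exposes a seed is not stated in this thread.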

Our source code includes functions for generating the synthetic abnormal data. To keep their output, set the option "DELETE_TEMP_FILE" to "0" in the file "common/common.h". Then run the main function, and the program will store the synthetic abnormal data, labels, and ground truth.

Please note that running the program with the current parameter configuration may take a long time and require significant storage. If you would like to conduct a quick demonstration test, you can set related parameters to their minimum values in the file "common/common.h", such as setting "MAX_FILE_NUM" to "1" and "ATTACK_NUM" to "1".
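As a rough sketch, the settings mentioned so far might look like the excerpt below. The option names and values are the ones quoted above. Treating them as preprocessor macros, and the helper `cleanup_temp_file`, are assumptions made for illustration, so check the actual "common/common.h" before editing.

```cpp
// Hypothetical excerpt in the style of common/common.h (assumed to use macros).
#define DELETE_TEMP_FILE 0  // 0: keep the generated data, labels, and ground truth
#define MAX_FILE_NUM     1  // minimum value, for a quick demonstration run
#define ATTACK_NUM       1  // minimum value, for a quick demonstration run

#include <cstdio>
#include <string>

// Assumed helper showing how such a flag could gate cleanup.
// Returns true only if the file was actually deleted.
bool cleanup_temp_file(const std::string& path) {
#if DELETE_TEMP_FILE
    return std::remove(path.c_str()) == 0;
#else
    (void)path;    // flag is 0: leave the file on disk
    return false;
#endif
}
```

With the flag at 0, a helper like this leaves every generated file in place, which is the behaviour the reply relies on to obtain the dataset.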

@happy-little-zhang
Owner

If you are only interested in the synthetic abnormal data, you can set the option "MAX_SLIDING_WINDOW_SIZE" to "1".

To run the program, follow these steps:

Step 1: Place the raw data files (001.csv~035.csv) in the folder /dataset/raw.

Step 2: Uncomment several functions in the main.cpp file:

split_data_to_train_and_test();
global_model_train_and_model_detect_attack_free();
global_model_detect_under_various_attack();

Step 3: Run the program.

After these steps, the program will run and produce the results for the synthetic abnormal data.
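For Step 2, the uncommented calls might end up looking like the sketch below. The three function names come from the reply. The stub bodies, the `call_log` vector, and the wrapper `run_pipeline` are placeholders so the example compiles on its own; in the repository the functions are already defined and the calls sit in the main function of main.cpp. The no-argument signatures are assumed from how the calls are written above.

```cpp
#include <string>
#include <vector>

// Records the call order; exists only for this illustration.
std::vector<std::string> call_log;

// Placeholder stubs standing in for the repository's real functions.
void split_data_to_train_and_test() { call_log.push_back("split"); }
void global_model_train_and_model_detect_attack_free() { call_log.push_back("train_and_detect_attack_free"); }
void global_model_detect_under_various_attack() { call_log.push_back("detect_under_various_attack"); }

// Stand-in for the body of main() after Step 2: all three calls uncommented,
// in the order listed in the reply.
void run_pipeline() {
    split_data_to_train_and_test();
    global_model_train_and_model_detect_attack_free();
    global_model_detect_under_various_attack();
}
```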

@happy-little-zhang
Owner

happy-little-zhang commented Sep 24, 2023 via email
