Coded Network Anomaly Detection application customized for Jupiter Orchestrator (available here: https://github.com/ANRGUSC/Jupiter).
The application task graph, shown below, is intended for dispersed computing. It is inspired by Hashdoop [1, 2], where a MapReduce framework is used for anomaly detection. We have modified the code from [2] to suit our purpose.
Convert the pcap file to a text file using Ipsumdump as follows:
ipsumdump -tsSdDlpF -r botnet-capture-20110810-neris.pcap > botnet_summary.ipsum
A single input file, 1botnet.ipsum, is provided in the repository.
Jupiter accepts pipelined computations described in the form of a graph, where the main task flow is represented as a Directed Acyclic Graph (DAG).
Thus, one should be able to separate the graph into two parts: the DAG part and the non-DAG part.
Jupiter requires each task in the DAG part of the graph to be written as a Python function in a separate file under the scripts folder.
The non-DAG tasks, on the other hand, can be either a Python function or a shell script with any number of arguments, also located under the scripts folder.
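As an illustration only, a skeleton of such a DAG task script might look as follows. The task(filename, pathin, pathout) entry point and the convention of returning the produced output paths are assumptions based on Jupiter's example applications, not something defined in this repository; consult the Jupiter guidelines for the exact interface.

```python
# Hypothetical skeleton of a DAG task script under scripts/ (e.g., aggregate0.py).
# The task(filename, pathin, pathout) signature is an assumption taken from
# Jupiter's example applications.
import os

def task(filename, pathin, pathout):
    # Read the file handed over by the parent task.
    with open(os.path.join(pathin, filename)) as f:
        lines = f.readlines()

    # ... task-specific processing of `lines` goes here ...

    # Write the result where the child tasks expect it and report the
    # produced output path(s) back to Jupiter.
    output_path = os.path.join(pathout, filename)
    with open(output_path, "w") as f:
        f.writelines(lines)
    return [output_path]
```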
The folder structure is:
├── configuration.txt
├── DAG_directed.jpg
├── DAG.jpg
├── input_node.txt
├── LICENSE.txt
├── README.md
├── sample_input
│   ├── 1botnet.ipsum
│   └── 2botnet.ipsum
└── scripts
    ├── aggregate0.py
    ├── aggregate1.py
    ├── aggregate2.py
    ├── astutedetector0.py
    ├── astutedetector1.py
    ├── astutedetector2.py
    ├── config.json
    ├── dftdetector0.py
    ├── dftdetector1.py
    ├── dftdetector2.py
    ├── dftslave.py
    ├── fusioncenter0.py
    ├── fusioncenter1.py
    ├── fusioncenter2.py
    ├── globalfusion.py
    ├── localpro.py
    ├── masterServerStatic.py
    ├── parseFile.py
    ├── simpledetector0.py
    ├── simpledetector1.py
    ├── simpledetector2.py
    ├── teradetector0.py
    ├── teradetector1.py
    ├── teradetector2.py
    └── terasort
        ├── master
        │   ├── standby.sh
        │   ├── ...
        └── worker
            └── rate-limit-then-standby.sh
This folder is required by the Jupiter guidelines as well as for testing; it can be left as an empty directory.
According to the Jupiter guidelines, there MUST be a config.json file with an entry called taskname_map to designate the DAG and non-DAG parts of the task graph.
In taskname_map, each entry is represented as follows:
<task_name> <task_file> <DAG_flag> <Arguments>
<task_name> is the name of the DAG task, such as simpledetector0.
<task_file> is the name of the file to be run internally.
<DAG_flag> indicates whether the task is part of the DAG; if yes, set it to true, otherwise set it to false.
<Arguments> holds the arguments for tasks that are not part of the DAG portion of the task graph.
For example, as aggregate0 is part of the DAG, its entry should be as follows:
"aggregate0" : ["aggregate0", true]
On the other hand, dftslave00 is not part of the DAG portion, so its entry should be as follows:
"dftslave00" : ["dftslave", false, 0, 0, 3]
Here the arguments 0, 0, 3 are required by the dftslave script and represent the master ID, slave ID, and maximum number of slaves, respectively.
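Putting the two entries above together, a minimal (abridged) sketch of config.json could look as follows; the actual file under the scripts folder lists every task of the graph:

```json
{
  "taskname_map": {
    "aggregate0": ["aggregate0", true],
    "dftslave00": ["dftslave", false, 0, 0, 3]
  }
}
```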
To conform with the Jupiter guidelines, there MUST be a configuration.txt file representing the graph.
Each line of the file lists the children of a node.
To prepare the configuration file, the task graph first needs to be converted to a fully directed graph.
This can be done by running a Breadth First Search (BFS) type algorithm over the task graph.
The resulting directed graph is shown in DAG_directed.jpg.
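A minimal sketch of such a BFS-based conversion is shown below, assuming the task graph is given as an undirected adjacency list and every edge is directed from the endpoint visited earlier to the endpoint visited later; the adjacency list in the example is a hypothetical fragment, not the full task graph.

```python
from collections import deque

def orient_edges(adj, root):
    """Direct every edge of an undirected task graph from the endpoint
    reached earlier by a BFS from `root` to the endpoint reached later.
    Assumes the graph is connected from `root`."""
    order = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in adj.get(node, []):
            if neighbor not in order:
                order[neighbor] = len(order)
                queue.append(neighbor)
    children = {node: [] for node in order}
    for node, neighbors in adj.items():
        for neighbor in neighbors:
            if order[node] < order[neighbor]:
                children[node].append(neighbor)
    return children

# Illustrative use with a tiny hypothetical fragment of the task graph:
adj = {
    "localpro": ["aggregate0"],
    "aggregate0": ["localpro", "simpledetector0", "fusioncenter0"],
    "simpledetector0": ["aggregate0", "fusioncenter0"],
    "fusioncenter0": ["aggregate0", "simpledetector0"],
}
print(orient_edges(adj, "localpro"))
```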
Now each non-leaf node, and each leaf node that is part of the original DAG portion, is represented in the configuration file as
<node-name> <number of inputs required> <flag stating whether to wait for all the inputs> <child1-name> <child2-name> ...
All other nodes (i.e., leaf nodes that are not part of the original DAG portion), such as DFTslave00, are represented as
<node-name> 1 False <node-name>
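For illustration only, a hypothetical fragment of such a configuration.txt is shown below; the localpro line assumes the three aggregate tasks are its children and that it waits for its single input, while the dftslave00 line follows the leaf-node pattern above. The actual child lists must match the directed task graph.

```
localpro 1 True aggregate0 aggregate1 aggregate2
dftslave00 1 False dftslave00
```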
The input_node.txt file is required by the WAVE scheduler; it initiates the WAVE scheduler's random mapping.
It contains a random mapping for the tasks that have no parent.
For example, the given input_node.txt file randomly selects node4 for the localpro task.
The scripts folder contains all the executables related to the task graph.
localpro.py: Processes the Ipsum file locally and splits the traffic into multiple independent streams based on the hash value of the IP addresses.
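A minimal sketch of this hashing step is given below; the hash function, the assumption that the source IP is the second whitespace-separated field of each ipsum line, and the handling of comment lines are illustrative rather than taken from localpro.py.

```python
import hashlib

NUM_SPLITS = 3  # splits 0, 1, 2 as in this application

def split_id(ip_address, num_splits=NUM_SPLITS):
    """Map an IP address to one of the independent traffic splits."""
    digest = hashlib.md5(ip_address.encode()).hexdigest()
    return int(digest, 16) % num_splits

def split_traffic(ipsum_path):
    """Partition the lines of an .ipsum file by source-IP hash (illustrative)."""
    splits = {i: [] for i in range(NUM_SPLITS)}
    with open(ipsum_path) as f:
        for line in f:
            if line.startswith("!") or not line.strip():
                continue  # skip ipsumdump header/comment lines
            src_ip = line.split()[1]  # assumed position of the source IP
            splits[split_id(src_ip)].append(line)
    return splits
```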
aggregate<SPLIT_ID>.py: SPLIT_ID (0, 1, or 2 in our case) uniquely identifies the split. This script aggregates traffic for a particular traffic split from different monitoring nodes.
simpledetector<SPLIT_ID>.py: A simple threshold-based anomaly detector for the particular split.
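For intuition only, a detector of this kind can flag time bins whose traffic volume exceeds the mean by a multiple of the standard deviation; the statistic and threshold used by the actual simpledetector scripts may differ.

```python
from statistics import mean, stdev

def threshold_anomalies(counts, k=1.5):
    """Return indices of time bins whose count exceeds
    mean + k * standard deviation (illustrative threshold rule)."""
    if len(counts) < 2:
        return []
    mu, sigma = mean(counts), stdev(counts)
    return [i for i, c in enumerate(counts) if c > mu + k * sigma]

# Example: packets-per-bin for one traffic split; flags bin 3 with k=1.5.
print(threshold_anomalies([120, 118, 130, 900, 125]))
```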
astutedetector<SPLIT_ID>.py: An implementation of the ASTUTE anomaly detector [3] from the repository [2].
dftdetector<SPLIT_ID>.py: An implementation of the DFT detector (master) that communicates with the DFT slave nodes to perform coded/uncoded Discrete Fourier Transform analysis of the data.
dftslave.py: A common implementation of the DFT slave detectors that is called for all the DFTslave## tasks.
terasort/master: This folder contains all the scripts required to run the Terasort master, which coordinates the coded/uncoded Terasort analysis of the data among the Terasort workers.
terasort/worker: This folder contains all the scripts required to run the Terasort worker tasks.
teradetector<SPLIT_ID>.py: As the Terasort master and worker task nodes are not directly part of the DAG, we have added this extra task to connect them to the DAG portion of the graph. This script dispatches the input data to teramaster<SPLIT_ID> for processing, collects the output of the analysis, and passes it to the fusion center.
fusioncenter<SPLIT_ID>.py: Combines the anomalies detected by the different detectors for the particular split.
globalfusion.py: Collects the anomalies from the different splits and combines the detected anomalies.
This code is customized to be executed with Jupiter only.
[1] Romain Fontugne, Johan Mazel, and Kensuke Fukuda. "Hashdoop: A MapReduce framework for network anomaly detection." In IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), IEEE, 2014.
[2] Hashdoop GitHub Repository
[3] Fernando Silveira, Christophe Diot, Nina Taft, and Ramesh Govindan. "ASTUTE: Detecting a different class of traffic anomalies." ACM SIGCOMM Computer Communication Review 40.4 (2010): 267-278.
This material is based upon work supported by Defense Advanced Research Projects Agency (DARPA) under Contract No. HR001117C0053. Any views, opinions, and/or findings expressed are those of the author(s) and should not be interpreted as representing the official views or policies of the Department of Defense or the U.S. Government.