Exploit AI

A tool for automating exploitation, detection, and prevention on various network applications

Purpose

An increasingly automated world requires more internet-connected devices with less oversight. A world once managed by system administrators and firewall rules has been automated, outsourced, and embedded. At the same time, these devices are notoriously insecure due to a lack of updates, limited system resources, and a lack of consumer awareness. The recent Mirai botnet exploited a large number of such devices to launch secondary attacks from otherwise innocuous locations and users (Kumar and Lim 2019), so detecting these attacks in a centralized location is important. The same paper will also inform our data collection techniques.

Problem Statement

In general, machine learning models suffer from opacity to the researchers who deploy them. Sensitivity analysis, correlation plots, and low-dimensional graphs help highlight cross-sections of a model's inner workings but fail to build systems that are robust to attack, free from bias, or human-interpretable. This project focuses on building models that are resistant to attack as a stepping stone towards a larger framework of trustworthy artificial intelligence. In particular, I seek to develop a testing pipeline for generating, detecting, and classifying attacks on various machine learning models.

Background

Machine learning models are already being used to detect malicious traffic. However, these models are very vulnerable to attack. In general, those attacks fall into three broad categories:

  1. Model inversion, which reconstructs the model for the purposes of further attacks or espionage.
  2. Misclassification attacks, which use adversarial noise or other masking techniques to force the model to misclassify an attack.
  3. Poisoning, which corrupts the data during training so that the correctly derived model produces false outputs.

Consequently, these categories must be approached differently.

Model Inversion

Given sufficient input/output pairs, all models are invertible. In the case of continuous optimization (e.g. gradient descent), an auxiliary model can be trained and used as a discriminator to train an attack model. In the case of discontinuous optimization functions (e.g. decision trees), a continuous auxiliary function can still be used. Any attacker "can attempt to solve mathematically for unknown parameters or features given the confidence value and equations by querying $d+1$ random $d$ dimensional inputs for unknown $d+1$ parameters" (Chakraborty 2018).
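As a concrete illustration, here is a minimal sketch of the linear case of that attack, assuming only query access to a victim model that returns raw confidence scores. The `query_model` function and all values are hypothetical stand-ins, not part of this repository.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
true_w, true_b = rng.normal(size=d), 0.7      # unknown to the attacker

def query_model(x):
    """Hypothetical victim API: returns a raw confidence score for input x."""
    return x @ true_w + true_b

# Query d + 1 random d-dimensional inputs, as described above.
probes = rng.normal(size=(d + 1, d))
scores = np.array([query_model(x) for x in probes])

# Solve the linear system [X | 1] @ [w; b] = scores for the d + 1 unknowns.
A = np.hstack([probes, np.ones((d + 1, 1))])
recovered = np.linalg.solve(A, scores)
print("recovered weights:", recovered[:-1])
print("recovered bias:", recovered[-1])
```

This only recovers a linear scoring function exactly; for non-linear models it corresponds to the auxiliary-model approach, where many more queries are used to fit a surrogate.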

However, by randomizing the model presented to the attacker, we can increase the attacker's search space. While this works at small scales, it is infeasible to create more than a few models by hand. Automated generation techniques must create statistically independent models or fall victim to the auxiliary-function attacks described above.

Other ensemble models should work similarly, with weights assigned dynamically or otherwise drawn from linearly independent models. Regression techniques are commonly used in anomaly detection systems because they quickly separate an input space into two classes: normal and abnormal. However, any single such model is highly vulnerable to attack because of its small number of trainable parameters. More research should be conducted towards a decision-theoretic approach that uses an ensemble of these simple methods to build confidence; a rough sketch of the idea follows.
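The following is a rough sketch, on synthetic data, of that ensemble-of-weak-regressors idea: several logistic regressions are each trained on a random subsample and their majority vote is used as the normal/abnormal decision. The dataset, labels, and parameters are illustrative only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 10))
y = (X[:, 0] + 0.5 * X[:, 3] > 1.5).astype(int)   # stand-in "abnormal" label

def train_ensemble(X, y, n_models=7, frac=0.3):
    """Fit several weak detectors, each on a different random subsample."""
    models = []
    for _ in range(n_models):
        idx = rng.choice(len(X), size=int(frac * len(X)), replace=False)
        models.append(LogisticRegression(max_iter=500).fit(X[idx], y[idx]))
    return models

def ensemble_predict(models, X):
    """Simple majority vote over the ensemble."""
    votes = np.mean([m.predict(X) for m in models], axis=0)
    return (votes >= 0.5).astype(int)

models = train_ensemble(X, y)
print(ensemble_predict(models, X[:5]))
```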

Prevention:

  • Model Randomization
  • Ensemble Methods

Misclassification

Misclassification attacks rely on the realization that an input lying outside the training distribution, but sufficiently far from class boundaries, will still receive a high-confidence classification that benefits the attacker. Reducing the bit depth of the input space (color depth in an image, for example) has been shown to reduce the success rate of these attacks, but at the cost of accuracy. Similarly, using pooling layers or other smoothing functions on the input space reduces adversarial noise, again at the cost of model accuracy. Collectively these are known as feature squeezing (Xu et al 2017). In safety-critical applications like autonomous vehicles, this reduction in accuracy is not a justifiable tradeoff.
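Below is a rough sketch of the feature-squeezing defence just described, comparing a classifier's output on the raw input against bit-depth-reduced and smoothed copies and flagging large disagreements. `model_predict` is a hypothetical callable returning class probabilities; the threshold is illustrative.

```python
import numpy as np
from scipy.ndimage import median_filter

def reduce_bit_depth(x, bits=4):
    """Quantize pixel values in [0, 1] to 2**bits levels."""
    levels = 2 ** bits - 1
    return np.round(x * levels) / levels

def looks_adversarial(x, model_predict, threshold=0.5):
    """Flag inputs whose squeezed versions yield very different predictions."""
    p_raw = model_predict(x)
    p_depth = model_predict(reduce_bit_depth(x))
    p_smooth = model_predict(median_filter(x, size=2))
    score = max(np.abs(p_raw - p_depth).sum(), np.abs(p_raw - p_smooth).sum())
    return score > threshold
```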

In many cases, clustering and classification algorithms rely only on distance from a boundary condition to determine a class label for some input. By explicitly defining intra-class as well as inter-class boundaries, this problem goes away. However, drawing envelopes around a given class is tantamount to knowing the probability distribution of the population in advance, which is exactly what our model seeks to discover. While this is handled in low-dimensional data through simulation and Monte Carlo methods, it does not work for high-dimensional data like images or the very wide tables associated with something like NetFlow data. So, we must reduce dimensionality or generate new samples.

We can use classical machine learning techniques to do just that. Ghosh et al. showed that Gaussian Mixture Variational Autoencoders can be used to detect adversarial samples by rejecting samples that have a low probability of their class label or an especially high reconstruction error. Ghosh et al. used the Mahalanobis distance between a sample at point P and a distribution D with an explicit cut-off score to great effect. However, the authors admit that classes need to be well separated and balanced, which is rarely the case for anomaly detection systems. The autoencoder technique was shown to work well on image classification tasks, correctly rejecting 97% of adversarial examples.
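As a sketch of that rejection rule, the snippet below fits a Gaussian per class to some latent representation and rejects test points whose Mahalanobis distance to their predicted class exceeds an explicit cutoff. The cutoff and the choice of latent space are assumptions here, not values taken from Ghosh et al.

```python
import numpy as np

def fit_class_gaussians(Z, labels):
    """Per-class mean and inverse covariance of latent vectors Z."""
    stats = {}
    for c in np.unique(labels):
        Zc = Z[labels == c]
        cov = np.cov(Zc, rowvar=False) + 1e-6 * np.eye(Z.shape[1])
        stats[c] = (Zc.mean(axis=0), np.linalg.inv(cov))
    return stats

def mahalanobis(z, mu, cov_inv):
    diff = z - mu
    return float(np.sqrt(diff @ cov_inv @ diff))

def accept(z, predicted_class, stats, cutoff=3.0):
    """Reject samples that sit too far from their predicted class."""
    mu, cov_inv = stats[predicted_class]
    return mahalanobis(z, mu, cov_inv) <= cutoff
```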

Similarly, we can build a neural network to generate fake data and teach any other model to reject it. By doing this recursively, we can create a generative adversarial model that produces 'noise' that is indistinguishable from real data (to the original model). By labelling this new data, we can train a model that is robust to this set of perturbations (Samangouei et al). However, this merely creates a new hyperparameter set and restarts the attack cycle with a new black box, even if it is computationally more expensive to attack. MagNet (Meng and Chen) seems to be the most well-cited example that relies on a classifier and a randomly selected autoencoder (to make attacks harder).
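For completeness, here is a minimal sketch of the randomized reconstruction-error check that MagNet-style defences rely on; `autoencoders` is assumed to be a list of already trained callables mapping an input to its reconstruction, and the threshold is illustrative.

```python
import numpy as np

def detect_adversarial(x, autoencoders, threshold, rng=np.random.default_rng()):
    """Pick one autoencoder at random and reject inputs it reconstructs poorly."""
    ae = autoencoders[rng.integers(len(autoencoders))]
    error = float(np.mean((ae(x) - x) ** 2))
    return error > threshold   # True -> treat input as adversarial
```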

In general, classifiers are trained with k possible classes, all of which are represented in the data set. More research should be done into providing an explicit NULL dataset (a reject class) and how that affects model robustness; a toy sketch of the idea follows.
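A toy version of that NULL-class idea, assuming tabular data: augment a k-class training set with random out-of-distribution points labelled as an extra class so the classifier can abstain. The uniform noise used here is a naive placeholder for a real out-of-distribution source.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def add_null_class(X, y, n_null=500, rng=np.random.default_rng(2)):
    """Append random points just outside the data range under a new label."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    X_null = rng.uniform(lo - 1.0, hi + 1.0, size=(n_null, X.shape[1]))
    y_null = np.full(n_null, y.max() + 1)          # the explicit NULL label
    return np.vstack([X, X_null]), np.concatenate([y, y_null])

rng = np.random.default_rng(2)
X = rng.normal(size=(1000, 4))
y = (X[:, 0] > 0).astype(int)
X_aug, y_aug = add_null_class(X, y)
clf = LogisticRegression(max_iter=500).fit(X_aug, y_aug)
```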

Prevention:

  • Smoothing/Depth Reduction
  • Explicit Envelopes
    • Autoencoders
    • GANS

Poisoning

Poisoning attacks target the model itself during training. All known techniques for preventing this type of attack rely on detecting adversarial examples. The most concrete example is intentionally mislabelling data, such that a stop sign is labelled as anything but a stop sign, in an attempt to confuse the model. Mechanical Turk, Captcha, and other labelling services likely prevent this by getting labels from many users and quantifying each user's consensus accuracy. I can't find citations on this, but I highly doubt Google implicitly trusts my captcha labels; it more likely relies on the inputs of many users to establish a ground truth. When examining the ability of users to poison global ratings on the MovieLens data set, Li et al (2016) showed it was possible to poison even very large datasets in ways that made attackers statistically indistinguishable from real users. The authors go on to suggest that ensemble methods can reduce the efficacy of such attacks, as discussed above.
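The consensus idea can be made concrete with a toy majority-vote filter: each sample is labelled by several annotators, and a label is only trusted when a clear majority agrees, limiting how far a single poisoning annotator can shift the dataset. The agreement threshold is illustrative.

```python
import numpy as np

def consensus_labels(votes, min_agreement=0.7):
    """votes: (n_samples, n_annotators) integer label matrix."""
    kept, labels = [], []
    for i, row in enumerate(votes):
        values, counts = np.unique(row, return_counts=True)
        top = counts.argmax()
        if counts[top] / len(row) >= min_agreement:
            kept.append(i)
            labels.append(values[top])
    return np.array(kept), np.array(labels)

votes = np.array([[1, 1, 1, 0],    # trusted: 75% agree on label 1
                  [0, 0, 1, 1],    # dropped: no clear majority
                  [2, 2, 2, 2]])   # trusted: unanimous label 2
print(consensus_labels(votes))
```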

However, there are various sampling techniques that can be used to generate more robust datasets. Mehta et al showed that various sampling techniques are useful in creating models that generalize well with high accuracy. They use a mix of random, stratified, and Gaussian sampling techniques, with efficacy seeming to vary by dataset rather than by model. Perhaps generating independent models from different population samples is an appropriate way forward; a minimal sketch follows.
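One way to realize that, sketched below, is to draw several stratified subsamples and fit an independent model to each, so no single poisoned region of the data dominates every model. The subset count and fraction are illustrative, and X and y are assumed to be NumPy arrays.

```python
from sklearn.model_selection import StratifiedShuffleSplit

def stratified_subsets(X, y, n_subsets=5, frac=0.4, seed=0):
    """Return independent, class-balanced subsamples of (X, y)."""
    splitter = StratifiedShuffleSplit(n_splits=n_subsets, train_size=frac,
                                      random_state=seed)
    return [(X[idx], y[idx]) for idx, _ in splitter.split(X, y)]
```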

Complexity

In the case of linear models, SVMs, and other matrix-based models, it is often easier to build several models from smaller data sets than to build a single large model, because complexity scales with the size of the sample space in any algorithm that uses a covariance matrix or eigenvalues (which is to say, the vast majority of classical ML techniques, including ICA, PCA, and SVMs).

In the case of neural networks, storage and processing complexity have limited research since their discovery in the 1960s. For each sample, each node computes a dot product and a non-linear activation function. This means the forward pass has complexity

$$ O(i \cdot j \cdot k \cdot n) $$

where $i$ is the column size (input dimension), $j$ is the network depth, $k$ is the number of classes (output layer size), and $n$ is the number of training samples. This is self-evident, since each feed-forward propagation step requires a computationally intensive dot product on large vectors. At the same time, the activation function has complexity

$$ O(j \cdot k \cdot n) $$

since it is applied once per node for each sample, rather than scaling with the input size. For simplicity, $j$ is treated as a single integer; in the case of non-uniform layer sizes it can be replaced with the average number of nodes per layer. Together, we see that

$$ O(i \cdot j \cdot k \cdot n + j \cdot k \cdot n) $$

or that

$$ O\left( (j \cdot k \cdot n)(i + 1) \right) $$

We can simplify this further by assuming that the number of classes is much smaller than the sample size, node count, or data dimensionality, so the $k$ term becomes negligible at scale. In this simplified form, it is easy to see that model training time is primarily a function of model complexity and the number of training samples, with significant but reduced influence from the dimensionality of the data.

$$ O\left( (j \cdot n)(i + 1) \right) $$

Furthermore, if we build a variety of models of similar size ($j$), it becomes clear that the dominant term is the sample size rather than the data dimensionality, especially in real-world examples:

$$ O(n \cdot i + n) $$

Modern datasets contain billions of images with, at most, around a million pixels each. In practice, models are exposed to a much more compressed version of these images: a ResNet trained on millions of images may only take in a 96 × 96 input, i.e. fewer than ten thousand dimensions. Having a larger sample size than dimension size is a common assumption in machine learning and linear algebra, so it is a safe generalization here as well.

The important point is that training many parallel models on smaller datasets is not computationally more expensive than training one large model on the combined dataset; a quick check using the expression above follows. However, we know that model accuracy improves greatly with data scale, especially for neural networks with a high number of parameters. A great deal of research has gone into transferability, pre-training, and other computational shortcuts that could allow for high model accuracy despite a small number of training samples. Further research will have to be conducted.
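A back-of-the-envelope check of that claim, using the cost expression derived above: splitting n samples evenly across m models of the same size leaves the total (by this proxy) unchanged. The numbers are arbitrary.

```python
def training_cost(i, j, k, n):
    """Cost proxy from the derivation above: O(i*j*k*n + j*k*n)."""
    return i * j * k * n + j * k * n

i, j, k, n, m = 10_000, 100, 10, 1_000_000, 8
one_big = training_cost(i, j, k, n)
many_small = m * training_cost(i, j, k, n // m)
print(one_big, many_small)   # equal whenever m divides n evenly
```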

Exploitation

I tried to use the Metasploit framework here to attack a large number of systems. This still requires considerable manual intervention, both in setting up the victim system and in executing the attack. The attack-side automation is better in the commercial version, but setting up servers to attack manually seems infeasible at scale. The far more elegant solution is to treat the rest of the internet as adversarial and simply use systems with public-facing IP addresses and no firewall rules, emulating the very edge networks that are the end goal.

Detection

Snort is a framework for logging network traffic for various purposes. It includes a packet logger and sniffer mode as well as a network intrusion detection system mode. Initial experiments will inform which variables to monitor on live systems (e.g. port, packet size, destination IP). It will be particularly useful for the implementation stage because it has highly extensible rules that are designed to be controlled by the user, allowing different models to be tested quickly and easily. In addition, Cisco has a commercial system called NetFlow that has been used for almost a decade to help manage massive bandwidth loads and detect suspicious traffic in routers on edge networks. Their architecture can inform ours.
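As a sketch of the feature extraction this implies, the snippet below pulls the fields named above (destination IP, port, packet size) out of a capture file using scapy. The path is a placeholder; in a real deployment Snort's own logs or NetFlow records would be the source instead.

```python
from scapy.all import IP, TCP, rdpcap

def extract_features(pcap_path="capture.pcap"):
    """Per-packet feature rows for a downstream anomaly detection model."""
    rows = []
    for pkt in rdpcap(pcap_path):
        if pkt.haslayer(IP) and pkt.haslayer(TCP):
            rows.append({
                "dst_ip": pkt[IP].dst,
                "dst_port": pkt[TCP].dport,
                "length": len(pkt),
            })
    return rows
```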

Prevention

Prevention requires a much larger project. In particular, any product must balance accuracy against false detection rates, a tradeoff that will always be specific to a particular customer and circumstance. However, there are several automation techniques, currently under-explored in academic machine learning circles, that could be pursued in future work dedicated to a particular implementation.

  • Detection escalation could be used to reduce computational load, escalating only when a suspicious connection or process is detected.

  • By setting up a honeypot, we can detect known suspicious IP addresses and use publicly available BGP/DNS data to establish relatively trustworthy addresses.

  • By scanning source code and administrator passwords for entropy, we can create automated systems that prevent many attacks at the source, ensuring that keys remain outside of public sources and that passwords are sufficiently random (a minimal sketch follows this list). Similar tools can be used to look for unsanitized inputs and known XSS vulnerabilities before code is pushed to a server.
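A minimal sketch of the entropy scanning mentioned in the last bullet: estimate the Shannon entropy of candidate strings, flag passwords that are too predictable, and flag long, high-entropy tokens in source code that look like embedded secrets. The thresholds are illustrative only.

```python
import math
from collections import Counter

def shannon_entropy(s):
    """Empirical Shannon entropy of a string, in bits per character."""
    counts = Counter(s)
    return -sum(c / len(s) * math.log2(c / len(s)) for c in counts.values())

def weak_password(pw, min_bits_per_char=2.5):
    return shannon_entropy(pw) < min_bits_per_char

def looks_like_secret(token, min_len=20, min_bits_per_char=3.5):
    return len(token) >= min_len and shannon_entropy(token) >= min_bits_per_char

print(weak_password("aaaa1234"))                   # True: repetitive, low entropy
print(looks_like_secret("AKIAIOSFODNN7EXAMPLE"))   # True with these thresholds
```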

Trust

In general, machine learning models develop trust by being consistently correct (by various accuracy metrics), by being intuitive (through model interpretability), and by being resistant to adversarial attacks designed to be undetectable to humans but stupefying to machines. These attacks can be malicious or can encode themselves in the data through sampling errors or researcher bias. Building models that are resistant to attack is crucial for a larger framework of artificial trust. I intend to invest significant time here, as outlined in the roadmap below.

Roadmap

  1. Choose data. Pick a few standard datasets from various disciplines (e.g. MNIST, NetFlow, or biomedical imaging data) to test anomaly detection methods. Cleaning and storing the data properly takes some time, but reduces friction later.

  2. Recreate current research into anomaly detection models (the bulk of this is already done). Begin investigation of decision theory and ensemble decisions from a rigorous mathematical standpoint.

  3. Recreate current research on adversarial attacks as a way to train a model against known attacks. For linear methods, this is trivial given that I know the trained parameters and that they are limited in number.

For neural networks, this pipeline will be useful for generating new data as well as replicating prior research.

  4. This pipeline will allow investigation of latent variables, data and researcher bias, and other hidden features by transforming the data into its latent space.

  5. Implement an edge-network system for monitoring traffic and logging suspicious behavior, incorporating the system design information learned above. This will allow me to gather massive amounts of real data on any open protocol and any IP address with a known registration, supplementing and growing existing data sets that rely on well-studied attacks. The advantage of this technique is that we can catch previously unknown attack vectors by their resulting network and usage patterns rather than relying on perfect system administration.

  6. From here on, we will evaluate sampling, modelling, and ensemble techniques that are useful for reliably finding new attacks, using the extensible framework developed piecewise over the previous steps. Not only will this allow me to test many different models by quickly iterating through variations in each pipeline layer, but those same iterations will build greater trustworthiness against bias, overfitting, and adversarial noise attacks.

Sources:

  • Ghosh et al. https://arxiv.org/pdf/1806.00081.pdf
  • Xu et al. https://arxiv.org/abs/1704.01155
  • Sharma, Yash and Pin-Yu Chen. https://openreview.net/forum?id=SkV0ptkvf
  • Yin et al. https://openreview.net/forum?id=SJeQEp4YDH
  • Samangouei et al. https://arxiv.org/pdf/1805.06605.pdf
  • Meng and Chen. https://arxiv.org/abs/1705.09064
  • Li et al. https://arxiv.org/pdf/1608.08182.pdf
  • Kumar and Lim. https://arxiv.org/pdf/1906.07175.pdf
