Update pytorch example (#185)
* fix bug in scoring failure

* update power example to support bit-depth search

* update result directories

* revert changes to example/power

* add bit depth example

* revert directory changes for power example

* fixed directory for mnist in power example

* fixed directory for mnist in power example

* use the latest torch torchvision torchaudio

* update workflow to push on PR

* removed double stage folders in log folder

* + more epochs for cifar100

* changed intervals from uniform to log uniform, made learning rate range larger

* strip whitespace, convert to numeric in compile script

* update git ignores

* support nb_epoch as defence choice

* remove adv_success from requirements

* add "NaN" to nones

* update afr script

* update mnist .dvc cache

* update cifar10 plots

* uncomment paretoset in plotting

* fix default defence bug and relative pathing in compile script

* moved plots to subfolder

* better configuration support

* fix compile script bug

* update compile and plots yaml for power example

* fix compile bug

* update plots

* include plot files in dvc

* update afr to read from conf file

* linting

* linting

* update pytorch example

* update pytorch afr.yaml (not working)

* split cleaning from plotting, but only working for examples/pytorch/mnist

* working cleaning script

* fix pytorch examples with new clean script

* remove debug check from parse_results

* make deckard a dependency of the parsing script

* made models.sh easier to read

* update afr for pytorch example

* update power example

* update dvc.lock for pytorch example

* update pytorch/cifar100

* update power/plots (not working)

* add docstrings to plots.py

* update power example with merge script

* add power data

* update configs

* add combined plots

* update afr models

* added support for dummy variables in afr

* ++combined_plots.py and fix afr bug

* add cifar100 L4 power data, commenting out everything else

* add varepsilon to attack params

* add dummy variables

* fix rounding bug

* update to newest plots

* newest plots for power example

* linting

* removed old afr file

* linting

* Merge branch 'fix-compile-script' of github.com:simplymathematics/deckard into fix-compile-script

* update conf

* fixed kepler script bug

* linting

* linting

* linting

* linting

* linting

* linting

* linting

* linting

* linting

* fixed cifar100 pytorch example script

* more resilient wait and cleaning scripts

* +GZIP example

* bug fixes

* fix latex nan bug

* bug fixes

* add index to compilation csv

* better defence merging

* fixed bug where x,y scale are None

* update cifar100 confs

* fixed cleaning bug

* fixed afr plot rendering bug

* add check for negative predict time

* update configs

* fixed failure rate bug and updated confs

* change plot default from eps to pdf

* fix bug in calculating failure rate when attack size != train size

* update mnist confs

* specify attack size at the command line

* linting

* update all plot configs

* Fix compile script (#172)

* added dummy vars, fixed plots

* fix afr.py bugs

* bad merge?

* merge

* fix bugs

* linting

* linting

* linting

* linting

* linting

* linting

* update linter

* update linter

* update linter

* update linter

* linting

* update setup, .gitignore

* fix failure rate bug (again)

* most up-to-date plots

* update failure rate from h to f in pytorch examples

* remove intercept and scale parameters from afr plots

* remove rows where the score is an error

* update plot confs for pytorch example

* allow setting filename from command line of AFR script

* plot legend tweaks

* linting

* linting

* Update Dockerfile

* update dockerfile

* linting

* update gzip configs

* better logging

* add url validation for data pipeline

* git rm

* update truthseeker yaml

* add gzip .gitignore

* gzip dvc changes

* add sampling during training

* more resilient find_best script

* fixed bug with finding min/max when data is non-numeric

* add support for url/local datasets

* add column dropping for data parsing

* find best for multi-objective search

* better cleaning for experiments without attacks

* better filetype support when plotting

* load distance matrix from disk (optionally)

* update confs for gzip

* update default params

* update .gitignore, add some models to torch_example

* refactor confs

* update pytorch confs

* minor bug fixes

* fix small bugs

* add resnet examples

* update pytorch experiment confs

* config changes

* increase timeline resolution

* config changes

* removed dvc cruft

* update cost normalization calculation

* update afr plotting

* remove partial effects from pytorch config, add support for aalen additive models

* update dvc file for pytorch plots

* better error handling

* update survival plots

* ++plots to newest overleaf

* dummy config changes

* config change

* re-run plot dvc.yaml

* config changes

* update .gitignore

* fixed file bug

* fix keyword bug

* change Coefficient plots to sym log scale

* add latex to result parsing

* added support for forloop stage parsing

* streamline some code

* config changes + predicting the metric with a model config chosen by key

* add plots yaml again

* stop tracking cifar100.yaml

* fix uncaught exception

* make dataset formatting more robust

* add a type check

* update default configs for each dataset to use env vars instead of hard-coded values for the number of jobs

* reconfigure the dvc pipeline for re-running and changing the number of jobs + adversarial success

* config changes

* better pytorch out of memory handling

* add normalization to trash metric

* better convergence error handling

* config changes

* linting

* linting

* stop tracking cifar100.yaml

* use pretrained models as initial weights

* better error handling

* remove cruft

* delete old configs

* rename parameter for clarity

* moved from plots to conf folder

* update dvc to work with last commit

* config changes for pytorch example

* linting

* update torch example to use nb_epochs instead of nb_epoch

* linting

* config updates

* fixed bad merge

* linting

* update .gitignore

* stop tracking params file

* removed overly verbose logging

* broke up attack scripts for better dvc tracking

* update pytorch confs

* add hashable object, better art type checking

* created hashable object for inheritance

* changed AFR to AFT

* add arbitrary set() dictionary to catplot

* add numeric casting to afr

* fix logging bug

* linting

* better art typing

* hashable object

* linting

* fixed hashing bug

* fix bug

* linting

---------

Co-authored-by: Mohammad Reza Saleh <[email protected]>
Co-authored-by: salehsedghpour <[email protected]>
3 people authored Aug 13, 2024
1 parent 6fa174c commit 1f6ca37
Showing 32 changed files with 217 additions and 2,847 deletions.
5 changes: 0 additions & 5 deletions deckard/base/attack/attack.py
@@ -134,7 +134,6 @@ def __init__(
self.attack_size = attack_size
self.init = AttackInitializer(model, name, **init)
self.kwargs = kwargs
logger.info("Instantiating Attack with id: {}".format(self.__hash__()))

def __hash__(self):
return int(my_hash(self), 16)
@@ -300,7 +299,6 @@ def __init__(
self.attack_size = attack_size
self.init = AttackInitializer(model, name, **init)
self.kwargs = kwargs
logger.info("Instantiating Attack with id: {}".format(self.__hash__()))

def __hash__(self):
return int(my_hash(self), 16)
@@ -493,7 +491,6 @@ def __init__(
self.attack_size = attack_size
self.init = AttackInitializer(model, name, **init)
self.kwargs = kwargs
logger.info("Instantiating Attack with id: {}".format(self.__hash__()))

def __hash__(self):
return int(my_hash(self), 16)
@@ -618,7 +615,6 @@ def __init__(
f"kwargs must be of type DictConfig or dict. Got {type(kwargs)}",
)
self.kwargs = kwargs
logger.info("Instantiating Attack with id: {}".format(self.__hash__()))

def __hash__(self):
return int(my_hash(self), 16)
@@ -813,7 +809,6 @@ def __init__(
kwargs.update(**kwargs.pop("kwargs"))
self.kwargs = kwargs
self.name = name if name is not None else my_hash(self)
logger.info("Instantiating Attack with id: {}".format(self.name))

def __call__(
self,
1 change: 0 additions & 1 deletion deckard/base/data/data.py
@@ -148,7 +148,6 @@ def save(self, data, filename):
:param filename: str
"""
if filename is not None:
logger.info(f"Saving data to {filename}")
suffix = Path(filename).suffix
Path(filename).parent.mkdir(parents=True, exist_ok=True)
if isinstance(data, dict):
9 changes: 0 additions & 9 deletions deckard/base/data/generator.py
@@ -51,9 +51,6 @@ class SklearnDataGenerator:
kwargs: dict = field(default_factory=dict)

def __init__(self, name, **kwargs):
logger.info(
f"Instantiating {self.__class__.__name__} with name={name} and kwargs={kwargs}",
)
self.name = name
self.kwargs = {k: v for k, v in kwargs.items() if v is not None}

@@ -91,9 +88,6 @@ class TorchDataGenerator:
kwargs: dict = field(default_factory=dict)

def __init__(self, name, path=None, **kwargs):
logger.info(
f"Instantiating {self.__class__.__name__} with name={name} and kwargs={kwargs}",
)
self.name = name
self.path = path
self.kwargs = {k: v for k, v in kwargs.items() if v is not None}
@@ -179,9 +173,6 @@ class KerasDataGenerator:
kwargs: dict = field(default_factory=dict)

def __init__(self, name, **kwargs):
logger.info(
f"Instantiating {self.__class__.__name__} with name={name} and kwargs={kwargs}",
)
self.name = name
self.kwargs = {k: v for k, v in kwargs.items() if v is not None}

2 changes: 0 additions & 2 deletions deckard/base/model/model.py
@@ -70,7 +70,6 @@ def __init__(self, **kwargs):
self.kwargs = kwargs

def __call__(self, data: list, model: object, library=None):
logger.info(f"Training model {model} with fit params: {self.kwargs}")
device = str(model.device) if hasattr(model, "device") else "cpu"
trainer = self.kwargs
if library in sklearn_dict.keys():
@@ -91,7 +90,6 @@ def __call__(self, data: list, model: object, library=None):
try:
start = process_time_ns()
start_timestamp = time()
logger.info(f"Fitting type(model): {type(model)} with kwargs {trainer}")
model.fit(data[0], data[2], **trainer)
end = process_time_ns()
end_timestamp = time()
30 changes: 13 additions & 17 deletions deckard/base/model/sklearn_pipeline.py
@@ -57,16 +57,10 @@ class SklearnModelPipelineStage:
kwargs: dict = field(default_factory=dict)

def __init__(self, name, stage_name, **kwargs):
logger.debug(
f"Instantiating {self.__class__.__name__} with name={name} and kwargs={kwargs}",
)
self.name = name
self.kwargs = kwargs
self.stage_name = stage_name

def __hash__(self):
return int(my_hash(self), 16)

def __call__(self, model):
logger.debug(
f"Calling SklearnModelPipelineStage with name={self.name} and kwargs={self.kwargs}",
@@ -76,7 +70,7 @@ def __call__(self, model):
stage_name = self.stage_name if self.stage_name is not None else name
while "kwargs" in kwargs:
kwargs.update(**kwargs.pop("kwargs"))
if "art." in str(type(model)):
if str(type(model)).startswith("art."):
assert isinstance(
model.model,
BaseEstimator,
@@ -102,7 +96,6 @@ class SklearnModelPipeline:
pipeline: Dict[str, SklearnModelPipelineStage] = field(default_factory=dict)

def __init__(self, **kwargs):
logger.debug(f"Instantiating {self.__class__.__name__} with kwargs={kwargs}")
pipe = {}
while "kwargs" in kwargs:
pipe.update(**kwargs.pop("kwargs"))
@@ -145,12 +138,12 @@ def __len__(self):
else:
return 0

def __hash__(self):
return int(my_hash(self), 16)

def __iter__(self):
return iter(self.pipeline)

def __hash__(self):
return int(my_hash(self), 16)

def __call__(self, model):
params = deepcopy(asdict(self))
pipeline = params.pop("pipeline")
@@ -172,7 +165,7 @@ def __call__(self, model):
elif isinstance(stage, SklearnModelPipelineStage):
model = stage(model=model)
elif hasattr(stage, "fit"):
if "art." in str(type(model)):
if str(type(model)).startswith("art."):
assert isinstance(
model.model,
BaseEstimator,
@@ -184,12 +177,15 @@ def __call__(self, model):
), f"model must be a sklearn estimator. Got {type(model)}"
if not isinstance(model, Pipeline) and "art." not in str(type(model)):
model = Pipeline([("model", model)])
elif "art." in str(type(model)) and not isinstance(
elif str(type(model)).startswith("art.") and not isinstance(
model.model,
Pipeline,
):
model.model = Pipeline([("model", model.model)])
elif "art." in str(type(model)) and isinstance(model.model, Pipeline):
elif str(type(model)).startswith("art.") and isinstance(
model.model,
Pipeline,
):
model.model.steps.insert(-2, [stage, model.model])
else:
model.steps.insert(-2, [stage, model])
@@ -213,6 +209,9 @@ class SklearnModelInitializer:
pipeline: SklearnModelPipeline = field(default_factory=None)
kwargs: Union[dict, None] = field(default_factory=dict)

def __hash__(self):
return int(my_hash(self), 16)

def __init__(self, data, model=None, library="sklearn", pipeline={}, **kwargs):
self.data = data
self.model = model
@@ -267,6 +266,3 @@ def __call__(self):
"fit",
), f"model must have a fit method. Got type {type(model)}"
return model

def __hash__(self):
return int(my_hash(self), 16)
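
The hunks above also tighten how the pipeline detects ART-wrapped models, moving from a substring test ("art." in str(type(model))) to a prefix test. A minimal, self-contained sketch of a prefix-based check (the is_art_wrapped helper is hypothetical, not deckard's API; note that str(type(obj)) renders as "<class 'art....'>", so a robust prefix test targets the class's __module__):

from sklearn.linear_model import LogisticRegression

def is_art_wrapped(model) -> bool:
    # True only for classes defined under the art.* package; a bare
    # substring test could also match unrelated paths such as
    # "<class 'mypart.models.Net'>".
    return type(model).__module__.startswith("art.")

print(is_art_wrapped(LogisticRegression()))  # False: plain sklearn estimator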
8 changes: 7 additions & 1 deletion deckard/base/utils/hashing.py
@@ -1,7 +1,7 @@
from hashlib import md5
from collections import OrderedDict
from typing import NamedTuple, Union
from dataclasses import asdict, is_dataclass
from dataclasses import asdict, is_dataclass, dataclass
from omegaconf import DictConfig, OmegaConf, SCMode, ListConfig
from copy import deepcopy
import logging
@@ -71,3 +71,9 @@ def to_dict(obj: Union[dict, OrderedDict, NamedTuple]) -> dict:

def my_hash(obj: Union[dict, OrderedDict, NamedTuple]) -> str:
return md5(str(to_dict(obj)).encode("utf-8")).hexdigest()


@dataclass
class Hashable:
def __hash__(self):
return int(my_hash(self), 16)
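
The new Hashable base class gives any dataclass a deterministic, content-based hash: my_hash serializes the instance to a dict and takes an md5 digest, so equal configurations hash identically across runs. A minimal usage sketch (ExperimentConfig is a made-up example, not a repo class):

from dataclasses import dataclass

from deckard.base.utils.hashing import Hashable, my_hash

@dataclass(eq=False)  # eq=False keeps the inherited __hash__; eq=True would reset it to None
class ExperimentConfig(Hashable):
    name: str = "mnist"
    epochs: int = 10

cfg = ExperimentConfig()
assert cfg.__hash__() == int(my_hash(cfg), 16)  # stable, content-derived value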
10 changes: 6 additions & 4 deletions deckard/layers/afr.py
@@ -75,7 +75,11 @@ def ccl(p):
ax = plt.gca()
T = model.duration_col
E = model.event_col

# Cast df to numeric DataFrame
for col in df.columns:
df[col] = pd.to_numeric(df[col], errors="raise")
# Drop NaNs
df = df.dropna()
predictions_at_t0 = np.clip(
1 - model.predict_survival_function(df, times=[t0]).T.squeeze(),
1e-10,
@@ -347,8 +351,6 @@ def plot_aft(
ax.set_xlabel(xlabel)
ax.set_ylabel(ylabel)
ax.set_title(title)
# symlog-scale the x-axis
# ax.set_xscale("linear")
ax.get_figure().tight_layout()
ax.get_figure().savefig(file)
plt.gcf().clear()
@@ -624,7 +626,7 @@ def make_afr_table(
pretty_dataset = dataset.upper()
aft_data = aft_data.round(2)
aft_data.to_csv(folder / "aft_comparison.csv")
logger.info(f"Saved AFR comparison to {folder / 'aft_comparison.csv'}")
logger.info(f"Saved AFT comparison to {folder / 'aft_comparison.csv'}")
aft_data = aft_data.round(2)
aft_data.fillna("--", inplace=True)
aft_data.to_latex(
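The ccl() hunk above coerces every column to numeric and drops missing rows before fitting the calibration curve. A standalone sketch of that coercion step on toy data (column names are illustrative only):

import pandas as pd

df = pd.DataFrame(
    {"adv_failure_rate": ["0.1", "0.5", None], "epochs": ["10", "20", "30"]},
)
for col in df.columns:
    # errors="raise" fails fast on genuinely non-numeric strings instead of
    # silently producing NaN the way errors="coerce" would
    df[col] = pd.to_numeric(df[col], errors="raise")
df = df.dropna()  # the None above became NaN; drop that row before fitting
print(df.dtypes)  # adv_failure_rate: float64, epochs: int64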
3 changes: 1 addition & 2 deletions deckard/layers/clean_data.py
@@ -81,7 +81,7 @@ def drop_rows_without_results(
logger.info(f"Shape of data before data before dropping na: {data.shape}")
data.dropna(axis=0, subset=[col], inplace=True)
after = data.shape[0]
logger.info(f"Shape of data before data after dropping na: {data.shape}")
logger.info(f"Shape of data after data after dropping na: {data.shape}")
percent_change = (before - after) / before * 100
if percent_change > 5:
# input(f"{percent_change:.2f}% of data dropped for {col}. Press any key to continue.")
@@ -593,7 +593,6 @@ def clean_data_for_plotting(
data = fill_na(data, fillna)
data = replace_strings_in_data(data, replace_dict)
data = replace_strings_in_columns(data, col_replace_dict)

if len(pareto_dict) > 0:
data = find_pareto_set(data, pareto_dict)
return data
3 changes: 3 additions & 0 deletions deckard/layers/plots.py
@@ -103,6 +103,7 @@ def cat_plot(
file = Path(file).with_suffix(filetype)
logger.info(f"Rendering graph {file}")
data = digitize_cols(data, digitize)
set_ = kwargs.pop("set", {})
if hue is not None:
data = data.sort_values(by=[hue, x, y])
logger.debug(
@@ -162,6 +163,8 @@
graph.set(xlim=x_lim)
if y_lim is not None:
graph.set(ylim=y_lim)
if len(set_) > 0:
graph.set(**set_)
graph.tight_layout()
graph.savefig(folder / file)
plt.gcf().clear()
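The cat_plot() hunks above pop an arbitrary set dictionary out of the plot kwargs and forward it to the seaborn grid, so plot configs can adjust any axes property without a dedicated argument. A sketch of the same pattern outside deckard (toy dataset; not cat_plot's real signature):

import seaborn as sns

kwargs = {"set": {"yscale": "log", "ylabel": "Total bill (log scale)"}}

data = sns.load_dataset("tips")
set_ = kwargs.pop("set", {})  # same pattern as the diff above
graph = sns.catplot(data=data, x="day", y="total_bill", kind="box")
if len(set_) > 0:
    graph.set(**set_)  # forwards arbitrary properties to each Axes
graph.savefig("tips_catplot.pdf")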
16 changes: 8 additions & 8 deletions examples/power/conf/afr.yaml
@@ -15,7 +15,7 @@ fillna:
weibull:
plot:
file : weibull_aft.pdf
title : Weibull AFR Model
title : Weibull AFT Model
labels:
"Intercept: rho_": "$\\rho$"
"Intercept: lambda_": "$\\lambda$"
@@ -36,7 +36,7 @@ weibull:
- "file": "weibull_epochs_partial_effect.pdf"
"covariate_array": "model.trainer.np_epochs"
"values_array": [1,10,25,50]
"title": "$S(t)$ for Weibull AFR"
"title": "$S(t)$ for Weibull AFT"
"ylabel": "$\\mathbb{P}~(T>t)$"
"xlabel": "Time $t$ (seconds)"
"legend_kwargs": {
@@ -46,7 +46,7 @@ cox:
cox:
plot:
file : cox_aft.pdf
title : Cox AFR Model
title : Cox AFT Model
labels:
"data.sample.random_state": "Random State"
"atk_value": "Attack Strength"
@@ -65,7 +65,7 @@ cox:
- "file": "cox_epochs_partial_effect.pdf"
"covariate_array": "model.trainer.np_epochs"
"values_array": [1,10,25,50]
"title": "$S(t)$ for Cox AFR"
"title": "$S(t)$ for Cox AFT"
"ylabel": "$\\mathbb{P}~(T>t)$"
"xlabel": "Time $t$ (seconds)"
"legend_kwargs": {
@@ -75,7 +75,7 @@ log_logistic:
log_logistic:
plot:
file : log_logistic_aft.pdf
title : Log logistic AFR Model
title : Log logistic AFT Model
labels:
"Intercept: beta_": "$\\beta$"
"Intercept: alpha_": "$\\alpha$"
@@ -96,7 +96,7 @@ log_logistic:
- "file": "log_logistic_epochs_partial_effect.pdf"
"covariate_array": "model.trainer.np_epochs"
"values_array": [1,10,25,50]
"title": "$S(t)$ for Log-Logistic AFR"
"title": "$S(t)$ for Log-Logistic AFT"
"ylabel": "$\\mathbb{P}~(T>t)$"
"xlabel": "Time $t$ (seconds)"
"legend_kwargs": {
@@ -106,7 +106,7 @@ log_normal:
log_normal:
plot:
file : log_normal_aft.pdf
title : Log Normal AFR Model
title : Log Normal AFT Model
labels:
"Intercept: sigma_": "$\\sigma$"
"Intercept: mu_": "$\\mu$"
@@ -127,7 +127,7 @@ log_normal:
- "file": "log_normal_epochs_partial_effect.pdf"
"covariate_array": "model.trainer.np_epochs"
"values_array": [1,10,25,50]
"title": "$S(t)$ for Log-Normal AFR"
"title": "$S(t)$ for Log-Normal AFT"
"ylabel": "$\\mathbb{P}~(T>t)$"
"xlabel": "Time $t$ (seconds)"
"legend_kwargs": {