Implements alternative comparison techniques
vulder committed Sep 26, 2023
1 parent 68fb8c0 commit 3a6c5d3
Showing 3 changed files with 107 additions and 70 deletions.
1 change: 1 addition & 0 deletions requirements.txt
@@ -1,5 +1,6 @@
benchbuild>=6.6.4
click>=8.1.3
cliffs-delta>=1.0.0
distro>=1.5.0
graphviz>=0.14.2
ijson>=3.1.4
1 change: 1 addition & 0 deletions varats/setup.py
@@ -44,6 +44,7 @@
"tabulate>=0.9",
"varats-core>=13.0.5",
"wllvm>=1.3.1",
"cliffs-delta>=1.0.0",
],
author="Florian Sattler",
author_email="[email protected]",
175 changes: 105 additions & 70 deletions varats/varats/data/databases/feature_perf_precision_database.py
@@ -7,6 +7,7 @@

import numpy as np
import pandas as pd
from cliffs_delta import cliffs_delta

Check failure on line 10 (GitHub Actions / mypy)
error: Skipping analyzing "cliffs_delta": module is installed, but missing library stubs or py.typed marker [import]
note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports
from scipy.stats import ttest_ind

import varats.experiments.vara.feature_perf_precision as fpp
@@ -123,6 +124,107 @@ def get_matching_event(
return feature_performances


def precise_pim_regression_check(

Check failure on line 127 (GitHub Actions / pylint)
C0116: Missing function or method docstring (missing-function-docstring)
baseline_pim: tp.DefaultDict[str, tp.List[int]],
current_pim: tp.DefaultDict[str, tp.List[int]]
) -> bool:
is_regression = False

for feature, old_values in baseline_pim.items():
if feature in current_pim:
if feature == "Base":
# The regression should be identified in actual feature code
continue

new_values = current_pim[feature]
ttest_res = ttest_ind(old_values, new_values)

# TODO: check, maybe we need a "very small value cut off"

Check warning on line 142 (GitHub Actions / pylint)
W0511: TODO: check, maybe we need a "very small value cut off" (fixme)
if ttest_res.pvalue < 0.05:
# print(
# f"{self.name} found regression for feature {feature}."
# )
is_regression = True
else:
print(f"Could not find feature {feature} in new trace.")
# TODO: how to handle this?

Check warning on line 150 (GitHub Actions / pylint)
W0511: TODO: how to handle this? (fixme)
raise NotImplementedError()
is_regression = True

Check failure on line 152 (GitHub Actions / mypy)
error: Statement is unreachable [unreachable]

Check warning on line 152 (GitHub Actions / pylint)
W0101: Unreachable code (unreachable)

return is_regression
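
The per-feature check above leaves open how to treat a feature that is missing from the new trace: the statement after `raise NotImplementedError()` is unreachable, as mypy and pylint report. A minimal sketch of one way to make that choice explicit follows. The names `per_feature_regression_check`, `differs`, and `raise_on_missing` are hypothetical and not part of this commit, and unlike the committed code the sketch skips "Base" before testing for membership.

```python
import typing as tp

Pim = tp.Mapping[str, tp.Sequence[int]]


def per_feature_regression_check(
    baseline_pim: Pim,
    current_pim: Pim,
    differs: tp.Callable[[tp.Sequence[int], tp.Sequence[int]], bool],
    raise_on_missing: bool = False,
) -> bool:
    """Hypothetical helper: report a regression if any non-"Base" feature
    differs between baseline and current, according to ``differs``."""
    is_regression = False

    for feature, old_values in baseline_pim.items():
        if feature == "Base":
            # The regression should be identified in actual feature code
            continue

        if feature not in current_pim:
            # Assumption: a missing feature is either an error or a regression.
            if raise_on_missing:
                raise KeyError(
                    f"Could not find feature {feature} in new trace."
                )
            is_regression = True
            continue

        if differs(old_values, current_pim[feature]):
            is_regression = True

    return is_regression


def toy_differs(old: tp.Sequence[int], new: tp.Sequence[int]) -> bool:
    """Toy comparator for illustration only: all new values exceed all old."""
    return min(new) > max(old)
```

With this shape, a t-test or a Cliff's delta comparison could be passed in as `differs`, so the loop and the missing-feature policy would only need to be written once.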


def cliffs_delta_pim_regression_check(

Check failure on line 157 (GitHub Actions / pylint)
C0116: Missing function or method docstring (missing-function-docstring)
baseline_pim: tp.DefaultDict[str, tp.List[int]],
current_pim: tp.DefaultDict[str, tp.List[int]]
) -> bool:
is_regression = False

for feature, old_values in baseline_pim.items():
if feature in current_pim:
if feature == "Base":
# The regression should be identified in actual feature code
continue

new_values = current_pim[feature]
d, res = cliffs_delta(old_values, new_values)

Check failure on line 170 (GitHub Actions / pylint)
C0103: Variable name "d" doesn't conform to snake_case naming style (invalid-name)

Check warning on line 170 (GitHub Actions / pylint)
W0612: Unused variable 'd' (unused-variable)

# print(f"{d=}, {res=}")

# if d > 0.70 or d < -0.7:
if res == "large":
# print(
# f"{self.name} found regression for feature {feature}."
# )
is_regression = True
else:
print(f"Could not find feature {feature} in new trace.")
# TODO: how to handle this?

Check warning on line 182 (GitHub Actions / pylint)
W0511: TODO: how to handle this? (fixme)
raise NotImplementedError()
is_regression = True

Check failure on line 184 (GitHub Actions / mypy)
error: Statement is unreachable [unreachable]

Check warning on line 184 (GitHub Actions / pylint)
W0101: Unreachable code (unreachable)

return is_regression
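
For readers unfamiliar with the effect size used above, here is a minimal pure-Python sketch of Cliff's delta. It is not the `cliffs-delta` package's implementation; `cliffs_delta_sketch` is a hypothetical name, and the magnitude thresholds (0.147, 0.33, 0.474) are the commonly cited ones, assumed here rather than taken from the commit.

```python
import typing as tp


def cliffs_delta_sketch(
    lst1: tp.Sequence[float], lst2: tp.Sequence[float]
) -> tp.Tuple[float, str]:
    """Naive O(n*m) Cliff's delta with a magnitude label (illustrative only)."""
    if not lst1 or not lst2:
        raise ValueError("both samples must be non-empty")

    # Count pairs (x, y) where x dominates y and where y dominates x.
    greater = sum(1 for x in lst1 for y in lst2 if x > y)
    less = sum(1 for x in lst1 for y in lst2 if x < y)
    delta = (greater - less) / (len(lst1) * len(lst2))

    # Assumed thresholds for |delta|; not confirmed by the source.
    magnitude = abs(delta)
    if magnitude < 0.147:
        label = "negligible"
    elif magnitude < 0.33:
        label = "small"
    elif magnitude < 0.474:
        label = "medium"
    else:
        label = "large"

    return delta, label
```

The value ranges from -1 (every baseline value is smaller than every current value) to 1 (the reverse), so a check on `res == "large"` fires in both directions. Under these assumed thresholds, the commented-out cut-off of 0.7 in the commit would be stricter than the "large" label.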


def sum_pim_regression_check(

Check failure on line 189 (GitHub Actions / pylint)
C0116: Missing function or method docstring (missing-function-docstring)
baseline_pim: tp.DefaultDict[str, tp.List[int]],
current_pim: tp.DefaultDict[str, tp.List[int]]
) -> bool:
# TODO: add some tests

Check warning on line 193 (GitHub Actions / pylint)
W0511: TODO: add some tests (fixme)
baseline_pim_totals: tp.List[tp.List[int]] = [
old_values for feature, old_values in baseline_pim.items()
if feature != "Base"
]
print(f"{baseline_pim_totals=}")
current_pim_totals: tp.List[tp.List[int]] = [
current_values for feature, current_values in current_pim.items()
if feature != "Base"
]
print(f"{current_pim_totals=}")

baseline_pim_total: tp.List[int] = [
sum(values) for values in zip(*baseline_pim_totals)
]
print(f"{baseline_pim_total=}")
current_pim_total: tp.List[int] = [
sum(values) for values in zip(*current_pim_totals)
]
print(f"{current_pim_total=}")

# TODO: does not work for large numbers

Check warning on line 214 (GitHub Actions / pylint)
W0511: TODO: does not work for large numbers (fixme)
return ttest_ind(baseline_pim_total, current_pim_total).pvalue < 0.05

Check failure on line 215 (GitHub Actions / mypy)
error: Returning Any from function declared to return "bool" [no-any-return]
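
The aggregation step in `sum_pim_regression_check` relies on `zip(*...)` to turn per-feature lists into per-run totals. The sketch below isolates that step; `total_per_run` is a hypothetical helper name, and the input values are made up for illustration.

```python
import typing as tp


def total_per_run(pim: tp.Mapping[str, tp.Sequence[int]]) -> tp.List[int]:
    """Sum the non-"Base" feature values of each run (column-wise)."""
    per_feature = [
        values for feature, values in pim.items() if feature != "Base"
    ]
    # zip(*per_feature) transposes the lists: one tuple per run, holding
    # that run's value for every feature. zip stops at the shortest list,
    # so features with fewer runs silently truncate the result.
    return [sum(run_values) for run_values in zip(*per_feature)]


totals = total_per_run({"Base": [100, 100], "A": [10, 20], "B": [1, 2]})
print(totals)  # [11, 22]
```

As for the mypy `no-any-return` error reported above, wrapping the comparison in `bool(...)` before returning it is one common way to satisfy the declared return type.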


def pim_regression_check(
baseline_pim: tp.DefaultDict[str, tp.List[int]],
current_pim: tp.DefaultDict[str, tp.List[int]]
) -> bool:
"""Compares two pims and determines if there was a regression between the
baseline and current."""
# return cliffs_delta_pim_regression_check(baseline_pim, current_pim)
return precise_pim_regression_check(baseline_pim, current_pim)


class Profiler():
"""Profiler interface to add different profilers to the evaluation."""

@@ -176,8 +278,6 @@ def is_regression(
self, report_path: ReportFilepath, patch_name: str
) -> bool:
"""Checks if there was a regression between the old and new data."""
is_regression = False

multi_report = fpp.MultiPatchReport(

Check failure on line 281 (GitHub Actions / mypy)
error: Module "varats.experiments.vara.feature_perf_precision" does not explicitly export attribute "MultiPatchReport" [attr-defined]
report_path.full_path(), TEFReportAggregate
)
@@ -198,27 +298,7 @@ def is_regression(
for feature, value in pim.items():
new_acc_pim[feature].append(value)

for feature, old_values in old_acc_pim.items():
if feature == "Base":
# The regression should be identified in actual feature code
continue

if feature in new_acc_pim:
new_values = new_acc_pim[feature]
ttest_res = ttest_ind(old_values, new_values)

# TODO: check, maybe we need a "very small value cut off"
if ttest_res.pvalue < 0.05:
# print(
# f"{self.name} found regression for feature {feature}."
# )
is_regression = True
else:
print(f"Could not find feature {feature} in new trace.")
# TODO: how to handle this?
is_regression = True

return is_regression
return pim_regression_check(old_acc_pim, new_acc_pim)


class PIMTracer(Profiler):
@@ -253,8 +333,6 @@ def is_regression(
self, report_path: ReportFilepath, patch_name: str
) -> bool:
"""Checks if there was a regression between the old and new data."""
is_regression = False

multi_report = fpp.MultiPatchReport(

Check failure on line 336 (GitHub Actions / mypy)
error: Module "varats.experiments.vara.feature_perf_precision" does not explicitly export attribute "MultiPatchReport" [attr-defined]
report_path.full_path(), fpp.PerfInfluenceTraceReportAggregate

Check failure on line 337 (GitHub Actions / mypy)
error: Module "varats.experiments.vara.feature_perf_precision" does not explicitly export attribute "PerfInfluenceTraceReportAggregate" [attr-defined]
)
@@ -274,28 +352,7 @@ def is_regression(
print(e)
return False

# TODO: same for TEF
for feature, old_values in old_acc_pim.items():
if feature in new_acc_pim:
if feature == "Base":
# The regression should be identified in actual feature code
continue

new_values = new_acc_pim[feature]
ttest_res = ttest_ind(old_values, new_values)

# TODO: check, maybe we need a "very small value cut off"
if ttest_res.pvalue < 0.05:
# print(
# f"{self.name} found regression for feature {feature}."
# )
is_regression = True
else:
print(f"Could not find feature {feature} in new trace.")
# TODO: how to handle this?
is_regression = True

return is_regression
return pim_regression_check(old_acc_pim, new_acc_pim)


class EbpfTraceTEF(Profiler):
@@ -311,8 +368,6 @@ def is_regression(
self, report_path: ReportFilepath, patch_name: str
) -> bool:
"""Checks if there was a regression between the old and new data."""
is_regression = False

multi_report = fpp.MultiPatchReport(

Check failure on line 371 (GitHub Actions / mypy)
error: Module "varats.experiments.vara.feature_perf_precision" does not explicitly export attribute "MultiPatchReport" [attr-defined]
report_path.full_path(), TEFReportAggregate
)
@@ -333,27 +388,7 @@ def is_regression(
for feature, value in pim.items():
new_acc_pim[feature].append(value)

for feature, old_values in old_acc_pim.items():
if feature == "Base":
# The regression should be identified in actual feature code
continue

if feature in new_acc_pim:
new_values = new_acc_pim[feature]
ttest_res = ttest_ind(old_values, new_values)

# TODO: check, maybe we need a "very small value cut off"
if ttest_res.pvalue < 0.05:
# print(
# f"{self.name} found regression for feature {feature}."
# )
is_regression = True
else:
print(f"Could not find feature {feature} in new trace.")
# TODO: how to handle this?
is_regression = True

return is_regression
return pim_regression_check(old_acc_pim, new_acc_pim)


def get_patch_names(case_study: CaseStudy) -> tp.List[str]:

Check failure on line 394 (GitHub Actions / pylint)
C0116: Missing function or method docstring (missing-function-docstring)