F feature blame plots #834

Draft
wants to merge 101 commits into
base: vara-dev
Changes from 1 commit
Commits
9722976
implement FeatureBlameReportExperiment
Apr 11, 2023
5d415a1
add test_repo to test projects
Apr 11, 2023
1405e8c
some name/type fixes
Apr 12, 2023
bb75269
correct binary?
Apr 12, 2023
44a4669
added print feature
Apr 13, 2023
fd99f6c
change config
Apr 13, 2023
11bf7ba
create bc with clang
Apr 13, 2023
5fe3a66
print out FBR
Apr 13, 2023
2603afc
remove unnecessary code
Apr 14, 2023
a1a9f4c
minor cleanup
Apr 14, 2023
edc3c4b
added new file
Apr 14, 2023
071b7c3
fixes for pre-commit hook
Apr 17, 2023
c52af3e
review changes
Apr 17, 2023
0deac4e
fixes by pre-commit
Apr 18, 2023
45f881f
Merge branch 'vara-dev' of github.com:se-sic/VaRA-Tool-Suite into f-G…
Apr 18, 2023
b3cad79
adapt code to work after pull from vara-dev
Apr 18, 2023
8a6f71e
adapted python fbr to new yaml fbr
May 22, 2023
2c56155
separated feature blame report
Jun 25, 2023
2b8e197
first implementations for plots of FeatureBlameReports
Jun 27, 2023
9d122fe
note
Jun 27, 2023
ca1240e
first plot
Jul 5, 2023
3d57365
Merge branch 'vara-dev' of github.com:se-sic/VaRA-Tool-Suite into f-F…
Jul 7, 2023
f3bf72b
Merge branch 'vara-dev' of github.com:se-sic/VaRA-Tool-Suite into f-G…
Jul 7, 2023
5e221e1
added detection of feature model
Jul 10, 2023
933f9ce
Merge branch 'vara-dev' of github.com:se-sic/VaRA-Tool-Suite into f-F…
Jul 10, 2023
873e512
Merge branch 'f-GenerateFeatureBlameReport' of github.com:se-sic/VaRA…
Jul 10, 2023
8fc18ef
added new dcfi plot
Jul 11, 2023
40dc326
Merge branch 'vara-dev' of github.com:se-sic/VaRA-Tool-Suite into f-F…
Jul 16, 2023
bfe3b47
new plots and table
Jul 17, 2023
a983b7a
Merge branch 'vara-dev' of github.com:se-sic/VaRA-Tool-Suite into f-F…
Jul 20, 2023
2883862
remove unused dependencies
Jul 24, 2023
88c869d
new data creation and plots for authors
Jul 24, 2023
4e73420
improve generation of author cfi data
Aug 21, 2023
d5c6fcc
implement table for structural report
Aug 21, 2023
46486ec
shorthand correction
Aug 25, 2023
912a4bc
remove unused import
Aug 25, 2023
28637bd
stacked author distribution plot for projects
Aug 25, 2023
8d89035
change occurrences of 'scope' to 'size'
Aug 25, 2023
6303928
FeatureSizeCorrPlot displays multiple case studies
Aug 25, 2023
322ed95
FeatureSizeCorrDFBRPlot + its data generation
Aug 28, 2023
0f5d6ba
tables can now show data for several projects
Aug 28, 2023
ffd9d3e
new feature interaction dis plot several projects
Aug 29, 2023
de87215
add new function num_active_commits to git_util.py
Aug 29, 2023
d482879
num impl features dis of commits is now stacked for every project
Aug 30, 2023
1f1aee8
num impl features dis of commits is now stacked for every project
Aug 30, 2023
2575ff3
num features affect through dataflow of commits is now stacked for ev…
Aug 30, 2023
ecee990
Merge branch 'f-FeatureBlamePlots' of github.com:se-sic/VaRA-Tool-Sui…
Aug 30, 2023
2b423eb
add changes lost in merge
Aug 30, 2023
11a6511
minor changes
Sep 4, 2023
e387265
improve row access, add variance to tables
Sep 4, 2023
41a7e38
Merge branch 'vara-dev' of github.com:se-sic/VaRA-Tool-Suite into f-F…
Sep 4, 2023
f73afda
added new pie chart for both structural/df-based interactions of commits
Sep 4, 2023
bd16a88
some adjustments for pie charts
Sep 5, 2023
22d90a8
improvements for author plots and data generation
Sep 6, 2023
dafb1ba
refactoring
Sep 7, 2023
c5ee49a
Merge branch 'vara-dev' of github.com:se-sic/VaRA-Tool-Suite into f-G…
Sep 7, 2023
6c7e685
changes for checks
Sep 11, 2023
ab8bd91
Merge branch 'vara-dev' of github.com:se-sic/VaRA-Tool-Suite into f-F…
Sep 11, 2023
173d0c3
Merge branch 'vara-dev' of github.com:se-sic/VaRA-Tool-Suite into f-G…
Sep 11, 2023
c0abdde
added missing FeatureModelProvider to dataflow report
Sep 11, 2023
d542167
adapt test_bb_config for merge of experiments
Sep 11, 2023
9aad357
swap rows and columns for tables
Sep 14, 2023
526c6ff
only consider successful report files
Sep 14, 2023
7280284
improve proportional df plot
Sep 14, 2023
519d96b
changes to review
Sep 14, 2023
4180986
can now get authors of submodules
Sep 15, 2023
60d101c
new proportional commit plots
Sep 15, 2023
da584b8
split sfbr-tables into three tables
Sep 18, 2023
f83e5b9
entries of rows are now rounded to two decimals
Sep 18, 2023
038234c
change computation of structural CFI to coincide with df-based CFI
Sep 19, 2023
4ff4e20
calc p-values for correlation coefficient as well
Sep 19, 2023
2f4d97c
adapt commit sfbr pie charts
Sep 20, 2023
dc5db5d
remove unused import
Sep 20, 2023
2f76768
implement new author dataflow analysis
Sep 25, 2023
42de982
adapt to changes in structural cfi collection
Sep 25, 2023
3d992e6
Merge branch 'f-GenerateFeatureBlameReport' of github.com:se-sic/VaRA…
Sep 25, 2023
1af8119
adapt generate_commit_scfi_data
Sep 25, 2023
e62aa45
work on sfbr-commit-eval-table
Sep 25, 2023
5848c18
fix mistake in return type
Sep 25, 2023
3e4cd69
Merge branch 'f-GenerateFeatureBlameReport' of github.com:se-sic/VaRA…
Sep 25, 2023
bcfe677
adapt data collection to changes in sfbr
Sep 26, 2023
3e3f7b9
fixes for sfbr-commit-eval-table
Sep 26, 2023
e9b3dc2
move util functions to bottom
Sep 27, 2023
2b34bdd
sort features for combined authors
Sep 27, 2023
f150746
new plot for structural interactions of commits
Sep 27, 2023
d85ed12
new struct author plot
Sep 27, 2023
5382403
improve structural feature data generation
Sep 28, 2023
7a283bd
plots and tables used for RQ1improved
Sep 28, 2023
19562ab
changes to dataflow plots
Sep 29, 2023
87131ce
refactoring for pre-commits
Sep 29, 2023
2b9571d
changes to structural plots/tables
Oct 3, 2023
fdc6c4f
changes to FeatureDFBRPlot
Oct 4, 2023
61a24d6
changes to commit dfbr plots and tables
Oct 4, 2023
539236a
changes to sfbr plots
Oct 5, 2023
de75092
new author plot
Oct 5, 2023
a64121a
changes to dfbr plots
Oct 9, 2023
8ca9088
changes to feature plots
Oct 12, 2023
24a4f71
show commit-sfbr in histplot
Oct 12, 2023
8047073
add plot type to feature-sfbr plot
Oct 13, 2023
1a4bcac
small changes to plots
Oct 17, 2023
b2485fe
correctly count number of active lines for commits
Oct 18, 2023
added new pie chart for both structural/df-based interactions of commits
removed unsatisfying commit dis plots
Simon Rüdiger Steuer committed Sep 4, 2023
commit f73afda66f31fc76c2a68c7121d1632b6ac2f8e9
207 changes: 123 additions & 84 deletions varats/varats/plots/feature_blame_plots.py
@@ -1,5 +1,7 @@
import typing as tp

import matplotlib.pyplot as pyplot
import numpy as np
import pandas as pd
import seaborn as sns

@@ -188,66 +190,71 @@ def generate(self) -> tp.List[Plot]:
]


def get_stacked_commit_data_for_case_studies(
case_studies: tp.List[CaseStudy], projects_data
) -> pd.DataFrame:
min_num_interacting_features = min([
min(project_data) for project_data in projects_data
])
max_num_interacting_features = max([
max(project_data) for project_data in projects_data
])
def get_pie_data_for_commit_data(
commit_data
) -> tp.Tuple[tp.List[int], tp.List[str]]:
min_num_interacting_features = min(commit_data)
max_num_interacting_features = max(commit_data)

rows = [
[i] + [0 for i in range(0, len(case_studies))] for i in
data = [
0 for _ in
range(min_num_interacting_features, max_num_interacting_features + 1)
]
add_s = lambda x: "" if x == 1 else "s"
labels = [
"Interacting with " + str(i) + " feature" + add_s(i) for i in
range(min_num_interacting_features, max_num_interacting_features + 1)
]

for project_data, index in zip(
projects_data, range(1,
len(case_studies) + 1)
for num_interacting_features in commit_data:
data[num_interacting_features - min_num_interacting_features] = data[
num_interacting_features - min_num_interacting_features] + 1

for i in range(
min_num_interacting_features, max_num_interacting_features + 1
):
for num_interacting_features in project_data:
rows[num_interacting_features - min_num_interacting_features
][index] = rows[num_interacting_features -
min_num_interacting_features][index] + 1
# i is the actual feature count; idx is its position in data/labels
idx = i - min_num_interacting_features
frac = data[idx] / len(commit_data)
if frac < 0.1:
labels[idx] = "Interacting with >=" + str(i) + " feature" + add_s(i)
labels = labels[:idx + 1]
data[idx] = np.sum(data[idx:])
data = data[:idx + 1]
break

return pd.DataFrame(
rows,
columns=['Number of Interacting Features'] +
[case_study.project_name for case_study in case_studies]
)
return (data, labels)


class CommitDisSFBRPlot(Plot, plot_name="commit_dis_sfbr_plot"):
class CommitSFBRPieChart(Plot, plot_name="commit_sfbr_pie_chart"):

def plot(self, view_mode: bool) -> None:
case_studies: tp.List[CaseStudy] = self.plot_kwargs["case_studies"]
case_study: CaseStudy = self.plot_kwargs["case_study"]

projects_data = [
get_structural_commit_data_for_case_study(case_study).
loc[:, "num_interacting_features"] for case_study in case_studies
]
data = get_stacked_commit_data_for_case_studies(
case_studies, projects_data
)
print(data)
data.set_index('Number of Interacting Features'
).plot(kind='bar', stacked=True)
commit_data = get_structural_commit_data_for_case_study(
case_study
).loc[:, "num_interacting_features"]
data, labels = get_pie_data_for_commit_data(commit_data)

def func(pct):
absolute = int(np.round(pct / 100. * len(commit_data)))
return f"{absolute:d}"

class CommitDisSFBRPlotGenerator(
fig, ax = pyplot.subplots()
ax.pie(data, labels=labels, autopct=lambda pct: func(pct))


class CommitSFBRPieChartGenerator(
PlotGenerator,
generator_name="commit-dis-sfbr-plot",
generator_name="commit-sfbr-pie-chart",
options=[REQUIRE_MULTI_CASE_STUDY]
):

def generate(self) -> tp.List[Plot]:
case_studies: tp.List[CaseStudy] = self.plot_kwargs.pop("case_study")
return [
CommitDisSFBRPlot(
self.plot_config, case_studies=case_studies, **self.plot_kwargs
)
CommitSFBRPieChart(
self.plot_config, case_study=case_study, **self.plot_kwargs
) for case_study in case_studies
]
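
Review note: the bucketing behind the pie charts is easier to check in isolation. Below is a minimal sketch, not the code in this commit. The function name `pie_data` and the `threshold` parameter are hypothetical, and the input is assumed to be a plain sequence of per-commit feature counts rather than a pandas Series. It counts commits per number of interacting features and, at the first bucket holding less than 10% of all commits, folds that bucket and everything after it into a single ">=" wedge.

```python
import typing as tp


def pie_data(
    commit_data: tp.Sequence[int], threshold: float = 0.1
) -> tp.Tuple[tp.List[int], tp.List[str]]:
    """Count commits per number of interacting features and merge the tail.

    Hypothetical helper for illustration; names are not from the PR.
    """
    lo, hi = min(commit_data), max(commit_data)

    # One bucket per feature count in [lo, hi], addressed by offset from lo.
    counts = [0] * (hi - lo + 1)
    for n in commit_data:
        counts[n - lo] += 1

    def label(n: int, prefix: str = "") -> str:
        return f"Interacting with {prefix}{n} feature{'' if n == 1 else 's'}"

    labels = [label(n) for n in range(lo, hi + 1)]

    # Fold the first small bucket and all later buckets into one wedge.
    for idx, count in enumerate(counts):
        if count / len(commit_data) < threshold:
            counts[idx:] = [sum(counts[idx:])]
            labels[idx:] = [label(lo + idx, ">=")]
            break
    return counts, labels


data, labels = pie_data([1] * 6 + [2] * 5 + [3, 4] + [5] * 7)
```

Note that the cut happens at the first small bucket, so a large bucket further out (the seven commits with 5 features above) is merged as well. Whether that is the intended reading of the chart is worth confirming.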


@@ -309,78 +316,110 @@ def get_commit_dataflow_data_for_case_study(
return data_frame


class FeatureInwardDataflowPlot(Plot, plot_name="feature_inward_dataflow_plot"):
class CommitDFBRPieChart(Plot, plot_name="commit_dfbr_pie_chart"):

def plot(self, view_mode: bool) -> None:
case_studies: tp.List[CaseStudy] = self.plot_kwargs["case_studies"]
projects_data = [
get_commit_dataflow_data_for_case_study(case_study)
for case_study in case_studies
]
for i in range(0, len(projects_data)):
projects_data[i] = projects_data[i].loc[projects_data[i]
["part_of_feature"] == 0]
projects_data[i] = projects_data[i].loc[:,
"num_interacting_features"]

data = get_stacked_commit_data_for_case_studies(
case_studies, projects_data
)
print(data)
data.set_index('Number of Interacting Features'
).plot(kind='bar', stacked=True)
case_study: CaseStudy = self.plot_kwargs["case_study"]
commit_data = get_commit_dataflow_data_for_case_study(case_study)
commit_data = commit_data.loc[commit_data["part_of_feature"] == 0]
# commit_data = commit_data.loc[
# commit_data["num_interacting_features"] > 0]
commit_data = commit_data.loc[:, 'num_interacting_features']

data, labels = get_pie_data_for_commit_data(commit_data)

def func(pct):
absolute = int(np.round(pct / 100. * len(commit_data)))
return f"{absolute:d}"

class FeatureInwardDataflowPlotGenerator(
fig, ax = pyplot.subplots()
ax.pie(data, labels=labels, autopct=lambda pct: func(pct))


class CommitDFBRPieChartGenerator(
PlotGenerator,
generator_name="feature-inward-dataflow-plot",
generator_name="commit-dfbr-pie-chart",
options=[REQUIRE_MULTI_CASE_STUDY]
):

def generate(self) -> tp.List[Plot]:
case_studies: tp.List[CaseStudy] = self.plot_kwargs.pop("case_study")
return [
FeatureInwardDataflowPlot(
self.plot_config, case_studies=case_studies, **self.plot_kwargs
)
CommitDFBRPieChart(
self.plot_config, case_study=case_study, **self.plot_kwargs
) for case_study in case_studies
]
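
Review note: both pie chart classes define the same nested `func` to turn the wedge percentage back into an absolute commit count. A small shared factory would avoid the duplication. This is a sketch only; `make_absolute_autopct` is a hypothetical name, and it assumes `total` equals the sum of the wedge values, which holds here because every commit falls into exactly one bucket.

```python
import typing as tp


def make_absolute_autopct(total: int) -> tp.Callable[[float], str]:
    """Return a formatter that maps a wedge percentage to an absolute count."""

    def fmt(pct: float) -> str:
        # pct is the wedge's share in percent; scale back to a count.
        return f"{int(round(pct / 100.0 * total)):d}"

    return fmt


fmt = make_absolute_autopct(20)
nine = fmt(45.0)
six = fmt(30.0)
```

The returned callable can be passed directly as `autopct=make_absolute_autopct(len(commit_data))`, which also makes the `lambda pct: func(pct)` wrapper unnecessary.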


class FeatureExistingInwardDataflowPlot(
Plot, plot_name="feature_existing_inward_dataflow_plot"
def get_stacked_proportional_feature_dataflow_data(
case_studies: tp.List[CaseStudy]
) -> pd.DataFrame:
rows = []
for case_study in case_studies:
data_commits = get_commit_dataflow_data_for_case_study(case_study)
number_commits = len(data_commits)

commits_part_of_feature = data_commits.loc[
data_commits['part_of_feature'] == 1]
fraction_interacting_commits_part_of_feature = len(
commits_part_of_feature.loc[
commits_part_of_feature['num_interacting_features'] > 0]
) / number_commits

commits_not_part_of_feature = data_commits.loc[
data_commits['part_of_feature'] == 0]
fraction_interacting_commits_not_part_of_feature = len(
commits_not_part_of_feature.loc[
commits_not_part_of_feature['num_interacting_features'] > 0]
) / number_commits

fraction_not_interacting_commits = len(
data_commits.loc[data_commits['num_interacting_features'] == 0]
) / number_commits

rows.append([
case_study.project_name,
fraction_interacting_commits_part_of_feature,
fraction_interacting_commits_not_part_of_feature,
fraction_not_interacting_commits
])

return pd.DataFrame(
data=rows,
columns=[
"Projects", ">= 1 Interaction Part of Feature",
">= 1 Interaction Not Part of Feature", "=0 Interactions"
]
)


class FeatureProportionalDataflowPlot(
Plot, plot_name="feature_proportional_dataflow_plot"
):

def plot(self, view_mode: bool) -> None:
case_studies: tp.List[CaseStudy] = self.plot_kwargs["case_studies"]
projects_data = [
get_commit_dataflow_data_for_case_study(case_study)
for case_study in case_studies
]
for i in range(0, len(projects_data)):
projects_data[i] = projects_data[i].loc[projects_data[i]
["part_of_feature"] == 0]
projects_data[i] = projects_data[i].loc[
projects_data[i]["num_interacting_features"] > 0]
projects_data[i] = projects_data[i].loc[:,
"num_interacting_features"]
data = get_stacked_commit_data_for_case_studies(
case_studies, projects_data
)
data = get_stacked_proportional_feature_dataflow_data(case_studies)
data = data.sort_values(by=['>= 1 Interaction Not Part of Feature'])
print(data)
data.set_index('Number of Interacting Features'
).plot(kind='bar', stacked=True)
plt = data.set_index('Projects').plot(
kind='bar', stacked=True, ylabel="Proportional Commit Count"
)
plt.legend(
title="Commit Kind", loc='center left', bbox_to_anchor=(1, 0.5)
)


class FeatureExistingInwardDataflowPlotGenerator(
class FeatureProportionalDataflowPlotGenerator(
PlotGenerator,
generator_name="feature-existing-inward-dataflow-plot",
generator_name="feature-proportional-dataflow-plot",
options=[REQUIRE_MULTI_CASE_STUDY]
):

def generate(self) -> tp.List[Plot]:
case_studies: tp.List[CaseStudy] = self.plot_kwargs.pop("case_study")
return [
FeatureExistingInwardDataflowPlot(
FeatureProportionalDataflowPlot(
self.plot_config, case_studies=case_studies, **self.plot_kwargs
)
]
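
Review note: the proportional plot splits each project's commits into three disjoint groups whose fractions should sum to 1. The sketch below illustrates that split on plain tuples instead of the data frame returned by `get_commit_dataflow_data_for_case_study`; the `commit_fractions` name and the tuple layout are assumptions made for the example. Keying each fraction by its label keeps values and column names from drifting apart.

```python
import typing as tp

# Assumed row layout: (part_of_feature, num_interacting_features)
CommitRow = tp.Tuple[int, int]


def commit_fractions(rows: tp.Sequence[CommitRow]) -> tp.Dict[str, float]:
    """Split commits into three disjoint groups and return their shares."""
    total = len(rows)
    in_feature = sum(1 for part, n in rows if part == 1 and n > 0)
    outside = sum(1 for part, n in rows if part == 0 and n > 0)
    no_interaction = sum(1 for _, n in rows if n == 0)
    return {
        ">= 1 Interaction Part of Feature": in_feature / total,
        ">= 1 Interaction Not Part of Feature": outside / total,
        "=0 Interactions": no_interaction / total,
    }


fractions = commit_fractions([(1, 2), (1, 0), (0, 1), (0, 3), (0, 0)])
```

For the five sample commits, one interacts from inside a feature, two interact from outside, and two do not interact at all.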