Bachlorthesis Friedel #659

Draft
wants to merge 216 commits into
base: vara-dev
3398096
Merge vara-dev
Sinerum Jan 6, 2022
e8ebc4a
merge vara-dev
Sinerum Jan 14, 2022
96ea94a
add lines of code database and add plot comparing interactions and li…
Sinerum Feb 1, 2022
8bfedfe
make the interaction line comparison plot nicer
Sinerum Feb 7, 2022
d680703
Add plots comparing author contribution of lines and interactions
Sinerum Feb 15, 2022
b1aa3a9
Add commit evolution plot
Sinerum Feb 15, 2022
8e0593b
Add calc missing revisions for most of the new plots
Sinerum Feb 17, 2022
da4d951
fix format of commit-structure plot
Sinerum Feb 18, 2022
f13a65a
turn commit_structure plot into a scatter plot
Sinerum Mar 31, 2022
d08900c
color base_hashes acording to their author
Sinerum Apr 5, 2022
0e56b14
add legend for author colors
Sinerum Apr 6, 2022
99c178a
Clean up plots
Sinerum Apr 11, 2022
1d5416b
remove the restriction of the case_study from the data and only use i…
Sinerum Apr 27, 2022
63e2033
Pull in new BenchBuild version 6.3 (#597)
vulder Feb 4, 2022
9f45cdf
Implements a paper config specific git filter (#596)
vulder Feb 4, 2022
0dcd11c
Adds perf_tests into bb_config generation (#599)
vulder Feb 16, 2022
8331a0a
Fixes typos in documentation and adds required packages for arch and …
jonas-kaufmann Feb 16, 2022
cb82ab8
Implements code only revision sampling (#601)
vulder Feb 21, 2022
f9a9c23
Allow to build research tools in container. (#595)
boehmseb Feb 28, 2022
f05bed2
Implements revisions selection based on a time range (#602)
vulder Feb 28, 2022
79c3079
Add possibility to attach converters to CLI options. (#605)
boehmseb Mar 1, 2022
918fc07
Implement vara-cs cleanup all (#603)
Sinerum Mar 1, 2022
28c871d
Fix type error in `vara-cs package`. (#606)
boehmseb Mar 1, 2022
d4b2315
Fixes timestamp based revision sampling (#607)
vulder Mar 2, 2022
ef42cf9
Ignore blocked revisions by default (#608)
vulder Mar 2, 2022
022111a
Plot rework for the verifier-opt-plot (#527)
cormensratio Mar 2, 2022
2d9a654
Plot rework for the verifier-no-opt-plot (#526)
cormensratio Mar 2, 2022
a5cd4eb
Adds utility for automatic zipped report folders (#609)
vulder Mar 2, 2022
96208a4
Gracefully exit vara-run should benchbuild fail (#615)
vulder Mar 16, 2022
827e029
Adds blame interaction graphs (#452)
boehmseb Mar 16, 2022
4e90441
SZZ quality metric database + table (#477)
boehmseb Mar 16, 2022
125a5ef
Reintroduces verifier-opt-plot generator (#613)
cormensratio Mar 16, 2022
0e6ef5d
Fixes vara-cs view; using the user selected report type (#616)
vulder Mar 17, 2022
d20eb13
Bump version to 11.1.2
vulder Mar 17, 2022
aad09db
Bump version to 11.1.3
vulder Mar 17, 2022
1dc2fa1
Extends release docs with more details (#617)
vulder Mar 19, 2022
4e318cf
Refactor the constructor of the Plot class (#620)
Sinerum Mar 19, 2022
f992d50
Ports old format code to f-strings (#619)
Sinerum Mar 19, 2022
fc775ab
Remove unused imports (#624)
LuAbelt Mar 20, 2022
1fe8cb8
Additional verification for VaRA in buildsetup (#621)
LuAbelt Mar 20, 2022
1d28414
Avoid unnecessary git fetch and clone calls. (#622)
boehmseb Mar 20, 2022
84d755e
Removed check_required_args (#623)
LuAbelt Mar 20, 2022
9df6bd2
Implements concurrent multi report loading function (#625)
vulder Mar 20, 2022
5fcafdc
Added option to bb config generation to include test_projects (#627)
LuAbelt Mar 20, 2022
7cefce6
Changed old se-passau github links to new se-sic (#628)
LuAbelt Mar 20, 2022
aa3c326
Fixes multiple mypy issues (#626)
vulder Mar 21, 2022
9422d43
Pull project repository before generating CaseStudy (#604)
Sinerum Mar 21, 2022
6073dff
Implements basic experiment to measure white-box feature performance …
vulder Mar 23, 2022
db960f9
Report aggregates for time and TEF (#631)
jonas-kaufmann Apr 8, 2022
4fbbdaa
Improves varats/VaRA documentation (#635)
vulder Apr 13, 2022
f721d1c
Fix vara container install check to not run vara outside container (#…
boehmseb Apr 19, 2022
242ea80
Set mounting parameters according to the Buildah version when buildi…
Sinerum Apr 20, 2022
ce2fdc8
Introduces better VaRA documentation (#636)
vulder Apr 27, 2022
d37daeb
Create seperate Module for basic git command wrapper (#629)
Sinerum Apr 27, 2022
c382809
Pin crypto lib to older version to prevent import cycle (#637)
vulder Apr 27, 2022
faf0316
Rework table architecture (#618)
cormensratio Apr 28, 2022
74b5230
Adapt plots to changes in the api
Sinerum May 9, 2022
b93c49f
make ylables ShortCommit hashes
Sinerum May 11, 2022
176d9fd
Merge branch 'vara-dev' into BachlorThesisFriedel
Sinerum May 11, 2022
e179a4e
add more customize ability to my plots
Sinerum May 17, 2022
3551433
Adds more documentation for VaRA (#641)
vulder May 13, 2022
2c81653
Changes contains_source_code to use git show (#639)
Sinerum May 16, 2022
98905e6
Remove extra_test.sh from ci (#638)
Sinerum May 17, 2022
54eec84
USDT execution stats experiment, projects gzip and bzip2 (#642)
jonas-kaufmann May 17, 2022
902f221
Merge vara-dev
Sinerum Jan 6, 2022
e1718a3
merge vara-dev
Sinerum Jan 14, 2022
d27d535
Adds perf_tests into bb_config generation (#599)
vulder Feb 16, 2022
b499a85
Implements code only revision sampling (#601)
vulder Feb 21, 2022
5fec43c
Adds blame interaction graphs (#452)
boehmseb Mar 16, 2022
5b60716
Additional verification for VaRA in buildsetup (#621)
LuAbelt Mar 20, 2022
6b9a05c
Pull project repository before generating CaseStudy (#604)
Sinerum Mar 21, 2022
e3bd106
Create seperate Module for basic git command wrapper (#629)
Sinerum Apr 27, 2022
b2eaaeb
Adapt plots to changes in the api
Sinerum May 9, 2022
1bb8b26
make author coloration more clear
Sinerum May 23, 2022
dbadab1
revert changes to the calc surviving lines function made by merging
Sinerum May 23, 2022
f4aba1f
make the cells of the plot not square
Sinerum Jun 13, 2022
789a0aa
cleanup some of the plots
Sinerum Jun 22, 2022
962135a
convert most pandas operations to inplace to save memory
Sinerum Jun 28, 2022
1756470
Merge branch 'vara-dev' into BachlorThesisFriedel
Sinerum Jul 5, 2022
9429b06
rework commit_structure plot to show total loc and interactions at sa…
Sinerum Jul 7, 2022
a73a4ec
adapt survivng_lines_database to the new database type style
Sinerum Jul 7, 2022
5d7a571
Merge branch 'vara-dev' into BachlorThesisFriedel
Sinerum Jul 11, 2022
9c05929
Add function interaction graph
Sinerum Jul 15, 2022
91a1167
Bump version to 11.1.4
vulder Jul 13, 2022
7b49d37
Update projects.rst (#662)
bnico99 Jul 17, 2022
5587ce3
Adds table to compare `TimeReportAggregate` stats between multiple ex…
jonas-kaufmann Jul 17, 2022
b5ddf69
Table/Plot related fixes and improvements (#666)
boehmseb Aug 17, 2022
5215ddb
Pin new BB version (#670)
vulder Aug 23, 2022
6ed9efa
Fix docs build error caused by issues with the _typeshed module (#671)
boehmseb Aug 23, 2022
8a9376d
Bump version to 11.1.5
vulder Aug 23, 2022
f405d26
Use experiment type in addition to report type for result file select…
boehmseb Aug 25, 2022
91a807b
Draft config filename interface ideas (#669)
vulder Aug 26, 2022
f4b25ef
Extract author and commit counting functions to git_utils.py (#676)
boehmseb Sep 11, 2022
920cd76
Fixes click experiment selection (#674)
vulder Sep 11, 2022
1409928
Bump to develop LLVM-14 vara version (#678)
vulder Sep 11, 2022
0839eca
Fixes imports to enable doc building (#675)
vulder Sep 11, 2022
36f8d90
Adds vara/vara-llvm upgrade instructions into docs (#679)
vulder Sep 11, 2022
a689a4e
Fix VaRA upgrade version check (#677)
boehmseb Sep 11, 2022
2f52b2e
Support non-master branches (#672)
bnico99 Sep 15, 2022
a7b87d8
Makes pylintrc compatible with new pylint version (#680)
vulder Sep 22, 2022
b94b945
Adds workload support to varats (#667)
vulder Sep 29, 2022
bbf55a2
Fix typo in docs (#683)
danjujan Oct 10, 2022
e543f5e
Fix OpenSSL configure step for certain revisions. (#682)
boehmseb Oct 11, 2022
daa6c66
Add blame meta-data rewrite flags to blame report experiment. (#684)
boehmseb Oct 19, 2022
3292051
Removes old cryptography dep as the cycle is removed (#686)
vulder Oct 20, 2022
eebeaff
Adds fast_downward as new project (#663)
bnico99 Oct 20, 2022
af988d8
Add support for lightweight and annotated tags (#690)
bnico99 Oct 20, 2022
f596314
Implements error status tracking for zipped reports (#691)
vulder Oct 20, 2022
8460118
Add hypre project (#689)
bnico99 Oct 20, 2022
7bf608c
Fixes doc building errors due to cryptography cycles (#692)
vulder Oct 20, 2022
a394ccd
Add clasp project (#688)
bnico99 Oct 20, 2022
ca3cf03
Add z3 project (#687)
bnico99 Oct 20, 2022
6574410
Adds example workload for FPerfCSCollection (#685)
vulder Oct 20, 2022
b444165
Merge vara-dev
Sinerum Jan 6, 2022
4314ed1
merge vara-dev
Sinerum Jan 14, 2022
a687e6f
Implements basic experiment to measure white-box feature performance …
vulder Mar 23, 2022
d6e7e5f
Pin crypto lib to older version to prevent import cycle (#637)
vulder Apr 27, 2022
aaf7189
Merge vara-dev
Sinerum Jan 6, 2022
2bac5f6
merge vara-dev
Sinerum Jan 14, 2022
74bf09e
revert unwanted changes from merge
Oct 24, 2022
bdf8b2e
Fix plot config usage
Sinerum Oct 25, 2022
2cfb49a
Merge branch 'vara-dev' into BachlorThesisFriedel
Sinerum Nov 7, 2022
c4db13d
remove code that was accidentally added while merging
Sinerum Nov 7, 2022
4b4fda9
Update varats-core/varats/utils/git_util.py
Sinerum Nov 7, 2022
01adeb8
Update varats-core/varats/utils/git_util.py
Sinerum Nov 7, 2022
db0f0ca
remove old table and fix tick format for plot
Sinerum Feb 15, 2023
54fe851
Merge branch 'vara-dev' into BachlorThesisFriedel
Sinerum Mar 1, 2023
fab4b7a
Merge branch 'vara-dev' into BachlorThesisFriedel
Sinerum Mar 4, 2023
b132245
add type hints
Sinerum Mar 10, 2023
daedffc
add trendline plot
Sinerum Mar 10, 2023
2ebab9b
use ReportFilename for report_belongs to experiment and make blame_re…
Sinerum Mar 14, 2023
237fb8f
fix usage of ReportFilname in usage of file_belongs_to_experiment
Sinerum Mar 14, 2023
15b9e4c
separate commit survival and commit trend plots
Sinerum Mar 14, 2023
c566040
Merge branch 'vara-dev' into BachlorThesisFriedel
Sinerum Mar 15, 2023
17ede0b
cleanup
Sinerum Mar 15, 2023
33daedd
add author trend plot
Sinerum Mar 16, 2023
5953dc4
use new RepositoryAtCommit context handler in calc lines per commit
Sinerum Mar 16, 2023
1bb997f
use new RepositoryAtCommit context handler in calc lines per commit
Sinerum Mar 16, 2023
fe233b5
fix type error
Sinerum Mar 16, 2023
c716911
add proper commit_survival table
Mar 6, 2023
28cb440
Merge branch 'vara-dev' into BachlorThesisFriedel
Sinerum Mar 20, 2023
5838261
add non normalized commit trend and commit trend using the diff betw…
Mar 20, 2023
11e700a
add more trend plots
Sinerum Mar 22, 2023
97dc291
plot stand fosd
Sinerum Mar 26, 2023
1987f38
Merge branch 'vara-dev' into BachlorThesisFriedel
Sinerum Apr 3, 2023
5b5aa15
add commit_interaction_aggregate_database to aggregate interactions o…
Sinerum Apr 5, 2023
9dbe5de
add cleanup imports
Sinerum Apr 5, 2023
0b62a72
Merge branch 'vara-dev' into BachlorThesisFriedel
Sinerum Apr 27, 2023
fe3816b
show additional project information in cs gui
Sinerum Apr 27, 2023
7770cc4
add some docu to my data functions
Sinerum May 9, 2023
2357c5a
Merge branch 'vara-dev' into BachlorThesisFriedel
Sinerum May 17, 2023
5d0027b
add revision based compile to libxml2 for versions without cmake
Sinerum May 22, 2023
57267f6
add container tio z3
Sinerum May 22, 2023
f3e55e5
add container to bzip2
Sinerum May 23, 2023
c106f39
add container to hypre
Sinerum May 23, 2023
d482d83
add container to git
Sinerum May 23, 2023
d8cd267
add dependencies to git container
Sinerum May 23, 2023
38f9af4
Merge branch 'vara-dev' into BachlorThesisFriedel
Sinerum May 23, 2023
9a59f8a
fix tables with multi case study requirement
Sinerum May 23, 2023
cb74d80
add impact plots
Sinerum May 23, 2023
3e54306
add revision based compile functions and containers to bzip2
Sinerum May 24, 2023
1b7805c
fix revision base binaries for bzip2
Sinerum May 24, 2023
33cc34b
use type revision range for bzip revision based compilation
Sinerum May 24, 2023
808d12b
fix bzips revision based compile
Sinerum May 24, 2023
b31f71e
add asterisk project file
Jun 5, 2023
d95d4b7
add asterisk project to bb_config
Jun 5, 2023
2e699a0
add opencv project
Jun 5, 2023
adcf0bb
fix bb_conf entry for opencv
Jun 5, 2023
ad59dd9
fix docstring and unsude imports for opencv.py and asterisk.py
Jun 5, 2023
c0509ce
Merge branch 'vara-dev' into BachlorThesisFriedel
Sinerum Jun 7, 2023
570f9c6
fix opencv binary verification
Sinerum Jun 7, 2023
70bb894
add impact correlation table
Sinerum Jun 8, 2023
2170847
Merge branch 'vara-dev' into BachlorThesisFriedel
Sinerum Jun 14, 2023
8a62a95
Merge branch 'vara-dev' into BachlorThesisFriedel
Sinerum Jun 15, 2023
5b17993
add missing commits method to interaction scater plot
Sinerum Jun 19, 2023
5a9d88f
use new author uniquification for author plots
Sinerum Jun 19, 2023
82ad5a0
fix types in calc_surviving_lines
Jun 13, 2023
4965fc8
add author_interactions_database.py to pre calculate author interacti…
Jun 20, 2023
ab5a944
change contribution plots to a stacked plot
Sinerum Jun 22, 2023
b9d6ef7
add table for high impact revisions
Sinerum Jun 22, 2023
faf3ba0
use correct update function for name and mail_address sets to update …
Sinerum Jun 22, 2023
14eaab5
adapt to name cleanup of author.id
Sinerum Jun 22, 2023
95988f6
add sqlite as project
Sinerum Jun 23, 2023
0efe7ec
change author map generation to take project name6
Sinerum Jun 26, 2023
735c819
add RevisionImpact calc missing revisions
Sinerum Jun 26, 2023
69548e3
add RevisionImpact Distribution table
Sinerum Jun 26, 2023
fa4e4f0
adapt author map tests to new generation method
Sinerum Jun 26, 2023
c2781d0
adapt author plot and database to new author map generation
Sinerum Jun 26, 2023
de2d0be
generate plots for multiple case-studies at once6
Sinerum Jun 26, 2023
be7eada
fix calc surviving lines
Sinerum Jun 27, 2023
06e2abe
Merge branch 'vara-dev' into BachlorThesisFriedel
Sinerum Jun 28, 2023
2e5dd82
add case_study overview to gui
Sinerum Jun 30, 2023
3577f7c
make author_contribution_survival generate plots for multiple case st…
Sinerum Jun 30, 2023
92b3041
handle uncommited changes and unknown authors in author_interactions_…
Sinerum Jun 30, 2023
b1339fd
make is_experiment_excluded public
Sinerum Jun 30, 2023
f0b855c
combine interactions and lines per commit methods as both databases w…
Sinerum Jun 30, 2023
4d28f40
add hh as cpp file extension
Sinerum Jun 30, 2023
2cd1ca2
improve revision impact tables and plots
Sinerum Jun 30, 2023
8dea5ad
add option to disable individual kdes in scatter_plot_utils.py
Sinerum Jun 30, 2023
d8930b0
make single commit plot generate for multiple revisions
Sinerum Jun 30, 2023
d179d9e
add table for evolution of individual commits
Sinerum Jun 30, 2023
f522fd2
make auhtor contribution plot nice
Sinerum Jul 17, 2023
5b192d9
cleanup plots
Sinerum Jul 17, 2023
f7b2e7d
Cleanup plot lables
Jul 23, 2023
7d11b43
Cleanup author_contribution_survival.py
Aug 14, 2023
b5f82b2
Cleanup change map plots
Aug 14, 2023
26276ca
Remove tables
Aug 14, 2023
e7cf788
Cleanup revision_impact.py
Aug 14, 2023
e2b23d8
add simple test for calc_surviving lines
Sinerum Jul 26, 2023
1022b04
make cleanup author contribution plots
Sinerum Jul 26, 2023
6138f5d
make author interaction database compatible with old reports
Sinerum Jul 26, 2023
040c6d0
make revision for calc_surviving_lines optional use Head if None
Sinerum Jul 26, 2023
5249269
cleanup revision impact plots
Sinerum Jul 26, 2023
a41ed86
Cleanup interaction change distribution plots
Sinerum Jul 26, 2023
83d841b
add experiment success data tocs_metrics table
Sinerum Jul 26, 2023
ea69039
cleanup
Sinerum Jan 10, 2024
19 changes: 19 additions & 0 deletions tests/utils/test_git_util.py
@@ -28,6 +28,7 @@
get_submodule_head,
calc_code_churn_range,
RepositoryAtCommit,
calc_surviving_lines,
)


@@ -226,6 +227,24 @@ def test_contains_source_code_with(self) -> None:
)
)

    def test_calc_surviving_lines(self) -> None:
        lines = calc_surviving_lines(
            "MutliMethodAuthorCoordination",
            FullCommitHash("f2f294bdda48526915b5a018e7e91f9f80204269")
        )
        self.assertEqual(
            lines[FullCommitHash("28f1624bda75a0c2da961e2572f9eebc31998346")], 3
        )
        self.assertEqual(
            lines[FullCommitHash("9209cff2d5b6cf9b7b39020b43081bd840347be2")], 4
        )
        self.assertEqual(
            lines[FullCommitHash("ffb0fb502072846e081ac9f63f1eb86667197b95")], 3
        )
        self.assertEqual(
            lines[FullCommitHash("f2f294bdda48526915b5a018e7e91f9f80204269")], 9
        )


class TestChurnConfig(unittest.TestCase):
"""Test if ChurnConfig sets languages correctly."""
21 changes: 17 additions & 4 deletions uicomponents/CaseStudyGeneration.ui
@@ -10,7 +10,7 @@
<x>0</x>
<y>0</y>
<width>760</width>
<height>443</height>
<height>491</height>
</rect>
</property>
<property name="sizePolicy">
@@ -150,6 +150,19 @@
</property>
</widget>
</item>
<item>
<widget class="QComboBox" name="case_study"/>
</item>
<item>
<widget class="QComboBox" name="experiment"/>
</item>
<item>
<widget class="QCheckBox" name="cs_filter">
<property name="text">
<string>Filter CaseStudy</string>
</property>
</widget>
</item>
<item>
<widget class="QLineEdit" name="commit_search">
<property name="placeholderText">
@@ -358,8 +371,8 @@
<string>&lt;!DOCTYPE HTML PUBLIC &quot;-//W3C//DTD HTML 4.0//EN&quot; &quot;http://www.w3.org/TR/REC-html40/strict.dtd&quot;&gt;
&lt;html&gt;&lt;head&gt;&lt;meta name=&quot;qrichtext&quot; content=&quot;1&quot; /&gt;&lt;style type=&quot;text/css&quot;&gt;
p, li { white-space: pre-wrap; }
&lt;/style&gt;&lt;/head&gt;&lt;body style=&quot; font-family:'Ubuntu'; font-size:11pt; font-weight:400; font-style:normal;&quot;&gt;
&lt;p style=&quot;-qt-paragraph-type:empty; margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;&quot;&gt;&lt;br /&gt;&lt;/p&gt;&lt;/body&gt;&lt;/html&gt;</string>
&lt;/style&gt;&lt;/head&gt;&lt;body style=&quot; font-family:'Noto Sans'; font-size:10pt; font-weight:400; font-style:normal;&quot;&gt;
&lt;p style=&quot;-qt-paragraph-type:empty; margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; font-family:'Ubuntu'; font-size:11pt;&quot;&gt;&lt;br /&gt;&lt;/p&gt;&lt;/body&gt;&lt;/html&gt;</string>
</property>
</widget>
</item>
@@ -374,7 +387,7 @@ p, li { white-space: pre-wrap; }
<x>0</x>
<y>0</y>
<width>760</width>
<height>22</height>
<height>34</height>
</rect>
</property>
</widget>
4 changes: 2 additions & 2 deletions varats-core/varats/experiment/experiment_util.py
@@ -366,7 +366,7 @@ def report_spec(cls) -> ReportSpecification:
return cls.REPORT_SPEC

@classmethod
def file_belongs_to_experiment(cls, file_name: str) -> bool:
def file_belongs_to_experiment(cls, file_name: ReportFilename) -> bool:
"""
Checks if the file belongs to this experiment.

@@ -377,7 +377,7 @@ def file_belongs_to_experiment(cls, file_name: str) -> bool:
True, if the file belongs to this experiment type
"""
try:
other_short_hand = ReportFilename(file_name).experiment_shorthand
other_short_hand = file_name.experiment_shorthand
return cls.shorthand() == other_short_hand
except ValueError:
return False
2 changes: 1 addition & 1 deletion varats-core/varats/mapping/author_map.py
@@ -31,7 +31,7 @@ def __eq__(self, other) -> bool:
return False

def __str__(self) -> str:
return f"{self.name} <{self.mail}>"
return f"{self.name} {self.mail}"

def __repr__(self) -> str:
return f"{self.name} <{self.mail}>; {self.names},{self.mail_addresses}"
65 changes: 61 additions & 4 deletions varats-core/varats/utils/git_util.py
@@ -476,7 +476,7 @@ class Language(Enum):
value: tp.Set[str] # pylint: disable=invalid-name

C = {"h", "c"}
CPP = {"h", "hxx", "hpp", "cxx", "cpp", "cc"}
CPP = {"h", "hxx", "hpp", "cxx", "cpp", "cc", "hh"}

def __init__(self) -> None:
self.__enabled_languages: tp.List[ChurnConfig.Language] = []
@@ -1135,15 +1135,17 @@ def branch_has_upstream(
return tp.cast(bool, exit_code == 0)


class RepositoryAtCommit():
class RepositoryAtCommit:
"""Context manager to work with a repository at a specific revision, without
duplicating the repository."""

def __init__(self, project_name: str, revision: ShortCommitHash) -> None:
self.__repo = pygit2.Repository(
get_local_project_git_path(project_name)
)
self.__initial_head = self.__repo.head

self.__initial_head: pygit2.Reference = self.__repo.head
print(self.__initial_head.name)
self.__revision = self.__repo.get(revision.hash)

def __enter__(self) -> Path:
@@ -1155,4 +1157,59 @@ def __exit__(
exc_value: tp.Optional[BaseException],
exc_traceback: tp.Optional[TracebackType]
) -> None:
self.__repo.checkout(self.__initial_head)
self.__repo.checkout(
self.__initial_head, strategy=pygit2.GIT_CHECKOUT_FORCE
)
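
The restore-on-exit behaviour of `RepositoryAtCommit` can be illustrated without pygit2. The following is only a sketch of the pattern, assuming that the checkout of the requested revision happens in `__enter__` (that part of the diff is not shown here). `FakeRepo` and `AtRevision` are hypothetical stand-ins and are not part of varats.

```python
import typing as tp
from types import TracebackType


class FakeRepo:
    """Hypothetical stand-in for a repository; it only tracks the current ref."""

    def __init__(self, head: str) -> None:
        self.head = head

    def checkout(self, ref: str) -> None:
        self.head = ref


class AtRevision:
    """Check out ``revision`` on entry and restore the previous head on exit."""

    def __init__(self, repo: FakeRepo, revision: str) -> None:
        self._repo = repo
        self._initial_head = repo.head
        self._revision = revision

    def __enter__(self) -> FakeRepo:
        self._repo.checkout(self._revision)
        return self._repo

    def __exit__(
        self, exc_type: tp.Optional[tp.Type[BaseException]],
        exc_value: tp.Optional[BaseException],
        exc_traceback: tp.Optional[TracebackType]
    ) -> None:
        # Runs even if the body raised, so the repository is never left
        # at the temporary revision.
        self._repo.checkout(self._initial_head)


repo = FakeRepo("refs/heads/main")
with AtRevision(repo, "f2f294b") as checked_out:
    head_inside = checked_out.head
head_after = repo.head
```

Because `__exit__` returns `None`, an exception raised inside the `with` body still propagates after the original head has been restored.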


def calc_surviving_lines(
    project_name: str,
    revision: tp.Optional[FullCommitHash] = None
) -> tp.Dict[FullCommitHash, int]:
    """
    Get the surviving lines of older commits at a given revision.

    Args:
        project_name: project to analyze
        revision: revision to analyze at; ``HEAD`` is used if ``None``

    Returns:
        number of surviving lines per prior commit
    """
    churn_config = ChurnConfig.create_c_style_languages_config()
    file_pattern = re.compile(
        "|".join(churn_config.get_extensions_repr(r"^.*\.", r"$"))
    )
    # Avoid shadowing the builtin ``hash``.
    if revision is not None:
        revision_hash = revision.hash
    else:
        revision_hash = "HEAD"
    lines_per_revision: tp.Dict[FullCommitHash, int] = {}
    repo = pygit2.Repository(get_local_project_git_path(project_name))

    # Remember the current head so it can be restored afterwards.
    initial_head: pygit2.Reference = repo.head
    repo_folder = get_local_project_git_path(project_name)
    git(__get_git_path_arg(repo_folder), "checkout", "-f", revision_hash)
    files = git(
        __get_git_path_arg(repo_folder), "ls-tree", "-r", "--name-only",
        revision_hash
    ).splitlines()

    for file in files:
        if file_pattern.match(file):
            lines = git(
                __get_git_path_arg(repo_folder), "blame", "--root", "-l",
                f"{file}"
            ).splitlines()
            for line in lines:
                if line:
                    # Each blame line starts with the full commit hash.
                    last_change = line[:FullCommitHash.hash_length()]
                    try:
                        last_change = FullCommitHash(last_change)
                    except ValueError:
                        continue

                    if last_change in lines_per_revision:
                        lines_per_revision[last_change] += 1
                    else:
                        lines_per_revision[last_change] = 1

    git(__get_git_path_arg(repo_folder), "checkout", initial_head.name)
    return lines_per_revision
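
The per-commit counting step of `calc_surviving_lines` can be looked at in isolation. The following is a minimal sketch rather than the PR's implementation: it assumes blame output in the style of `git blame -l`, where every line starts with a 40-character commit hash. The helper name `count_lines_per_commit` and the sample lines are hypothetical.

```python
import re
import typing as tp
from collections import Counter

# Assumption: a full SHA-1 commit hash is 40 lowercase hex characters.
FULL_HASH = re.compile(r"^[0-9a-f]{40}$")


def count_lines_per_commit(blame_lines: tp.Iterable[str]) -> tp.Dict[str, int]:
    """Count how many blamed lines are attributed to each commit hash."""
    lines_per_commit: tp.Counter[str] = Counter()
    for line in blame_lines:
        candidate = line[:40]
        # Skip empty lines and lines that do not start with a full hash.
        if FULL_HASH.match(candidate):
            lines_per_commit[candidate] += 1
    return dict(lines_per_commit)


sample = [
    "a" * 40 + " (Alice 2022-01-01 1) int main() {",
    "b" * 40 + " (Bob   2022-02-01 2)   return 0;",
    "a" * 40 + " (Alice 2022-01-01 3) }",
    "",
]
result = count_lines_per_commit(sample)
```

Using a `Counter` removes the need for the explicit "key already present" branch when incrementing.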
156 changes: 156 additions & 0 deletions varats/varats/data/databases/author_interactions_database.py
@@ -0,0 +1,156 @@
import typing as tp

import pandas as pd

from varats.data.cache_helper import build_cached_report_table
from varats.data.databases.evaluationdatabase import EvaluationDatabase
from varats.data.reports.blame_report import (
    gen_base_to_inter_commit_repo_pair_mapping,
)
from varats.experiments.vara.blame_report_experiment import (
    BlameReportExperiment,
)
from varats.jupyterhelper.file import load_blame_report
from varats.mapping.author_map import Author, generate_author_map
from varats.mapping.commit_map import CommitMap
from varats.paper.case_study import CaseStudy
from varats.paper_mgmt.case_study import get_case_study_file_name_filter
from varats.project.project_util import (
    get_local_project_git_path,
    get_primary_project_source,
)
from varats.report.report import ReportFilepath
from varats.revision.revisions import (
    get_processed_revisions_files,
    get_failed_revisions_files,
)
from varats.utils.git_util import (
    create_commit_lookup_helper,
    UNCOMMITTED_COMMIT_HASH,
    CommitRepoPair,
)


class AuthorInteractionsDatabase(
    EvaluationDatabase,
    cache_id="author_contribution_data_base",
    column_types={
        "author_name": 'str',
        "author_mail": 'str',
        "internal_interactions": 'int32',
        "external_interactions": 'int32'
    }
):
    """Provides access to internal and external interactions of authors."""

    @classmethod
    def _load_dataframe(
        cls, project_name: str, commit_map: CommitMap,
        case_study: tp.Optional[CaseStudy], **kwargs: tp.Dict[str, tp.Any]
    ) -> pd.DataFrame:

        def create_dataframe_layout() -> pd.DataFrame:
            df_layout = pd.DataFrame(columns=cls.COLUMNS)
            df_layout = df_layout.astype(cls.COLUMN_TYPES)
            return df_layout

        def create_data_frame_for_report(
            report_path: ReportFilepath
        ) -> tp.Tuple[pd.DataFrame, str, str]:
            report = load_blame_report(report_path)
            base_inter_c_repo_pair_mapping = \
                gen_base_to_inter_commit_repo_pair_mapping(report)
            revision = report.head_commit

            def build_dataframe_row(
                author: Author, internal_interactions: int,
                external_interactions: int
            ) -> tp.Dict[str, tp.Any]:
                data_dict: tp.Dict[str, tp.Any] = {
                    'revision': revision.hash,
                    'time_id': commit_map.short_time_id(revision),
                    'author_name': author.name,
                    'author_mail': author.mail,
                    'internal_interactions': internal_interactions,
                    'external_interactions': external_interactions
                }
                return data_dict

            result_data_dicts: tp.Dict[Author, tp.Dict[str, tp.Any]] = {}
            amap = generate_author_map(project_name)
            repo_name = get_primary_project_source(project_name).local
            commit_lookup_helper = create_commit_lookup_helper(project_name)
            for base_pair in base_inter_c_repo_pair_mapping:
                if not base_pair.commit.repository_name.startswith(repo_name):
                    # Skip interactions with submodules
                    continue
                inter_pair_dict = base_inter_c_repo_pair_mapping[base_pair]
                if base_pair.commit.commit_hash == UNCOMMITTED_COMMIT_HASH:
                    continue
                base_commit = commit_lookup_helper(
                    CommitRepoPair(base_pair.commit.commit_hash, repo_name)
                )
                base_author = amap.get_author(
                    base_commit.author.name, base_commit.author.email
                )
                if base_author is None:
                    amap.add_entry(
                        base_commit.author.name, base_commit.author.email
                    )
                    base_author = amap.get_author(
                        base_commit.author.name, base_commit.author.email
                    )
                internal_interactions = 0
                external_interactions = 0
                for inter_pair, interactions in inter_pair_dict.items():
                    if (
                        inter_pair.commit.commit_hash == UNCOMMITTED_COMMIT_HASH
                        or not inter_pair.commit.repository_name.
                        startswith(repo_name)
                    ):
                        continue
                    inter_commit = commit_lookup_helper(
                        CommitRepoPair(
                            inter_pair.commit.commit_hash, repo_name
                        )
                    )
                    inter_author = amap.get_author(
                        inter_commit.author.name, inter_commit.author.email
                    )
                    if base_author == inter_author:
                        internal_interactions += interactions
                    else:
                        external_interactions += interactions
                if base_author in result_data_dicts:
                    result_data_dicts[base_author]['internal_interactions'
                                                  ] += internal_interactions
                    result_data_dicts[base_author]['external_interactions'
                                                  ] += external_interactions
                else:
                    result_data_dicts[base_author] = build_dataframe_row(
                        base_author, internal_interactions,
                        external_interactions
                    )

            return pd.DataFrame(
                list(result_data_dicts.values())
            ), report.head_commit.hash, str(report_path.stat().st_mtime_ns)

        report_files = get_processed_revisions_files(
            project_name,
            BlameReportExperiment,
            file_name_filter=get_case_study_file_name_filter(case_study)
        )

        failed_report_files = get_failed_revisions_files(
            project_name,
            BlameReportExperiment,
            file_name_filter=get_case_study_file_name_filter(case_study)
        )

        data_frame = build_cached_report_table(
            cls.CACHE_ID, project_name, report_files, failed_report_files,
            create_dataframe_layout, create_data_frame_for_report,
            lambda path: path.report_filename.commit_hash.hash,
            lambda path: str(path.stat().st_mtime_ns),
            lambda a, b: int(a) > int(b)
        )
        return data_frame
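
The internal/external split computed by `AuthorInteractionsDatabase` can be sketched with plain dictionaries. This is an illustration under simplifying assumptions, not the database code: commits are plain strings, the author lookup is a dictionary instead of an author map, and the name `split_interactions` and all sample data are hypothetical.

```python
import typing as tp

# Hypothetical stand-ins: commit hash -> author, and
# base commit -> {interacting commit -> interaction count}.
CommitAuthors = tp.Dict[str, str]
InteractionMap = tp.Dict[str, tp.Dict[str, int]]


def split_interactions(
    interactions: InteractionMap, authors: CommitAuthors
) -> tp.Dict[str, tp.Dict[str, int]]:
    """Sum interactions per base author, split by whether the interacting
    commit has the same author (internal) or a different one (external)."""
    result: tp.Dict[str, tp.Dict[str, int]] = {}
    for base_commit, inter_commits in interactions.items():
        base_author = authors[base_commit]
        row = result.setdefault(base_author, {"internal": 0, "external": 0})
        for inter_commit, amount in inter_commits.items():
            if authors[inter_commit] == base_author:
                row["internal"] += amount
            else:
                row["external"] += amount
    return result


authors = {"c1": "alice", "c2": "alice", "c3": "bob"}
interactions = {
    "c1": {"c2": 3, "c3": 2},
    "c3": {"c1": 4},
    "c2": {"c3": 1},
}
totals = split_interactions(interactions, authors)
```

As in the database code, several base commits by the same author accumulate into a single row for that author.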