Version 3.5 #168
Merged
The shape of the legend for 'Vertices' was the same as the shape of the 'Author' vertex type. This is a little bit confusing when we have two different vertex types and vertices in the network: Then the shape of 'Vertices' in the legend is the same as for the vertex type 'Author' even if the vertex type is 'Artifact'. To avoid confusion, use another shape in the legend for 'Vertices' which is (usually) not used for vertex types. Then it is clearer that the color, not the shape, of 'Vertices' in the legend matters. Signed-off-by: Thomas Bock <[email protected]>
Signed-off-by: Thomas Bock <[email protected]>
Update the exemplary multi network, which is displayed in the README.md, to contain the shape changes in the legend (see f4fb480). In addition, add width and height parameters to the ggsave statement in the showcase.R file which generates this exemplary multi network. Signed-off-by: Thomas Bock <[email protected]>
Change shape of `Vertices` in the legend of plots to avoid confusion Reviewed-by: Claus Hunsen <[email protected]>
When calling the method 'ProjectData$reset.environment()', an error is produced: > Error in private$artifacts = NULL : > cannot add bindings to a locked environment This is due to a regression introduced in commit 1bed431, where the field 'ProjectData$artifacts' has been removed, but not the corresponding statement for resetting it. This is fixed now. Signed-off-by: Claus Hunsen <[email protected]>
Signed-off-by: Claus Hunsen <[email protected]>
Quick fix: Fix error when resetting a ProjectData environment Reviewed-by: Thomas Bock <[email protected]>
The get.commits.raw function was removed. Instead, the function get.commits should be used from now on. Signed-off-by: Jakob Kronawitter <[email protected]>
The artifact kind filtering which filters the commits.list file and only keeps the commits which have the correct artifact.type (configured in the ProjectConf class) has been moved to the get.commits method of the ProjectData class. Previously this functionality was in the get.commits.filtered method. Signed-off-by: Jakob Kronawitter <[email protected]>
In the case of a valid commits.list file with at least one commit line the read.commits function returns a data.frame with 16 columns containing all the commits read from the file. If the commits.list file is empty, however, it previously returned an empty data.frame with no columns. This has now been adjusted to return an empty data.frame with all the columns (16 columns), which should save a lot of additional if-else case distinctions later on because now the shape of the returned data.frame by the read.commits function is always the same. Signed-off-by: Jakob Kronawitter <[email protected]>
This major commit merges the two old methods get.commits.filtered and get.commits.filtered.empty of the ProjectData class into one new method again called get.commits.filtered. Similarly, the filter.commits.empty and filter.commits methods were merged into one new filter.commits method which now takes filter.untracked.files and artifact.filter.base as parameters which then control how the filtering is performed. The filter.untracked.files parameter was added to the ProjectConf which now controls - just like the artifact.filter.base parameter - which commits should be filtered out when calling the get.commits.filtered method. To call get.commits.filtered with other parameters (not the ones that are configured in the ProjectConf), one can call the get.commits.filtered.uncached version of this method. As the name implies, this method is not taking advantage of caching and should thus not be used too often. In the course of revamping these methods it only took a minor effort to rename the empty artifact to a more descriptive identifier, namely, "untracked files". Thus, this renaming was also performed in this commit. Signed-off-by: Jakob Kronawitter <[email protected]>
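The two-flag filtering described in this commit message can be pictured with a minimal sketch. It is not coronet's implementation: the function name `filter.commits.sketch`, the column names `file` and `artifact`, and the placeholder value `"Base_Feature"` for the base artifact are assumptions made for illustration only.

```r
# Hypothetical sketch (not coronet's code): filter a commit data.frame
# depending on two flags, mirroring the parameters named in the commit message.
filter.commits.sketch = function(commits, filter.untracked.files, artifact.filter.base,
                                 untracked = "<untracked.file>",  # assumed placeholder
                                 base = "Base_Feature") {         # assumed placeholder
    if (filter.untracked.files) {
        # drop commits that only touch the untracked-files meta-file
        commits = commits[commits[["file"]] != untracked, , drop = FALSE]
    }
    if (artifact.filter.base) {
        # drop commits to the base artifact
        commits = commits[commits[["artifact"]] != base, , drop = FALSE]
    }
    return(commits)
}

# illustrative data, not taken from any real project
commits = data.frame(
    hash = c("a1", "b2", "c3"),
    file = c("main.c", "<untracked.file>", "util.c"),
    artifact = c("Base_Feature", "", "FeatureA"),
    stringsAsFactors = FALSE
)

both = filter.commits.sketch(commits, TRUE, TRUE)
none = filter.commits.sketch(commits, FALSE, FALSE)
```

With both flags set, only the commit to `FeatureA` remains; with both flags unset, the data is returned unchanged.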
Signed-off-by: Jakob Kronawitter <[email protected]>
The new get.commits method includes filtering by artifact kind. Two testcases depended on this and thus have now been adjusted accordingly. 10 test cases of the test-split.R are still not working. Signed-off-by: Jakob Kronawitter <[email protected]>
The test cases were adapted to two of the new changes in the network library. The first one is the fact that the get.commits method now removes either 'Feature' or 'FeatureExpression' commits. The second one was the change that there are no dummy data.frames anymore (with zero columns and rows). Instead there are empty data.frames when no data exists (with columns but zero rows). One mistake was made during creation of these. The empty data.frames that are created did not contain any data type information (all columns defaulted to the 'logical' data type). If this is not wanted there now exists a new helper method which also takes care of data types. Signed-off-by: Jakob Kronawitter <[email protected]>
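The difference between an untyped and a typed empty data.frame can be sketched as follows. The helper name `create.empty.df.sketch` and the three column names are hypothetical; this is an illustration of the idea, not the helper that the commit adds.

```r
# Hypothetical helper (not coronet's implementation): build a zero-row
# data.frame whose columns have the requested names and data types.
create.empty.df.sketch = function(columns, data.types = NULL) {
    if (is.null(data.types)) {
        # without type information, every column is created as 'logical'
        data.types = rep("logical", length(columns))
    }
    stopifnot(length(columns) == length(data.types))
    # one zero-length vector of the requested type per column
    cols = lapply(data.types, function(type) vector(mode = type, length = 0))
    names(cols) = columns
    return(as.data.frame(cols, stringsAsFactors = FALSE))
}

# column names are illustrative only
untyped = create.empty.df.sketch(c("author.name", "hash", "diff.size"))
typed = create.empty.df.sketch(
    c("author.name", "hash", "diff.size"),
    c("character", "character", "integer")
)
```

Both results have three columns and zero rows, but only `typed` carries the intended column classes, so later code that relies on column types does not need a special case for empty input.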
Previously, when an author network was created and the untracked files artifact and the base artifact were included, edges have been created among the untracked files artifact and among the base artifact. This was now changed so that there are no edges created among untracked files at any time. For the base artifact it can be configured via the new base.artifact.edges parameter in the NetworkConf. Signed-off-by: Jakob Kronawitter <[email protected]>
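As a rough illustration of the rule described here, the sketch below derives author-author edges from commit data, never connecting authors via untracked files and connecting them via the base artifact only if configured. The function name `author.edges.sketch`, the column names, and the placeholder values `"<untracked.file>"` and `"Base_Feature"` are assumptions, and the sketch assumes that the `artifact` column holds the untracked-files placeholder.

```r
# Hypothetical sketch (not coronet's implementation): connect authors who
# changed the same artifact, with special handling for two artifacts.
author.edges.sketch = function(commits, edges.for.base.artifacts = TRUE,
                               untracked = "<untracked.file>",  # assumed placeholder
                               base = "Base_Feature") {         # assumed placeholder
    # authors are never connected via untracked files
    skipped = untracked
    if (!edges.for.base.artifacts) {
        skipped = c(skipped, base)
    }
    commits = commits[!(commits[["artifact"]] %in% skipped), , drop = FALSE]

    # one group of authors per remaining artifact
    groups = split(commits[["author.name"]], commits[["artifact"]])
    edges = lapply(groups, function(authors) {
        authors = sort(unique(authors))
        if (length(authors) < 2) {
            return(NULL)
        }
        pairs = t(combn(authors, 2))
        data.frame(from = pairs[, 1], to = pairs[, 2], stringsAsFactors = FALSE)
    })
    edges = do.call(rbind, unname(edges))
    if (is.null(edges)) {
        # keep the shape stable when there is nothing to connect
        edges = data.frame(from = character(0), to = character(0), stringsAsFactors = FALSE)
    }
    rownames(edges) = NULL
    return(edges)
}

# illustrative data
commits = data.frame(
    author.name = c("A", "B", "A", "C", "B", "C"),
    artifact = c("FeatureA", "FeatureA", "<untracked.file>", "<untracked.file>",
                 "Base_Feature", "Base_Feature"),
    stringsAsFactors = FALSE
)

with.base = author.edges.sketch(commits, edges.for.base.artifacts = TRUE)
without.base = author.edges.sketch(commits, edges.for.base.artifacts = FALSE)
```

In this example, `with.base` contains two edges (one from `FeatureA`, one from the base artifact), while `without.base` contains only the edge between `A` and `B`.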
Signed-off-by: Jakob Kronawitter <[email protected]>
Signed-off-by: Jakob Kronawitter <[email protected]>
The global constant 'UNTRACKED.FILE' is added to avoid reusage of the same string 'untracked.file' all the time. In addition minor adjustments are made to the documentation. Signed-off-by: Jakob Kronawitter <[email protected]>
In recent scenarios and in perspective of up-coming changes, the default behavior of 'Conf' objects upon initialization and update is changed: 1) The default values are *not* automatically checked against the allowed values anymore. This is mainly disabled to avoid confusion of users. The constructor of the class 'Conf' is adapted accordingly. 2) When updating a configuration value, the program execution is now stopped (using 'stop') on failure. Previously, the respective update has been ignored while issuing a warning. This change helps preventing confusion and analysis errors early in an analysis script. Accordingly, the parameter 'stop.on.error' to all update methods is removed. Furthermore, the code is streamlined, such that the super-constructor is called from both subclasses 'NetworkConf' and 'ProjectConf'. Some log statements are added/adjusted, too. Signed-off-by: Claus Hunsen <[email protected]>
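The second behavioral change (stopping instead of warning on an invalid update) might look roughly like the following sketch. The names `allowed.values.sketch` and `update.value.sketch` are hypothetical and a plain list stands in for a 'Conf' object; the allowed values for `artifact` are taken from the artifact levels mentioned elsewhere in this PR.

```r
# Hypothetical sketch (not the 'Conf' class): validate a value on update
# and stop execution instead of silently ignoring the update.
allowed.values.sketch = list(artifact = c("file", "function", "feature"))

update.value.sketch = function(conf, name, value) {
    if (!(name %in% names(allowed.values.sketch))) {
        stop(sprintf("Unknown configuration parameter '%s'", name))
    }
    if (!(value %in% allowed.values.sketch[[name]])) {
        stop(sprintf("Value '%s' is not allowed for parameter '%s'", value, name))
    }
    conf[[name]] = value
    return(conf)
}

conf = list(artifact = "feature")
conf = update.value.sketch(conf, "artifact", "file")

# an invalid update raises an error, which is caught here for demonstration
failed = tryCatch({
    update.value.sketch(conf, "artifact", "class")
    FALSE
}, error = function(e) TRUE)
```

Failing early in this way means that a typo in an analysis script surfaces at the update call rather than as a silently wrong configuration later on.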
When a network contains no edges but more than one node, all the nodes get combined. To fix this, the respective data frame, which contains the nodes, has to be transposed. This fixes #150. Reported-by: Jakob Kronawitter <[email protected]> Signed-off-by: Thomas Bock <[email protected]>
The edge creation process which does not draw any edges among authors of untracked files and - if configured in the 'ProjectConf' - does also not draw any edges among the base artifact authors is being reworked since the old way of achieving this was rather unintuitive and complicated. Signed-off-by: Jakob Kronawitter <[email protected]>
For commits to untracked files the artifact column has previously been the copied file column (for example when looking at the commit data returned by 'get.commits'). However this should only be the case when considering file level analysis (e.g. 'artifact == file' in the 'ProjectConf'). This commit changes this to the correct behaviour. So for 'artifact == function' and 'artifact == feature' the artifact column now only contains the empty string for untracked files. To avoid hardcoding this empty string in every affected place a global constant called 'UNTRACKED.FILE.EMPTY.ARTIFACT' was added. Signed-off-by: Jakob Kronawitter <[email protected]>
In a previous commit the constant 'UNTRACKED.FILE' was removed from the 'BASE.ARTIFACTS' constant due to temporary difficulties with this assignment. This change is now reverted. Signed-off-by: Jakob Kronawitter <[email protected]>
This commit changes an inline comment which was misleadingly talking about committers but actually meant commit authors. Signed-off-by: Jakob Kronawitter <[email protected]>
This commit renames the following three configuration options: - 'artifact.filter.base' to 'commits.filter.base.artifact', - 'filter.untracked.files' to 'commits.filter.untracked.files' - 'base.artifact.edges' to 'edges.for.base.artifacts'. Also the documentation gets slightly adjusted in one place because the old documentation contained outdated information. Signed-off-by: Jakob Kronawitter <[email protected]>
When constructing a network in 'construct.network.from.edge.list', several corner cases need to be handled. When there are no edges available, an empty edge list can be created using 'create.empty.edge.list'. This way, reliably, the function 'igraph::graph.data.frame' can be used to construct a network. This further improves the patch 0d7c222, which tackles #150. Tests for creating edgeless networks are added to the file 'tests/test-networks.R'. This likely prevents regressions in the future. Additionally, use the function 'create.empty.edge.list' in one further place where possible. Signed-off-by: Claus Hunsen <[email protected]>
Signed-off-by: Jakob Kronawitter <[email protected]>
This patch consists of three related fixes and adaptations: First, the method 'ProjectData$get.authors.by.data.source' does not correct the column names of the returned data.frame anymore. This establishes compatibility with the method 'ProjectData$get.authors'. Additionally, the returned data.frame only contains unique entries. The documentation is tidied. Second, the method 'NetworkBuilder$get.author.network.cochange' is fixed by adding the missing 'private$' prefix when accessing the project data. Third, the assignment of author vertices is corrected to use only author names with the correct vertex attribute (i.e., "name"). This adapts the code with respect to the first change mentioned above. This change fixes all failing tests in PR #149. Signed-off-by: Claus Hunsen <[email protected]>
Signed-off-by: Jakob Kronawitter <[email protected]>
Signed-off-by: Jakob Kronawitter <[email protected]>
Signed-off-by: Klara Schlueter <[email protected]>
Add spaces between "if" and "(", add documentation for default values, clarify example for finding minima in first activity computation. Signed-off-by: Klara Schlueter <[email protected]>
Apply documentation conventions and give input-output example for list.by.inner.level. Signed-off-by: Klara Schlueter <[email protected]>
..and remove unused parameter from helper function. Signed-off-by: Klara Schlueter <[email protected]>
Signed-off-by: Klara Schlueter <[email protected]>
Signed-off-by: Klara Schlueter <[email protected]>
Signed-off-by: Klara Schlueter <[email protected]>
Base active ranges computation on multiple data sources and adapt first activity Reviewed-by: Claus Hunsen <[email protected]> Reviewed-by: Thomas Bock <[email protected]>
This fixes #10. Yeah, the oldest open issue will be closed! :) To guide users through the renaming of submodules, an additional note is added to the README. Signed-off-by: Claus Hunsen <[email protected]>
This fixes #157. Additionally, add a note on proper setting of comments. Signed-off-by: Claus Hunsen <[email protected]>
To fulfill the R coding conventions, class documentation for the following classes is added: - 'Conf', - 'ProjectConf', - 'NetworkConf', - 'ProjectData', - 'RangeData', and - 'NetworkBuilder'. Signed-off-by: Claus Hunsen <[email protected]>
For more consistency and coherence, the definitions and functions in the file 'util-read.R' are re-ordered to give rise to the sections 'Main data sources' and 'Additional data sources'. Each section contains subsections with the corresponding functions and constants for the single data sources. Signed-off-by: Claus Hunsen <[email protected]>
Fix crash behaviour of function 'get.author.class' which occurred whenever a zero-row dataframe was passed in the parameter 'author.data.frame'. The fix is realized without a parameter check but instead with two slight manipulations to the existing code. - The expression '1:author.class.threshold.idx' is replaced with 'seq_len(author.class.threshold.idx)' to always produce an integer vector of length 'author.class.threshold.idx', specifically in the case of 'author.class.threshold.idx' being zero which occurs when a zero-row dataframe is passed through the parameter 'author.data.frame'. - The function 'sapply' shows a strange behaviour whenever an empty vector is passed as the first argument (in this case 'author.cumsum'). It always returns vectors having the same length as the first argument, however, when the first argument is a vector of length zero, it returns an empty list instead of an empty vector. Therefore, an 'as.logical' statement is added to ensure that there is always a (logical) vector being returned. The two above mentioned changes allow the function to handle zero-row dataframes being passed correctly without additional parameter checks. In addition, a call to 'suppressWarnings' is used to hide the warning that was output when the function 'min' gets called on an empty vector. The warning informed about 'min' returning an infinity value since no minimum value could be found in the empty vector, however, this special case is handled in the following 'if' statement anyway, so there is no need to show this warning to the user. This fixes #164. Signed-off-by: Jakob Kronawitter <[email protected]>
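The three base-R behaviours that this fix relies on can be reproduced in isolation. The snippet below is a standalone demonstration, not the code of 'get.author.class'; the variable names and the threshold value 10 are illustrative.

```r
# 1) '1:n' counts downwards for n = 0, whereas 'seq_len(n)' yields an empty vector
n = 0
backwards = 1:n          # two elements: 1 and 0
empty.idx = seq_len(n)   # zero elements

# 2) 'sapply' on a zero-length vector returns an empty list, not an empty vector
author.cumsum = numeric(0)
raw = sapply(author.cumsum, function(x) x >= 10)
flags = as.logical(raw)  # coerce back to a (zero-length) logical vector

# 3) 'min' on an empty vector returns Inf and emits a warning;
#    the warning is suppressed and the special case is handled explicitly
first.idx = suppressWarnings(min(which(flags)))
if (is.infinite(first.idx)) {
    first.idx = 0
}
```

Indexing with `backwards` would select one element and issue index 0, which is why `1:n` is a common source of bugs on empty input, while indexing with `empty.idx` selects nothing.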
To ensure that there are no regressions in the future, the case that an empty data.frame is given to the function 'get.author.class' needs to be incorporated in the test suite. This relates to issue #164. Signed-off-by: Claus Hunsen <[email protected]>
Given the corner case that an empty network is given to the function 'add.vertex.attribute.*' or none of the vertices in the network is assigned a value, the vertex attribute is now added manually and by force using 'add.attributes.to.network'. The default value is then assigned as defined by the immediate call to 'add.vertex.attribute.*'. Two test cases are added: - addition of vertex attributes to empty networks, and - addition of vertex attributes to non-empty networks, but with usage of the default value (adaptation of first new test case). Additionally, to foster readability inside the function 'add.vertex.attribute', the local variable 'attr.df' is renamed to 'attrs.by.vertex.name' – since it is no data.frame. This fixes #165. Signed-off-by: Claus Hunsen <[email protected]> Signed-off-by: Thomas Bock <[email protected]>
For the case that a data.frame with less than two columns is given to the function 'get.author.class', the input data is reset to coincide with the specification given in the function documentation. This is related to #164 and 8060caa. Additionally, a short improvement to the documentation of 'get.author.class', as the column denoted by 'calc.base.name' does not necessarily need to be the second column. Signed-off-by: Claus Hunsen <[email protected]> Signed-off-by: Thomas Bock <[email protected]>
Signed-off-by: Claus Hunsen <[email protected]>
Signed-off-by: Claus Hunsen <[email protected]>
Signed-off-by: Claus Hunsen <[email protected]>
To convey our goal and what the acronym 'coronet' stands for, a short explanation on the library name is added to the file 'README.md'. This relates to #10, 929f8ce, and has been suggested by @bockthom. Signed-off-by: Claus Hunsen <[email protected]>
Signed-off-by: Claus Hunsen <[email protected]>
…edges When a network or network range contains only one vertex and no edges, the classification into core and peripheral developers now classifies the one author as core. [Claus: Add second test, apply coding conventions, and do small refactorings. Adjust commit title.] Signed-off-by: Christian Hechtl <[email protected]> Signed-off-by: Claus Hunsen <[email protected]>
Signed-off-by: Claus Hunsen <[email protected]>
Project renaming and minor fixes Reviewed-by: Thomas Bock <[email protected]>
For consistency, the .Rproj file that is distributed with the repository is renamed to reflect the upcoming name change of the repository. Signed-off-by: Claus Hunsen <[email protected]>
Signed-off-by: Claus Hunsen <[email protected]>
As everything has been reviewed already, I will merge as soon as the TravisCI tests pass.
3.5
This is the PR for releasing the version 3.5 of coronet. Thank you very much for all contributions, props to all contributors!
Announcement
- The project is renamed to `coronet` (Rename the project #10, 929f8ce, ac1ce80)

Added
- `UNTRACKED.FILE`, `UNTRACKED.FILE.EMPTY.ARTIFACT`, and `UNTRACKED.FILE.EMPTY.ARTIFACT.TYPE`: Commits that do not change any artifact are considered to be carried out on a meta-file called `<untracked.file>`. The constant `UNTRACKED.FILE` is added to hold the string constant. Analogously, the constants `UNTRACKED.FILE.EMPTY.ARTIFACT` (currently, `""`) and `UNTRACKED.FILE.EMPTY.ARTIFACT.TYPE` (currently, `""`) hold the constants for any artifacts and their corresponding types, respectively, "changed" in untracked files. (11428d9, 5ea65b9, dde0dd7, 2284bbe)
- `ProjectData$get.commits.filtered.uncached`: The method allows for external filtering of the commits by specifying if untracked files and/or the base artifact should be filtered (this method does not take advantage of caching, whereas the method `ProjectData$get.commits.filtered` does) (11428d9)
- `commits.filter.base.artifact` and `commits.filter.untracked.files` to the `ProjectConf`: In addition to the `ProjectConf` parameter `commits.filter.base.artifact` (previously called `artifact.filter.base`), which configured whether the base artifact should be included in the `get.commits.filtered` method, there is now a similar parameter called `commits.filter.untracked.files` doing the same thing for untracked files (11428d9, 466d8eb)
- `edges.for.base.artifacts` to `NetworkConf`: In author networks, edges do not get constructed anymore between authors for solely modifying untracked files. For authors involved in changing the base artifact, it can be configured whether edges should be created or not using the new `NetworkConf` parameter `edges.for.base.artifacts` (c60c2f6, 466d8eb)
- `ProjectData$get.authors.by.data.source` to retrieve authors by given data-source name (Change commit filtering and network building regarding the untracked files and base artifact #149, 6580427, 137d833)
- `create.empty.data.frame`: The function returns empty data.frames (0 rows) with correct columns and, if specified, all the correct data types. In the future, functions, that return data in data.frames, should always return data.frames of the same shape (regarding columns and data types) – especially when they are empty – because this makes later case distinctions easier or unnecessary (67a4fbe, 3513647)
- `create.empty.authors.list`, `create.empty.commits.list`, `create.empty.issues.list`, `create.empty.mails.list`, `create.empty.synchronicity.list`, `create.empty.pasta.list` as well as corresponding constants holding columns and associated data types for all these empty data.frames (5f0f529, 523daef, f8e021d, 3513647, 2f4e6f0, cd3e34a)
- `create.empty.network` if wanted (cae9d4b, cc8bd86)
- `create.empty.vertex.list` (c00101d)
- `restrict.classification.to.authors` to the functions `get.author.class.by.type`, `get.author.class.overview`, `get.author.class.network.degree`, `get.author.class.network.eigen`, `get.author.class.network.hierarchy`, `get.author.class.commit.count` and `get.author.class.loc.count`. The parameter allows to perform classifications on a limited group of authors whose names are specified in this parameter. (2492dd0, Optimizations for network-based core-peripheral classification #148)
- `util-core-peripheral.R` by adding the new file `test-core-peripheral.R` along with test cases (2627d6c)
- `issues.from.source` to choose if only issues from JIRA, only issues from GitHub, or all issues shall be read in (PR Add possibility to choose issue source #159, d677949, a3e7132, ea26181). Therefore two test cases, one that reads in only JIRA issues and one that reads in only GitHub issues, are added to the issue read test (65b1acd, 2d897cb)

Changed/Improved
- `read.issues` in `util-read.R` now supports the new issue data format (PR Adjust network library to the new issue data format #147, 77c750c, e04ce30, 67b818a, 4020487, 3513647). Therefore, the test issue data and all related tests are updated (39971ee, 0ec6c6c, 6a9f4ad, fda000f, 3513647)
- Rename `ProjectConf` parameter `artifact.filter.base` to `commits.filter.base.artifact` (PR Change commit filtering and network building regarding the untracked files and base artifact #149, 466d8eb)
- `BASE.ARTIFACTS` is extended by adding untracked files (i.e. the new meta-file `UNTRACKED.FILE`), which is now considered to be a new base artifact in the case of file-level analyses. This implies that, in case of file-level analyses, the base artifact and the untracked files fall together, while in feature-level and function-level analyses they are treated differently (d11d0fb)
- The filtering by artifact kind (`"Feature"` or `"FeatureExpression"`) is now being done in the method `ProjectData$get.commits` instead of the method `ProjectData$get.commits.filtered` (894c9a5)
- `get.commits.filtered.empty` and corresponding `filter.commits.empty` method, the functionality is now included into the methods `get.commits.filtered` and `filter.commits` respectively (11428d9)
- `ProjectData$filter.commits` now takes parameters which configure whether untracked files and/or the base artifact are to be filtered (11428d9)
- `get.commits.raw`, `set.commits.raw` and `read.commits.raw` functions (64a9486, c26e582)
- In `Conf` (and its sub-classes `NetworkConf` and `ProjectConf`), default parameters are not validated anymore to avoid confusion by logging output (ec8c6dd)
- In `Conf` (and its sub-classes `NetworkConf` and `ProjectConf`), `stop` is called on errors during parameter updates now (ec8c6dd)
- Change shape of `Vertices` in the legend of plots to avoid confusion (f4fb480)
- `ProjectData$get.cached.data.sources` to be more concise (a4e7a21)
- `roxygen2` conventions (Improvement of the documentation conventions #157, fbc2d54, 783ee58, 6e33d0a)
- `get.author.class.by.type`, `get.author.class.overview`, `get.author.class.network.degree`, `get.author.class.network.eigen`, `get.author.class.network.hierarchy`, `get.author.class.commit.count` and `get.author.class.loc.count`. Most importantly, the parameter `range.data` was renamed to `proj.data` for these functions. (587ef99, 81568b1, Update core-peripheral module #70)
- `get.commit.count.threshold` and `get.loc.count.threshold`. (2534d73, Update core-peripheral module #70)
- `verify.argument.for.parameter` was adjusted to be suitable in more general use-cases (557bdcd)
- `metrics.node.degrees` is renamed to `metrics.vertex.degrees`. (d35ce61)
- `get.author.class.activity` and `get.author.class.activity.overview` from the file `util-core-peripheral.R` (61b344a)
- `get.commit.data` from `util-data.R` and replace all calls to this function with statements of equivalent functionality despite the fact that they are now retrieving the commit data via `get.commits.filtered` instead of `get.commits` which was internally used in the function `get.commit.data` (Update core-peripheral module #70, 4fc6b45, 7fc454e, c4cf8d2)
- `active.ranges` should be computed per activity type or over all activity types (Further vertex attributes #92, aba8af9, 1bb81e8, 8f35a6b)
- `first.activity`, the default value is now used analogous to active-ranges computation: The given value is used as default per author and type. (Further vertex attributes #92, 18a065c, edf864a)

Fixed
- `construct.network.from.edge.list` (01f31d6)
- Fix error when resetting a `ProjectData` environment (c64cab8)
- `tzone` on `POSIXct` items (5f6cc69)
- `get.author.class` when using `result.limit` and when classifying zero or passing invalid input (9437b4f, Function 'get.author.class' does crash when zero authors are to be classified. #164, d93b906, 8060caa, 70e4de5)