Merge pull request #190 from se-sic/dev
Version 3.7

Merged-by: Thomas Bock <[email protected]>
bockthom authored Dec 2, 2020
2 parents 91fc448 + af4eaa6 commit a7f4123
Showing 24 changed files with 2,161 additions and 231 deletions.
2 changes: 2 additions & 0 deletions .travis.yml
@@ -12,6 +12,7 @@
## 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
##
## Copyright 2017-2018,2020 by Claus Hunsen <[email protected]>
## Copyright 2020 by Thomas Bock <[email protected]>
## All Rights Reserved.

# TravisCI container
@@ -26,6 +27,7 @@ r:
- 3.4
- 3.5
- 3.6
- 4.0
cache: packages
repos:
CRAN: https://cloud.r-project.org
10 changes: 5 additions & 5 deletions CONTRIBUTING.md
@@ -1,6 +1,6 @@
# Contributing to the network library `coronet`

The following is a set of guidelines for contributing to the network library `coronet`, which is hosted in the [se-passau](https://github.com/se-passau) organization on GitHub.
The following is a set of guidelines for contributing to the network library `coronet`, which is hosted in the [se-sic](https://github.com/se-sic) organization on GitHub.
These are mostly guidelines, not rules. Use your best judgment, and feel free to propose changes to this document in a pull request.

#### Table Of Contents
@@ -39,7 +39,7 @@ Before creating bug reports, please check [this list](#before-submitting-a-bug-r
#### Before Submitting A Bug Report

* **Check the code.**
You might be able to find the cause of the problem and fix things yourself. Most importantly, check if you can reproduce the problem in the latest version of the library (see [branch `dev`](https://github.com/se-passau/coronet/tree/dev)).
You might be able to find the cause of the problem and fix things yourself. Most importantly, check if you can reproduce the problem in the latest version of the library (see [branch `dev`](https://github.com/se-sic/coronet/tree/dev)).
* **Search for previous issues describing the same problem.**
If an old issue also includes a fix or a workaround for your problem, you do not need to file a new issue. However, if the problem persists after applying potential fixes, please file a new issue including detailed information to reproduce the problem. If there is an old issue that is still open, add a comment to the existing issue instead of opening a new one.
* **Run the test suite.**
@@ -112,8 +112,8 @@ In our development process, we pursue the following idea:
- The current development will be performed on the branch `dev`, i.e., all incoming pull requests are against this branch.

The current build status is as follows:
- `master`: [![Build Status](https://travis-ci.com/se-passau/coronet.svg?token=8VFPdy2kjPXtstT72yww&branch=master)](https://travis-ci.com/se-passau/coronet)
- `dev`: [![Build Status](https://travis-ci.com/se-passau/coronet.svg?token=8VFPdy2kjPXtstT72yww&branch=dev)](https://travis-ci.com/se-passau/coronet)
- `master`: [![Build Status](https://travis-ci.com/se-sic/coronet.svg?token=8VFPdy2kjPXtstT72yww&branch=master)](https://travis-ci.com/se-sic/coronet)
- `dev`: [![Build Status](https://travis-ci.com/se-sic/coronet.svg?token=8VFPdy2kjPXtstT72yww&branch=dev)](https://travis-ci.com/se-sic/coronet)


### Pull Requests
@@ -129,7 +129,7 @@ The current build status is as follows:
* Code must be reviewed by one other project member and, if needed, be properly adapted/fixed.
* We add the `Reviewed-by` tag only for the merge commit.

There will be another checklist for you when you open an actual pull request provided by [the corresponding template](.github/PULL_REQUEST_TEMPLATE/pull-request.md).
There will be another checklist for you when you open an actual pull request provided by [the corresponding template](.github/PULL_REQUEST_TEMPLATE.md).

## Style Conventions

27 changes: 24 additions & 3 deletions NEWS.md
@@ -1,5 +1,29 @@
# coronet – Changelog

## 3.7

### Added
- Add a new file `util-tensor.R` containing the class `FourthOrderTensor` to create (author x relation x author x relation) tensors from a list of networks (with each network having a different relation) and its corresponding utility function `get.author.networks.for.multiple.relations` (PR #173, c136b1f6127d73c25f08ae2f317246747aa9ea2b, e4ee0dc926b22ff75d5fd801c1f131bcff4c22eb, 051a5f0287022f97e2367ed0e9591b9df9dbdb3d)
- Add function `calculate.EDCPTD.centrality` for calculating the EDCPTD centrality for a fourth-order tensor in the above described form (c136b1f6127d73c25f08ae2f317246747aa9ea2b, e4ee0dc926b22ff75d5fd801c1f131bcff4c22eb, 051a5f0287022f97e2367ed0e9591b9df9dbdb3d)
- Add new file `util-networks-misc.R` which contains miscellaneous functions for processing network data and creating and converting various kinds of adjacency matrices: `get.author.names.from.networks`, `get.author.names.from.data`, `get.expanded.adjacency`, `get.expanded.adjacency.matrices`, `get.expanded.adjacency.matrices.cumulated`, `convert.adjacency.matrix.list.to.array` (051a5f0287022f97e2367ed0e9591b9df9dbdb3d)
- Add tests for sliding-window functionality and make parameterized tests possible (a3ad0a81015c7f23bce958d5c1922e3b82b28bda, 2ed84ac55d434f62341297b1aa9676c12e383491, PR #184)
- Add function `cleanup.pasta.data` to remove wrong commit hashes and message ids from the PaStA data (1797e0324c39ad7b88dc22a14391340f4d26aea8, PR #189)

### Changed/Improved
- Adjust the function `get.authors.by.data.source`: Rename its single parameter to `data.sources` and change the function so that it can extract the authors for multiple data sources at once. The default value of the parameter is a vector containing all the available data sources (commits, mails, issues) (051a5f0287022f97e2367ed0e9591b9df9dbdb3d)
- Adjust recommended R version to 3.6.3 in README (92be262514277acb774ab2885c1c0d1c10f03373)
- Add R version 4.0 to test suite and adjust package installation in `install.R` to improve compatibility with Travis CI (40aa0d80e2a94434a8be75925dbefbde6d3518b2, 1ba036758a63767e2fcef525c98f5a4fd6938c39, #161)

### Fixed
- Fix sliding-window creation in various splitting functions (`split.network.time.based`, `split.networks.time.based`, `split.data.time.based`, `split.data.activity.based`, `split.network.activity.based`) and also fix the computation of overlapping ranges in the function `construct.overlapping.ranges` to make sure that the last and the second-last range do not cover the same range (1abc1b8dbfc65ccad0cbbc8e33b209e39d2f8118, c34c42aef32a30b82adc53384fd6a1b09fc75dee, 097cebcc477b1b65056d512124575f5a78229c3e, 9a1b6516f490b72b821be2d5365d98cac1907b2f, 0fc179e2735bec37d26a68c6c351ab43770007d2, cad28bf221f942eb25e997aaa2de553181956680, 7602af2cf46f699b2285d53819dec614c71754c6, PR #184)
- Fix off-by-1 error in the function `get.data.cut.to.same.date` (f0744c0e14543292cccb1aa9a61f822755ee7183)
- Fix missing or wrongly set layout when plotting networks (#186, 720cc7ba7bdb635129c7669911aef8e7c6200a6b, 877931b94f87ca097c2f8f3c55e4b4bcc6087742)
- Fix reading of the PaStA data since the file format has changed (712bbafde3fb8f7b7c0fc847cb9c1838eb4cf86e, PR #189)
- Fix bug that duplicates revision set ids in the mail and commit data when merging the PaStA data and also copy-paste error when merging PaStA data to commit data (1797e0324c39ad7b88dc22a14391340f4d26aea8, PR #189)
- Fix bug that results in an error when there is a variable called 'c' in the R environment (de42eb24be131c261ccad7d807007f27d5559d68, PR #189)
- Fix bug that when applying `filter.patchstack.mails()` to an environment with no mail data, the mail data gets set to `NULL` (82614754fb3d75b0e5856d1ef42ada737859ee37, PR #189)
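
The sliding-window fix listed above concerns trailing windows that repeat the coverage of their predecessor. The following is a minimal sketch of that idea on plain numeric bounds; it is not coronet's `construct.overlapping.ranges`, and the function name `construct.sliding.windows` and its parameters are assumptions made for illustration only.

```r
## Illustrative sketch only: NOT coronet's `construct.overlapping.ranges`.
## Builds half-overlapping windows [start, end) of a fixed width over numeric bounds.
construct.sliding.windows = function(start, end, width) {
    ## each window starts half a window after its predecessor
    window.starts = seq(from = start, to = end, by = width / 2)
    window.starts = window.starts[window.starts < end]

    ## clip windows that would reach beyond the overall end
    window.ends = pmin(window.starts + width, end)

    ## a clipped trailing window that ends where its predecessor already ends
    ## adds no new coverage, so keep only the first window per end point
    keep = !duplicated(window.ends)
    return(data.frame(start = window.starts[keep], end = window.ends[keep]))
}

windows = construct.sliding.windows(start = 0, end = 10, width = 4)
print(windows)
```

Without the `duplicated` filter, the call above would also produce a fifth window from 8 to 10, which lies entirely inside the fourth window from 6 to 10.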


## 3.6

### Added
@@ -8,19 +32,16 @@
- Add a new file `util-plot-evaluation.R` containing functions to plot commit edit types per author and project. (PR #171, d4af515f859ce16ffaa0963d6d3d4086bcbb7377, aa542a215f59bc3ed869cfefbc5a25fa050b1fc9, 0a0a5903e7c609dfe805a3471749eb2241efafe2)

### Changed/Improved

- Add R version 3.6 to test suite (8b2a52d38475a59c55feb17bb54ed12b9252a937, #161)
- Update `.travis.yml` to improve compatibility with Travis CI (41ce589b3b50fd581a10e6af33ac6b1bbea63bb8)

### Fixed

- Ensure sorting of commit-count and LOC-count data.frames to fix tests with R 3.3 (33d63fd50c4b29d45a9ca586c383650f7d29efd5)


## 3.5

### Announcement

- Rename project to `coronet` (#10, 929f8cec7b52adef1389ce1691b783c235eb815d, ac1ce80b9f5da812f90b5fed63f26dc8c812a4d6)
* Be sure to update Git remotes and submodules to the new URL!

17 changes: 13 additions & 4 deletions README.md
@@ -13,7 +13,7 @@ If you wonder: The name `coronet` derives as an acronym from the words "configur

- [Integration](#integration)
* [Requirements](#requirements)
* [R](#r-331)
* [R](#r)
* [packrat (recommended)](#packrat)
* [Folder structure of the input data](#folder-structure-of-the-input-data)
* [Needed R packages](#needed-r-packages)
@@ -53,9 +53,11 @@ If you wonder: The name `coronet` derives as an acronym from the words "configur

While using the package, we require the following infrastructure.

#### [`R`](https://www.r-project.org/) `3.3.1`
#### [`R`](https://www.r-project.org/)

Later `R` versions should work (and are tested using our TravisCI script), but, for reliability reasons and `packrat` compatibility, only version `3.3.1` is supported.
The minimum requirement is `R` version `3.3.1`; later `R` versions also work.

We currently recommend version `3.6.3` for reliability reasons and `packrat` compatibility, but later versions (`>=4`) should also work (and are tested using our TravisCI script).
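
The version requirements stated here can be checked from within `R` before loading the library. The following is a minimal sketch that is not part of `coronet`; the function name `check.r.version` and its return labels are assumptions made for illustration.

```r
## Minimal sketch (not part of coronet): compare the running R version
## against the minimum and recommended versions stated in the README.
minimum.version = "3.3.1"
recommended.version = "3.6.3"

check.r.version = function(current = getRversion()) {
    current = as.package_version(current)
    if (current < minimum.version) {
        return("unsupported")
    } else if (current < recommended.version) {
        return("supported")
    } else {
        return("recommended-or-later")
    }
}

print(check.r.version())
```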

#### [`packrat`](http://rstudio.github.io/packrat/) (recommended)

@@ -122,12 +124,15 @@ Alternatively, you can run `Rscript install.R` to install the packages.
- `logging`: Logging
- `sqldf`: For advanced aggregation of `data.frame` objects
- `testthat`: For the test suite
- `patrick`: For the test suite
- `ggplot2`: For plotting of data
- `ggraph`: For plotting of networks (needs `udunits2` system library, e.g., `libudunits2-dev` on Ubuntu!)
- `markovchain`: For core/peripheral transition probabilities
- `lubridate`: For convenient date conversion and parsing
- `viridis`: For plotting of networks with nice colors
- `jsonlite`: For parsing the issue data
- `rTensor`: For calculating EDCPTD centrality
- `Matrix`: For sparse matrix representation of large adjacency matrices

### Submodule

@@ -409,6 +414,10 @@ Additionally, for more examples, the file `showcase.R` is worth a look.
* Functionality to add vertex attributes to existing networks
- `util-networks-metrics.R`
* A set of network-metric functions
- `util-networks-misc.R`
* Helper functions for network creation (e.g., create adjacency matrices)
- `util-tensor.R`
* Functionality to build fourth-order tensors
- `util-core-peripheral.R`
* Author classification (core and peripheral) and related functions
- `util-motifs.R`
@@ -630,4 +639,4 @@ This project is licensed under [GNU General Public License v2.0](LICENSE).

## Work in progress

To see what will be the next things to be implemented, please have a look at the [list of issues](https://github.com/se-passau/coronet/issues).
To see what will be the next things to be implemented, please have a look at the [list of issues](https://github.com/se-sic/coronet/issues).
10 changes: 8 additions & 2 deletions install.R
@@ -16,6 +16,7 @@
## Copyright 2015 by Wolfgang Mauerer <[email protected]>
## Copyright 2015-2017 by Claus Hunsen <[email protected]>
## Copyright 2017 by Thomas Bock <[email protected]>
## Copyright 2020 by Thomas Bock <[email protected]>
## Copyright 2019 by Anselm Fehnker <[email protected]>
## All Rights Reserved.
##
@@ -32,12 +33,15 @@ packages = c(
"logging",
"sqldf",
"testthat",
"patrick",
"ggplot2",
"ggraph",
"markovchain",
"lubridate",
"viridis",
"jsonlite"
"jsonlite",
"rTensor",
"Matrix"
)


@@ -53,5 +57,7 @@ filter.installed.packages = function(packageList) {
p = filter.installed.packages(packages)
if (length(p) > 0) {
print(sprintf("Installing package '%s'.", p))
install.packages(p, dependencies = TRUE, verbose = FALSE, quiet = FALSE)

## set dependencies to 'NA' to install only necessary dependencies (i.e., "Depends", "Imports", "LinkingTo")
install.packages(p, dependencies = NA, verbose = TRUE, quiet = TRUE)
}
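
The installation step in this diff installs only packages that are not yet present and, with `dependencies = NA`, pulls in only the "Depends", "Imports", and "LinkingTo" dependencies. The following is a minimal sketch of that pattern; the body of `filter.installed.packages` is not shown in the diff, so the helper `filter.missing.packages` below is a hypothetical stand-in and not the project's implementation.

```r
## Hypothetical helper for illustration; the body of `filter.installed.packages`
## is not shown in the diff, so this is an assumption about its behaviour.
filter.missing.packages = function(package.list) {
    installed = rownames(utils::installed.packages())
    return(package.list[!(package.list %in% installed)])
}

## "stats" and "utils" ship with R, so nothing is installed in this example
to.install = filter.missing.packages(c("stats", "utils"))
if (length(to.install) > 0) {
    ## `dependencies = NA` installs only "Depends", "Imports", and "LinkingTo"
    utils::install.packages(to.install, dependencies = NA)
}
```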
21 changes: 17 additions & 4 deletions showcase.R
@@ -11,13 +11,15 @@
## with this program; if not, write to the Free Software Foundation, Inc.,
## 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
##
## Copyright 2016-2018 by Claus Hunsen <[email protected]>
## Copyright 2016-2018, 2020 by Claus Hunsen <[email protected]>
## Copyright 2017 by Raphael Nömmer <[email protected]>
## Copyright 2017 by Christian Hechtl <[email protected]>
## Copyright 2017 by Felix Prasse <[email protected]>
## Copyright 2017-2018 by Thomas Bock <[email protected]>
## Copyright 2020 by Thomas Bock <[email protected]>
## Copyright 2018 by Jakob Kronawitter <[email protected]>
## Copyright 2019 by Klara Schlueter <[email protected]>
## Copyright 2020 by Anselm Fehnker <[email protected]>
## All Rights Reserved.


@@ -122,6 +124,17 @@ x = NetworkBuilder$new(project.data = x.data, network.conf = net.conf)
# net = x$get.author.network()
# save(net, file = sprintf("busybox_%s.network", x$get.network.conf.variable(var.name = "author.relation")))

## / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
## Calculate EDCPTD centrality ---------------------------------------------

## get author networks for each relation
author.networks = get.author.networks.for.multiple.relations(x, c("cochange", "mail", "issue"))

## create fourth-order tensor
fourth.order.tensor = FourthOrderTensor$new(author.networks)

## calculate EDCPTD scores
edcptd.scores = calculate.EDCPTD.centrality(fourth.order.tensor)
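
To illustrate the (author x relation x author x relation) layout mentioned in the changelog, the following sketch builds such a tensor from plain adjacency matrices using base `R` arrays. It is not the implementation of `FourthOrderTensor`; the toy data and the choice to fill only entries where both relation indices agree are assumptions made for illustration.

```r
## Illustrative sketch only: NOT coronet's `FourthOrderTensor` implementation.
## Toy data: three authors and one adjacency matrix per relation (assumed values).
authors = c("A", "B", "C")
relations = c("cochange", "mail")
adjacency = list(
    cochange = matrix(c(0, 1, 0,  1, 0, 1,  0, 1, 0), nrow = 3,
                      dimnames = list(authors, authors)),
    mail = matrix(c(0, 0, 1,  0, 0, 0,  1, 0, 0), nrow = 3,
                  dimnames = list(authors, authors))
)

## (author x relation x author x relation) array, initialised with zeros
tensor = array(0, dim = c(3, 2, 3, 2),
               dimnames = list(authors, relations, authors, relations))

## assumption: only entries with identical relation indices are filled
for (relation in relations) {
    tensor[, relation, , relation] = adjacency[[relation]]
}

print(dim(tensor))
```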

## / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
## Range-level data --------------------------------------------------------
@@ -319,10 +332,10 @@ y = NetworkBuilder$new(project.data = y.data, network.conf = net.conf)
# panel.border = ggplot2::element_blank(),
# legend.position = "none"
# ) +
# ggraph::facet_edges( ~ edge.type.char)
# ggraph::facet_edges( ~ edge.type)
# # ggraph::facet_edges( ~ weight)
# # ggraph::facet_nodes( ~ vertex.type.char)
# # ggraph::facet_graph(edge.type.char ~ vertex.type.char)
# # ggraph::facet_nodes( ~ vertex.type)
# # ggraph::facet_graph(edge.type ~ vertex.type)
# print(p)

# ## generate network plot from README file and save it to disk
4 changes: 3 additions & 1 deletion tests.R
@@ -12,6 +12,7 @@
## 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
##
## Copyright 2017, 2019 by Claus Hunsen <[email protected]>
## Copyright 2020 by Thomas Bock <[email protected]>
## All Rights Reserved.

## / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
@@ -42,8 +43,9 @@ sessionInfo()

logging::loginfo("Running test suite.")

## load package 'testthat'
## load packages 'testthat' and 'patrick'
requireNamespace("testthat")
requireNamespace("patrick")

## starting tests
do.tests = function(dir) {
17 changes: 9 additions & 8 deletions tests/test-data-cut.R
@@ -16,6 +16,7 @@
## Copyright 2018 by Claus Hunsen <[email protected]>
## Copyright 2018 by Barbara Eckl <[email protected]>
## Copyright 2018 by Thomas Bock <[email protected]>
## Copyright 2020 by Thomas Bock <[email protected]>
## Copyright 2018 by Jakob Kronawitter <[email protected]>
## All Rights Reserved.

@@ -62,14 +63,14 @@ test_that("Cut commit and mail data to same date range.", {
artifact.type = c("Feature", "Feature"),
artifact.diff.size = as.integer(c(1, 1)))

mail.data.expected = data.frame(author.name = c("Thomas"),
author.email = c("[email protected]"),
message.id = c("<[email protected]>"),
date = get.date.from.string("2016-07-12 16:04:40"),
date.offset = as.integer(c(100)),
subject = c("Re: Fw: busybox 2 tab"),
thread = sprintf("<thread-%s>", c(9)),
artifact.type = "Mail")
mail.data.expected = data.frame(author.name = c("Thomas", "Olaf"),
author.email = c("[email protected]", "[email protected]"),
message.id = c("<[email protected]>", "<[email protected]>"),
date = get.date.from.string(c("2016-07-12 16:04:40", "2016-07-12 16:05:37")),
date.offset = as.integer(c(100, 200)),
subject = c("Re: Fw: busybox 2 tab", "Re: Fw: busybox 10"),
thread = sprintf("<thread-%s>", c(9, 9)),
artifact.type = c("Mail", "Mail"))

commit.data = x.data$get.data.cut.to.same.date(data.sources = data.sources)$get.commits()
rownames(commit.data) = 1:nrow(commit.data)
17 changes: 9 additions & 8 deletions tests/test-networks-cut.R
@@ -14,6 +14,7 @@
## Copyright 2017 by Christian Hechtl <[email protected]>
## Copyright 2018 by Claus Hunsen <[email protected]>
## Copyright 2018 by Thomas Bock <[email protected]>
## Copyright 2020 by Thomas Bock <[email protected]>
## Copyright 2018 by Jakob Kronawitter <[email protected]>
## All Rights Reserved.

@@ -62,14 +63,14 @@ test_that("Cut commit and mail data to same date range.", {
artifact.type = c("Feature", "Feature"),
artifact.diff.size = as.integer(c(1, 1)))

mail.data.expected = data.frame(author.name = c("Thomas"),
author.email = c("[email protected]"),
message.id = c("<[email protected]>"),
date = get.date.from.string(c("2016-07-12 16:04:40")),
date.offset = as.integer(c(100)),
subject = c("Re: Fw: busybox 2 tab"),
thread = sprintf("<thread-%s>", c(9)),
artifact.type = "Mail")
mail.data.expected = data.frame(author.name = c("Thomas", "Olaf"),
author.email = c("[email protected]", "[email protected]"),
message.id = c("<[email protected]>", "<[email protected]>"),
date = get.date.from.string(c("2016-07-12 16:04:40", "2016-07-12 16:05:37")),
date.offset = as.integer(c(100, 200)),
subject = c("Re: Fw: busybox 2 tab", "Re: Fw: busybox 10"),
thread = sprintf("<thread-%s>", c(9, 9)),
artifact.type = c("Mail", "Mail"))

commit.data = x$get.project.data()$get.commits()
rownames(commit.data) = 1:nrow(commit.data)