Wrong network construction for networks with no edges and more than one node #150

bockthom · 2018-12-16T19:14:00Z

Description

When we construct a network that has no edges but more than one node, all nodes are merged into a single node.

Thanks to @SCPhantom for pointing this out!

Steps to Reproduce or Minimal Working Example (MWE)

The MWE of @SCPhantom:

Replace the content of the file ./sample/results/testing/sample_feature/feature/commits.list by the following lines:

1;"2016-07-12 15:58:59";"D1";"[email protected]";"2016-07-12 15:58:59";"D1";"[email protected]";"6409a8f399b81919729d14554b923d494b6b4229";1;1;1;2;"a1.c";"A1";"Feature";1

2;"2016-07-12 15:59:10";"D2";"[email protected]";"2016-07-12 15:59:10";"D2";"[email protected]";"722ca215917e3f22348b993acfc4fae578b4a4c9";1;1;1;2;"a1.c";"A2";"Feature";1

Run the following lines in R:

source("codeface-extraction-r/util-init.R", chdir = TRUE)

DATA.PATH = "codeface-extraction-r/sample"
CASESTUDY = "sample"
ANALYSIS.RANGE.TYPE <- "testing"

ARTIFACT = "feature"
AUTHOR.RELATION = "cochange"
ARTIFACT.RELATION <- "cochange"

proj.conf = ProjectConf$new(DATA.PATH, ANALYSIS.RANGE.TYPE, CASESTUDY, ARTIFACT)
proj.data = ProjectData$new(proj.conf)


net.conf = NetworkConf$new()

net.conf$update.values(list(author.relation = AUTHOR.RELATION,
                            artifact.relation = ARTIFACT.RELATION,
                            base.artifact.edges = FALSE,
                            simplify = FALSE))

net.builder = NetworkBuilder$new(proj.data, net.conf)

network = net.builder$get.author.network()

pdf(file = "network_sample_feature_filter_base.pdf", width = 8, height = 5)
plot.network(network)
dev.off()

The resulting network should contain two nodes, namely "D1" and "D2". However, it currently contains only one node.

Versions

This affects several versions. I tested it with v3.4.

Problem & Solution

After debugging the MWE above, I identified a bug in line 1041 of util-networks.R:

https://github.com/se-passau/codeface-extraction-r/blob/83513c701a1cd013aff5ebda152e2b38789d04e3/util-networks.R#L1039-L1042

Here, nodes.processed is a data.frame containing the developers "D1" and "D2" as rows (one column). igraph::vertices(nodes.processed) therefore treats this single column as one node (I could not find an explanation for this behavior in the igraph documentation).

Solution: Transpose the data.frame (using the t function). The end of line 1041 should look like the following:

igraph::vertices(t(nodes.processed))

This solves the problem! We now have two columns and therefore two nodes "D1" and "D2"!
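One plausible explanation, which the igraph documentation does not confirm: R reports the length of a data.frame as its number of columns, whereas the transposed object is a matrix whose length is its number of cells. The base-R sketch below uses a hypothetical stand-in for nodes.processed (the column name "name" is an assumption) to show the difference:

```r
# Hypothetical stand-in for nodes.processed: one column, one row per developer.
nodes.processed = data.frame(name = c("D1", "D2"), stringsAsFactors = FALSE)

# A data.frame is a list of columns, so its length is the number of columns.
length.before = length(nodes.processed)

# t() converts the data.frame into a 1 x 2 character matrix;
# the length of a matrix is its number of cells.
nodes.transposed = t(nodes.processed)
length.after = length(nodes.transposed)

print(c(before = length.before, after = length.after))
```

If igraph derives the number of vertices from the length of the object it receives, this would account for one node before the transposition and two nodes after it.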

I will provide a patch for that very soon!


clhunsen · 2018-12-17T09:50:28Z

Thank you both for working on this! This is a regression introduced during the addition of the multi-relation networks (somewhere between v3.1 and v3.2).

Instead of simply transposing the data.frame, we should use igraph::graph.data.frame together with create.empty.edge.list. I am currently working on an additional patch to introduce this change and a corresponding test to prevent the issue in the future.
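A minimal sketch of that approach, assuming the igraph package is installed. The helper empty.edge.list is only a hypothetical stand-in for create.empty.edge.list (whose actual implementation is not shown in this issue), and graph_from_data_frame is the current igraph name for graph.data.frame:

```r
library(igraph)

# Hypothetical stand-in for 'create.empty.edge.list': a zero-row edge list
# whose first two columns hold the edge endpoints.
empty.edge.list = function() {
    data.frame(from = character(0), to = character(0), stringsAsFactors = FALSE)
}

# Vertex data: igraph uses the first column as the vertex name.
nodes = data.frame(name = c("D1", "D2"), stringsAsFactors = FALSE)

# Passing the vertices explicitly keeps isolated nodes in the network.
network = igraph::graph_from_data_frame(empty.edge.list(), directed = FALSE, vertices = nodes)

print(igraph::vcount(network))
print(igraph::ecount(network))
print(igraph::V(network)$name)
```

Because the vertices are passed as a separate data.frame, the node set does not depend on the edge list, so the edgeless case needs no special handling.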

clhunsen · 2018-12-17T15:33:19Z

Here is the gist with my proposed patch: https://gist.github.com/clhunsen/f6332ceac29630476ecbe816839846d8.

clhunsen · 2019-01-15T08:54:29Z

With the merging of PR #149, this can be closed.

When a network contains no edges but more than one node, all the nodes get combined. To fix this, the respective data frame, which contains the nodes, has to be transposed. This fixes se-sic#150. Reported-by: Jakob Kronawitter <[email protected]> Signed-off-by: Thomas Bock <[email protected]>

When constructing a network in 'construct.network.from.edge.list', several corner cases need to be handled. When there are no edges available, an empty edge list can be created using 'create.empty.edge.list'. This way, reliably, the function 'igraph::graph.data.frame' can be used to construct a network. This further improves the patch 0d7c222, which tackles se-sic#150. Tests for creating edgeless networks are added to the file 'tests/test-networks.R'. This likely prevents regressions in the future. Additionally, use the function 'create.empty.edge.list' in one further place where possible. Signed-off-by: Claus Hunsen <[email protected]>

bockthom added the bug label Dec 16, 2018

bockthom added this to the v3.5 milestone Dec 16, 2018

bockthom added fixed/in-progress fixed/has-pr and removed fixed/in-progress labels Dec 16, 2018

clhunsen added fixed/in-progress and removed fixed/has-pr labels Dec 17, 2018

clhunsen added fixed/has-pr and removed fixed/in-progress labels Dec 19, 2018

clhunsen mentioned this issue Jan 6, 2019

Remove empty artifact as vertex (PR #149) #153

Closed

clhunsen added fixed/future-release and removed fixed/has-pr labels Jan 15, 2019

clhunsen closed this as completed Jan 15, 2019

clhunsen mentioned this issue Feb 27, 2019

Fix, improve, and add many, many things #158

Merged

6 tasks

clhunsen mentioned this issue Jun 7, 2019

Version 3.5 #168

Merged
