This repository has been archived by the owner on May 22, 2019. It is now read-only.

Split sourced-ml package to algorithms and data collection parts #396

Open
zurk opened this issue Mar 27, 2019 · 1 comment

zurk commented Mar 27, 2019

Dependent projects such as https://github.com/src-d/style-analyzer need only the algorithms part of sourced-ml: https://github.com/src-d/ml/tree/master/sourced/ml/algorithms

The data collection part uses the deprecated jgit-spark-connector, which depends on old packages. This leads to unpleasant dependency conflicts: https://github.com/src-d/style-analyzer/pull/719/files#diff-354f30a63fb0907d4ad57269548329e3R30

That is why we should split the package into two parts.

Guillemdb commented Apr 8, 2019

I am currently trying to make sense of the src-d/ml package, and I am building a map of how the different files depend on each other. I will be updating it during this week.

I hope that once it is finished, it will help with splitting the package in two.

src-d/ml files

The different colors mean the following:

  • Blue: Name of the module
  • Green: Files that do not depend on bblfsh
  • Yellow: Files that depend on bblfsh, but only to use features of a UAST that "should be stable" (role type, parents of a node, children, etc.)
  • Orange: Files that depend on bblfsh and may need changes when using graphs instead of trees
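
A first pass at the green versus yellow/orange distinction could probably be automated. The sketch below is only an illustration and not the way the map above was built: it uses Python's standard `ast` module to check whether a file directly imports bblfsh. The helper names (`imported_top_level_modules`, `depends_on`, `classify_package`) are hypothetical, and only direct absolute imports are detected. Transitive dependencies, and the split between yellow and orange, would still need manual review.

```python
import ast
from pathlib import Path


def imported_top_level_modules(source):
    """Return the top-level names of all modules imported in the given source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Relative imports (level > 0) point inside the package itself,
            # so they are skipped here.
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return names


def depends_on(source, package="bblfsh"):
    """True if the source directly imports the given top-level package."""
    return package in imported_top_level_modules(source)


def classify_package(root, package="bblfsh"):
    """Map each .py file under root (relative path) to whether it imports package."""
    root_path = Path(root)
    result = {}
    for path in sorted(root_path.rglob("*.py")):
        source = path.read_text(encoding="utf-8")
        result[path.relative_to(root_path).as_posix()] = depends_on(source, package)
    return result
```

Calling `classify_package("sourced/ml")` from a checkout of the repository would give a dictionary of file paths to booleans, where `False` roughly corresponds to the green files.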
