This code was used in a year-long study looking at CHAOSS and other sustainability metrics that might affect open communities that use version control hosting sites for collaboration. These mostly ended up being open source and/or open science communities, mostly on GitHub.
There are four main components within this code.
- One is the data visualisation component, which runs on Jekyll with lots of JavaScript-based processing, since Jekyll can generate static sites but doesn't always allow advanced data manipulation. The entry point for this section is in the `view` folder.
- The rest are data collection components.
  - General: the entire metrics-gathering suite, written fully in JavaScript. The entry point is `index.js` in this folder.
  - Individual methods: JavaScript-based. The entry point is `singleMethod.js`. Useful for gathering all of a single metric from all repos.
  - Local methods: clone git repos and run some metrics locally rather than infuriating the GitHub API.
- You'll need a recent version of Node.js. Run `npm install` to get the dependencies.
- In your `.bashrc` or `.zshrc`, set up a GitHub API access token. It should look something like this: `export github_sustain_sw_token=123456678sdfsdfsdfsdfsdfsdf`
- Run `node index.js` in your terminal.
- Full run, on ALL the repos you have in a text file, separated by newlines: `node index.js --month 12 --urlList /path/to/urllist.txt`
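As a rough sketch of preparing a full run, the snippet below writes a newline-separated list, counts the repos in it, and warns if the token is missing. The two URLs are placeholders, and the assumption that each line should be a full repository URL is mine; check `index.js` for the format it actually expects.

```bash
#!/usr/bin/env bash
# Sketch only. Assumption: one full repository URL per line.
# The URLs below are placeholders, not repos from the study.
url_list="${TMPDIR:-/tmp}/urllist.txt"
printf '%s\n' \
  "https://github.com/example-org/example-repo" \
  "https://github.com/example-org/another-repo" > "$url_list"

# Count non-empty lines: each one is a repo to analyse.
repo_count=$(grep -c . "$url_list")
echo "repos listed: $repo_count"

# Warn early if the token from the setup step is not in the environment.
if [ -z "${github_sustain_sw_token:-}" ]; then
  echo "warning: github_sustain_sw_token is not set" >&2
fi

# Only attempt the run from the folder that contains index.js.
if [ -f index.js ]; then
  node index.js --month 12 --urlList "$url_list"
fi
```

If the warning appears, open a new terminal or `source` your `.bashrc` or `.zshrc` so the exported token is picked up.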
In some cases, using the GitHub API is not the most efficient way to handle things, generally for repos with LOTS of commits. Cloning repos locally and assessing the logs works for anything that's git-specific rather than GitHub-specific.
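To illustrate the idea of reading git-specific metrics from a local log, here is a minimal sketch. It is not the project's `localMethods.sh`. It builds a throwaway repo with three made-up commits so it needs no network access; with a real clone you would only need the last two commands.

```bash
#!/usr/bin/env bash
# Sketch only: a throwaway repo stands in for a real clone.
work_dir=$(mktemp -d)
git init -q "$work_dir/demo-repo"
cd "$work_dir/demo-repo" || exit 1

# Three empty commits with fixed dates (hypothetical history).
for day in 2021-01-05 2021-01-20 2021-02-03; do
  GIT_AUTHOR_DATE="${day}T12:00:00" GIT_COMMITTER_DATE="${day}T12:00:00" \
    git -c user.name="Demo" -c user.email="demo@example.com" \
    commit -q --allow-empty -m "commit on $day"
done

# Total number of commits reachable from HEAD.
total=$(git rev-list --count HEAD)
echo "total commits: $total"

# Commits per month, read straight from the local log.
per_month=$(git log --date=format:%Y-%m --format=%ad | sort | uniq -c)
echo "$per_month"
```

Both commands read only the local object store, so they make no API calls however long the history is.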
To run the local scripts:
Setup:
- Create a directory in the parent directory of this repo, and call it `localData`.
- Copy `sample.tsv` into `localData`.
- Tweak the repo names and dates, and/or add lines for any further repos.
- You may also need to give the script permission to run using `chmod +x`.
The above setup steps, as one copy-pastable block:
```bash
#go to the folder containing this code.
cd sustainable-communities-tracker
#change the script to be executable
chmod +x src/localMethods/localMethods.sh
# make a folder to store all the output data
mkdir ../localData
#copy the sample data to the data folder.
cp templates/sample.tsv ../localData/sample.tsv
```
To run the local script, after the above setup is complete, run `npm run localMethods`.
Once you've run stats on a repo, there's a minimal UI to view the JSON and data more visually. To use it:
- Copy the files generated by the script(s) to `view/_data`.
- Set up Jekyll if you haven't already (`gem install jekyll bundler`). I use rvm to manage Ruby versions; it makes things easier.
- Once it's all set up, cd into the view directory (`cd view`) and run `bundle exec jekyll serve`.
- Presto, you'll serve the visualisations.
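Before serving, a quick pre-flight check can save some head-scratching. This is a sketch, assuming you run it from the repo root; it is not part of the repo, and the list of tools it checks is my guess at what the steps above need.

```bash
#!/usr/bin/env bash
# Sketch only: check that the Ruby tooling and the data folder are in place.
data_dir="view/_data"
missing=0

# Tools assumed necessary for `bundle exec jekyll serve`.
for tool in ruby gem bundle; do
  if ! command -v "$tool" >/dev/null 2>&1; then
    echo "missing: $tool"
    missing=$((missing + 1))
  fi
done

# Report whether any generated files have been copied over yet.
if [ -d "$data_dir" ] && [ -n "$(ls -A "$data_dir" 2>/dev/null)" ]; then
  data_status="data files found in $data_dir"
else
  data_status="no data files in $data_dir yet"
fi
echo "$data_status"
echo "tools missing: $missing"
```

If nothing is reported missing and data files are present, `cd view` and `bundle exec jekyll serve` should be ready to go.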
Note that a copy of this repo, containing sanitised data and ready for deployment, is available here: https://github.com/Sustainable-Open-Science-and-Software/survey-datavis/actions