Skip to content

Latest commit

 

History

History
278 lines (230 loc) · 9.2 KB

README.org

File metadata and controls

278 lines (230 loc) · 9.2 KB

Your Code as a Crime Scene Notes

Code Maat repo

cd ~/projects/code-maat/

Build with

lein uberjar

Run with

java -jar code-maat-1.1-SNAPSHOT-standalone.jar -l logfile.log --version-control git

Chapter 3

.mailmap files

Used for handling Authors with name changes in git repos:

https://stacktoheap.com/blog/2013/01/06/using-mailmap-to-fix-authors-list-in-git/

Working git commands

git checkout $(git rev-list --max-count=1 --before="2013-12-01" master)


git log --numstat


git log --pretty=format:'[%h] %an %ad %s ' --date=short --numstat --before=2013-12-01 > maat_evo.log

Revisions analysis

number-of-entities
the number of “modules in the system”
number-of-entities-changed
the number of times “different modules in the system have been changed”
./script/maat -l maat_evo.log -c git -a revisions entity,n-revs

Complexity analysis

Measuring lines of code is a poor measure of complexity, Oram & Wilson, 2010 Making Software: What Really Works and Why We Believe It

cloc ./ --by-file --csv --quiet

Saving CSVs

./script/maat -l maat_evo.log -c git -a revisions > maat_freqs.csv

cd ../code-maat/
cloc ./ --by-file --csv --quiet --report-file=~/projects/your-code-as-a-crime-scene/maat_lines.csv
cd -

Merging the data complexity and change frequency data

python ~/projects/maat-scripts/merge/merge_comp_freqs.py maat_freqs.csv maat_lines.csv

Chapter 4

Using https://github.com/hibernate/hibernate-orm.

cd ~/projects/hibernate-orm/
git checkout $(git rev-list -n 1 --before="2013-09-05" master)

git log --pretty=format:'[%h] %an %ad %s' --date=short \
    --numstat --before=2013-09-05 --after=2012-01-01 > hib_evo.log

mv hib_evo.log ~/projects/your-code-as-a-crime-scene/hib_evo.log
cd -

./script/maat -l hib_evo.log -c git -a summary

Choosing a time frame

On my projects I choose the following timeframes

Between releases
Compare hotspots between releases to evaluate your long-term improvements.
Over iterations
If you work iteratively, measure between each iteration. This lets you spot code that starts to grow into hotspots early.
Around significant events
Define the temporal period around significant events, such as reorganizations of code or personnel. When you make large redesigns or change the way you work, it will reflect in the code.
Start with a long period
1 to 2 years will allow you to spot long-term trends.

Hibernate complexity

cd ~/projects/hibernate-orm/
cloc ./ --by-file --csv --quiet --report-file=~/projects/your-code-as-a-crime-scene/hib_lines.csv
cd -

Hibernate change frequencies

./script/maat -l hib_evo.log -c git -a revisions > hib_freqs.csv

Merging the hibernate data

python ~/projects/maat-scripts/merge/merge_comp_freqs.py hib_freqs.csv hib_lines.csv

Visualization samples directory

~/projects/code-maat-samples

Serving the goods

python -m SimpleHTTPServer 8888
open http://localhost:8888/hibernate/hibzoomable.html

Enclosure diagrams

Uses circle packing to represent the system. Each circle is a module, from hibernate core all the way down to individual namespaces. Circle radius denotes the lines of code contained in the modules dependent on the module the circle represents. Color intensity of the module denotes the number of revisions.

It looks like directories are blue and .java files are red.

Hotspots and individual coding styles

A project’s success depends on the coding skills of the people involved. As humans we vary a lot in our capabilities. A cognitively demanding task like programming amplifies those variations. That’s one reason why there’s a large difference in quality between the code of different programmers, even on the same team. Given these differences, it is hardly surprising that the expertise and coding techniques of individual developers make up one reason for clusters of hotspots.

Specialization

If one developer is designing all the modules in one area then there will be hotspots in that area over a given time. “This may be one reason why hotspots attract each other.”

Strategic decisions

Running up deliberate technical debt to meet a deadline.

Lack of knowledge

There may be a lack of knowledge in the team that is working on this part of the codebase.

There may be an opportunity to get training for the team or shuffle roles to add coverage to a blind spot

Does measuring code change improve fault prediction?

by Robert Bell, Thomas Ostrand, and Elaine Weyuker

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.281.7784&rep=rep1&type=pdf

Different features stabilize at different rates

“Change patterns suggest new modular boundaries”

Consider a namespace with three functions. Two of those functions depend on several other namespaces. One function is self-contained. The namespace undergoes changes whenever the two dependent functions change and the self-contained function comes along for the ride. A naive developer could accidentally change the self-contained function because they are looking at the namespace, when in fact they should only be concerned with changing the dependent functions.

Or the example from the book:

[Code Maat’s] app.clj module has changed several times, each time for different reasons. It has three responsibilities, and therefore three reasons to change. Refactoring this module into three independent ones would isolate the responsibilities and stabilize the code.

It may or may not stabilize the code, but at least it would allow the three refactored namespaces to evolve at rates consistent with the dependent namespaces for which they are responsible.

Visualizing hibernate data

python ~/projects/maat-scripts/transform/csv_as_enclosure_json.py --structure hib_lines.csv --weights hib_freqs.csv --weightcolumn 1 > hib_hotspot.json

Evolve your modules to stability as soon as possible

See http://michaelfeathers.typepad.com/michael_feathers_blog/2012/12/the-active-set-of-classes.html

  • Graph time when a class is related to its last modification
  • Calculate class closure as the time at which a class is no longer modified

Chapter 6

Rough complexity metric based on whitespace

python ~/projects/maat-scripts/miner/complexity_analysis.py ~/projects/hibernate-orm/hibernate-core/src/main/java/org/hibernate/cfg/Configuration.java

n,total,mean,sd,max
3389,8180,2.41,1.63,14
n
number of lines in the file
total
total accumulated indentations detected
mean
average indentations per line
sd
standard deviation of indentations per line
max
maximum indentations per line

Whew, 14!

Compare to running the code on the main dev branch:

n,total,mean,sd,max
769,933,1.21,0.76,6

All numbers reduced.

Large amounts of indenting suggest the presence of application logic

Law of increasing complexity

As an evolving program is continually changed, its complexity, reflecting deterioritating structure, increases unless work is done to maintain or reduce it.

Lehman, 1980

Trends

cd ~/projects/hibernate-orm/

python ~/projects/maat-scripts/miner/git_complexity_trend.py --start ccc087b --end 46c962e --file hibernate-core/src/main/java/org/hibernate/cfg/Configuration.java
rev,n,total,mean,sd
e75b8a77b1,3080,7610,2.47,1.76
23a62802c8,3092,7649,2.47,1.76
89911003e3,3100,7658,2.47,1.76
8373871c30,3101,7658,2.47,1.76
fa1183f3f9,3101,7658,2.47,1.76
5068b8e808,3101,7658,2.47,1.76
5671de517d,3105,7667,2.47,1.76
16d11f8422,3105,7667,2.47,1.76
ad2a9ef651,3165,7786,2.46,1.75
7f10972048,3165,7786,2.46,1.75
4ad49a02c9,3165,7786,2.46,1.75
8e30a2b86d,3168,7790,2.46,1.75
4e434f6197,3170,7794,2.46,1.75
4204f2c5fe,3168,7780,2.46,1.74
41397f22d1,3169,7788,2.46,1.75
12c7ab93c3,3159,7776,2.46,1.74
7b9b9b39c0,3163,7786,2.46,1.74
e7b188c924,3174,7819,2.46,1.74
7976e2396a,3174,7819,2.46,1.74
9ab924041d,3174,7819,2.46,1.74
4d68ddf7b0,3175,7819,2.46,1.74
2725a7d49e,3177,7825,2.46,1.74
0e2fd9f970,3176,7822,2.46,1.74
3335710a38,3215,7951,2.47,1.74
8515ce197a,3215,7951,2.47,1.74
394458f6a6,3220,7961,2.47,1.74
42f3422720,3219,7958,2.47,1.74
fbdca39506,3217,7951,2.47,1.74
b5457f37e2,3231,8025,2.48,1.74
d184cb3eb4,3247,8072,2.49,1.75
cf921df1d0,3249,8082,2.49,1.75
377c300071,3243,7873,2.43,1.65
04fe84994d,3280,7951,2.42,1.64
9030fa015e,3276,7919,2.42,1.64
8c95a6077a,3290,7950,2.42,1.64
14993a4637,3308,7991,2.42,1.63
a03d44f290,3309,7993,2.42,1.63
5ea40ce3f5,3319,8022,2.42,1.63
1825a4762c,3317,8022,2.42,1.63
580a71331c,3335,8072,2.42,1.63

See chart here https://docs.google.com/spreadsheets/d/1iLjZpN-LlbkIWcFB3T749yKjtj9bCY4wjdk2HGHT754/edit#gid=0