-
Notifications
You must be signed in to change notification settings - Fork 0
/
ProfilingWithValgrind.html
281 lines (237 loc) · 15.2 KB
/
ProfilingWithValgrind.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
<!DOCTYPE html>
<html lang="en" data-content_root="./">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" /><meta name="viewport" content="width=device-width, initial-scale=1" />
<title>Profiling with Valgrind</title>
<link rel="stylesheet" type="text/css" href="_static/pygments.css?v=fa44fd50" />
<link rel="stylesheet" type="text/css" href="_static/bootstrap-sphinx.css?v=fadd4351" />
<link rel="stylesheet" type="text/css" href="_static/custom.css?v=77160d70" />
<script src="_static/documentation_options.js?v=a8da1a53"></script>
<script src="_static/doctools.js?v=9bcbadda"></script>
<script src="_static/sphinx_highlight.js?v=dc90522c"></script>
<link rel="index" title="Index" href="genindex.html" />
<link rel="search" title="Search" href="search.html" />
<link rel="next" title="Flowchart Creation" href="FlowchartCreation.html" />
<link rel="prev" title="Profiling with perf" href="ProfilingWithPerf.html" />
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-59110517-1', 'auto');
ga('send', 'pageview');
</script>
</head><body>
<div id="navbar" class="navbar navbar-default ">
<div class="container">
<div class="navbar-header">
<!-- .btn-navbar is used as the toggle for collapsed navbar content -->
<button type="button" class="navbar-toggle" data-toggle="collapse" data-target=".nav-collapse">
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a class="navbar-brand" href="http://www.mantidproject.org">
</a>
<span class="navbar-text navbar-version pull-left"><b>main</b></span>
</div>
<div class="collapse navbar-collapse nav-collapse">
<ul class="nav navbar-nav">
<li class="divider-vertical"></li>
<li><a href="index.html">Home</a></li>
<li><a href="https://download.mantidproject.org">Download</a></li>
<li><a href="https://docs.mantidproject.org">User Documentation</a></li>
<li><a href="http://www.mantidproject.org/contact">Contact Us</a></li>
</ul>
<form class="navbar-form navbar-right" action="search.html" method="get">
<div class="form-group">
<input type="text" name="q" class="form-control" placeholder="Search" />
</div>
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
</div>
</div>
<p>
<div class="related" role="navigation" aria-label="related navigation">
<h3>Navigation</h3>
<ul>
<li class="nav-item nav-item-0"><a href="index.html">Documentation</a> »</li>
<li class="nav-item nav-item-this"><a href="">Profiling with Valgrind</a></li>
</ul>
</div> </p>
</div>
<div class="container">
<div class="row">
<div class="body col-md-12 content" role="main">
<section id="profiling-with-valgrind">
<span id="profilingwithvalgrind"></span><h1>Profiling with Valgrind<a class="headerlink" href="#profiling-with-valgrind" title="Link to this heading">¶</a></h1>
<section id="summary">
<h2>Summary<a class="headerlink" href="#summary" title="Link to this heading">¶</a></h2>
<p>Describes how to use the
<a class="reference external" href="http://valgrind.org/docs/manual/cl-manual.html">callgrind</a> plugin to
the <a class="reference external" href="http://valgrind.org/">valgrind</a> tool set to profile your code.</p>
<p><em>Note: Valgrind is a Linux only tool</em></p>
</section>
<section id="installation">
<h2>Installation<a class="headerlink" href="#installation" title="Link to this heading">¶</a></h2>
<p>You will need to install both <em>valgrind</em> & the visualizer tool
<em>kcachegrind</em> - both of which should be available in your distribution’s
repositories.</p>
<section id="ubuntu">
<h3>Ubuntu<a class="headerlink" href="#ubuntu" title="Link to this heading">¶</a></h3>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="o">>></span> <span class="n">apt</span><span class="o">-</span><span class="n">get</span> <span class="n">install</span> <span class="n">valgrind</span> <span class="n">kcachegrind</span>
</pre></div>
</div>
</section>
<section id="red-hat-fedora">
<h3>Red Hat/Fedora<a class="headerlink" href="#red-hat-fedora" title="Link to this heading">¶</a></h3>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="o">>></span> <span class="n">yum</span> <span class="n">install</span> <span class="n">valgrind</span> <span class="n">kcachegrind</span>
</pre></div>
</div>
</section>
</section>
<section id="preparation">
<h2>Preparation<a class="headerlink" href="#preparation" title="Link to this heading">¶</a></h2>
<p>To be most effective valgrind requires that debugging information be
present in the binaries that are being instrumented, although a full
debug build is not required. On gcc this means compiling with the <em>-g</em>
flag. For Mantid the recommended setup is to use a separate build with
the <em>CMAKE_BUILD_TYPE</em> set to <em>RelWithDebInfo</em>. This provides a good
balance between performance and availability of debugging information.</p>
</section>
<section id="running-the-profiler">
<h2>Running the Profiler<a class="headerlink" href="#running-the-profiler" title="Link to this heading">¶</a></h2>
<p>The profiler can instrument the code in a number of different ways. Some
of these will be described here. For more detail information see the
<a class="reference external" href="http://valgrind.org/docs/manual/cl-manual.html">callgrind manual</a>.</p>
<p>During execution the callgrind tool creates many output file named
<em>callgrind.output.pid.X</em>. For this reason it is recommended that each
profile run be executed from within a separate directory, named with a
description of the activity. This allows the separate profiled runs to
be found more easily in the future.</p>
<p><strong>Beware</strong>: The code will execute many times (factors of 10) slower than
when not profiling - this is just a consequence of how valgrind
instruments the code.</p>
<section id="profile-a-whole-program-run">
<h3>Profile a Whole Program Run<a class="headerlink" href="#profile-a-whole-program-run" title="Link to this heading">¶</a></h3>
<p>This is the simplest mode of operation. Simply pass the program
executable, along with any arguments to the valgrind executable:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="o">>></span> <span class="n">valgrind</span> <span class="o">--</span><span class="n">tool</span><span class="o">=</span><span class="n">callgrind</span> <span class="o">--</span><span class="n">dump</span><span class="o">-</span><span class="n">instr</span><span class="o">=</span><span class="n">yes</span> <span class="o">--</span><span class="n">simulate</span><span class="o">-</span><span class="n">cache</span><span class="o">=</span><span class="n">yes</span> <span class="o">--</span><span class="n">collect</span><span class="o">-</span><span class="n">jumps</span><span class="o">=</span><span class="n">yes</span> <span class="o"><</span><span class="n">executable</span><span class="o">></span> <span class="p">[</span><span class="n">args</span><span class="o">...</span><span class="p">]</span>
</pre></div>
</div>
</section>
<section id="profile-selected-portions-of-the-code">
<h3>Profile Selected Portions of the Code<a class="headerlink" href="#profile-selected-portions-of-the-code" title="Link to this heading">¶</a></h3>
<p>For larger pieces of code it can be quite likely that you wish to
profile only a selected portion of it. This is possible if you have a
access to recompile the source code of the binaries to be instrumented
as valgrind has a C api that can be used talk to the profiler. It uses a
set of
<a class="reference external" href="http://valgrind.org/docs/manual/cl-manual.html#cl-manual.clientrequests">macros</a>
to instruct the profiler what to do when it hits certain points of the
code.</p>
<p>As an example take a simple main function composed of several function
calls:</p>
<div class="highlight-cpp notranslate"><div class="highlight"><pre><span></span><span class="kt">int</span><span class="w"> </span><span class="nf">main</span><span class="p">()</span>
<span class="p">{</span>
<span class="w"> </span><span class="n">foo1</span><span class="p">();</span>
<span class="w"> </span><span class="n">bar1</span><span class="p">();</span>
<span class="w"> </span><span class="n">foo2</span><span class="p">();</span>
<span class="w"> </span><span class="n">foo3</span><span class="p">();</span>
<span class="p">}</span>
</pre></div>
</div>
<p>To profile only the call to <code class="docutils literal notranslate"><span class="pre">bar1()</span></code> then you would do the following
to the code:</p>
<div class="highlight-cpp notranslate"><div class="highlight"><pre><span></span><span class="cp">#include</span><span class="w"> </span><span class="cpf"><valgrind/callgrind.h></span>
<span class="kt">int</span><span class="w"> </span><span class="nf">main</span><span class="p">()</span>
<span class="p">{</span>
<span class="w"> </span><span class="n">foo1</span><span class="p">();</span>
<span class="w"> </span><span class="n">CALLGRIND_START_INSTRUMENTATION</span><span class="p">;</span>
<span class="w"> </span><span class="n">CALLGRIND_TOGGLE_COLLECT</span><span class="p">;</span>
<span class="w"> </span><span class="n">bar1</span><span class="p">();</span>
<span class="w"> </span><span class="n">CALLGRIND_TOGGLE_COLLECT</span><span class="p">;</span>
<span class="w"> </span><span class="n">CALLGRIND_STOP_INSTRUMENTATION</span><span class="p">;</span>
<span class="w"> </span><span class="n">foo2</span><span class="p">();</span>
<span class="w"> </span><span class="n">foo3</span><span class="p">();</span>
<span class="p">}</span>
</pre></div>
</div>
<p>After recompiling the code you would then run the profiler again but
this time adding the <em>–instr-atstart=no –collect-atstart=no</em> flags,
like so</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="o">>></span> <span class="n">valgrind</span> <span class="o">--</span><span class="n">tool</span><span class="o">=</span><span class="n">callgrind</span> <span class="o">--</span><span class="n">dump</span><span class="o">-</span><span class="n">instr</span><span class="o">=</span><span class="n">yes</span> <span class="o">--</span><span class="n">simulate</span><span class="o">-</span><span class="n">cache</span><span class="o">=</span><span class="n">yes</span> <span class="o">--</span><span class="n">collect</span><span class="o">-</span><span class="n">jumps</span><span class="o">=</span><span class="n">yes</span> <span class="o">--</span><span class="n">collect</span><span class="o">-</span><span class="n">atstart</span><span class="o">=</span><span class="n">no</span> <span class="o">--</span><span class="n">instr</span><span class="o">-</span><span class="n">atstart</span><span class="o">=</span><span class="n">no</span> <span class="o"><</span><span class="n">executable</span><span class="o">></span> <span class="p">[</span><span class="n">args</span><span class="o">...</span><span class="p">]</span>
</pre></div>
</div>
<p>The <em>–instru-at-start=no</em> is not strictly necessary but it will speed
up the code up until the profiling point at the cost of the less
accuracy about the cache usage. See
<a class="reference external" href="http://valgrind.org/docs/manual/cl-manual.html#opt.instr-atstart">here</a>
more details.</p>
</section>
</section>
<section id="visualisation">
<h2>Visualisation<a class="headerlink" href="#visualisation" title="Link to this heading">¶</a></h2>
<p>Callgrind produces a large amount of data about the program’s execution.
It is most easily understood using the <em>kcachegrind</em> GUI tool. This
reads the information produced by callgrind and creates a list of
function calls along with information on timings of each of the calls
during the profiled run. If the source code is available it can also
show the lines of code that relate to the functions being inspected.</p>
<figure class="align-default" id="id1">
<img alt="Example of KCachegrind display a profile of MantidPlot starting up and closing down" src="_images/KCachegrind_MantidPlot.png" />
<figcaption>
<p><span class="caption-text">Example of KCachegrind display a profile of MantidPlot starting up
and closing down</span><a class="headerlink" href="#id1" title="Link to this image">¶</a></p>
</figcaption>
</figure>
<p>By default KCachegrind shows the number of instructions fetched within
its displays. This can be changed using the drop-down box at the top
right of the screen. The <em>Instruction Fetch</em> and <em>Cycle Estimation</em> are
generally the most widely used and roughly correlate to the amount of
time spent performing the displayed functions.</p>
<p>Some of the key features display are:</p>
<section id="flat-profile-view">
<h3>Flat Profile View<a class="headerlink" href="#flat-profile-view" title="Link to this heading">¶</a></h3>
<ul class="simple">
<li><p>Incl. - Sum of itself + all child calls as a percentage of the whole.
Programs with little static allocation should have main() at 100%.
Units are those selected by the to-right drop-down</p></li>
<li><p>Self - Exclusive count spent in the selected function. Units are
those selected by the to-right drop-down</p></li>
<li><p>Called - Number of times the function was called.</p></li>
</ul>
</section>
<section id="function-call-detail">
<h3>Function Call Detail<a class="headerlink" href="#function-call-detail" title="Link to this heading">¶</a></h3>
<p>Click on function in flat view to get more detail on the right.</p>
<p>Displays details about the selected function call plus details about all
child calls it made. The <em>call graph</em> tab at the bottom gives a nice
graphical overview of the relative function cost.</p>
</section>
</section>
</section>
</div>
</div>
</div>
<footer class="footer">
<div class="container">
<ul class="nav navbar-nav" style=" float: right;">
<li>
<a href="ProfilingWithPerf.html" title="Previous Chapter: Profiling with perf"><span class="glyphicon glyphicon-chevron-left visible-sm"></span><span class="hidden-sm hidden-tablet">« Profiling with perf</span>
</a>
</li>
<li>
<a href="FlowchartCreation.html" title="Next Chapter: Flowchart Creation"><span class="glyphicon glyphicon-chevron-right visible-sm"></span><span class="hidden-sm hidden-tablet">Flowchart Creation »</span>
</a>
</li>
<li><a href="#">Back to top</a></li>
</ul>
<p>
</p>
</div>
</footer>
</body>
</html>