-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathsrc2srcml.html
298 lines (220 loc) · 11.6 KB
/
src2srcml.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
<PRE>
<!-- Manpage converted by man2html 3.0.1 -->
</PRE>
<H2>SYNOPSIS</H2><PRE>
<B>src2srcml</B> [<B>-hVnizcgv</B>] [<B>-l</B> <I>language</I>] [<B>-d</B> <I>directory</I>] [<B>-f</B> <I>filename</I>] [<B>-s</B>
<I>version</I>] [<B>-x</B> <I>encoding</I>] [<B>-t</B> <I>encoding</I>] [<I>input-source-code-</I>
<I>file</I>]... [<B>-o</B> <I>output-srcML-file</I>]
</PRE>
<H2>DESCRIPTION</H2><PRE>
The program <B>src2srcml</B> translates source-code files into the XML source-
code representation srcML. The srcML format allows the use of XML for
addressing, querying, and transformation of source code. All text from
the original source-code file is preserved including white-space, com-
ments, and preprocessor statements. No preprocessing of the source code
is done. In addition, the tool can be applied to individual source-code
files, or code fragments.
The translation is fast and uses a stream-parsing approach with top-
down parsing and elements issued as soon as they are detected.
The program <B>src2srcml</B> is typically used with <B>srcml2src</B> which converts
from the srcML format back to source code. Conversion of a source-code
file through <B>src2srcml</B> and then through <B>srcml2src</B> produces the original
source-code file. The program <B>srcml2src</B> also provides a set of utili-
ties for working with srcML files, including efficient querying and
transformation of source code.
Standard input may be specified by using the character - in the place
of an input source-code file or providing no input source-code file.
Similarly, standard output may be specified with the - character for
the output srcML file or by not providing an output srcML file.
A source-code language was be specified when input is from standard in-
put.
</PRE>
<H2>OPTIONS</H2><PRE>
<B>-h,</B> <B>--help</B>
Output the help and exit.
<B>-V,</B> <B>--version</B>
Output the version of src2srcml then exit.
<B>-e,</B> <B>--expression</B>
Translates a single, standalone expression.
<B>-n,</B> <B>--archive</B>
Stores all input source files into one srcML archive. Default
with more then one input file, a directory, or the <B>--files-from</B>
option.
<B>--files-from</B>
Treats the input file as a list of source files. Each file is
separately translated and collectively stored into a single sr-
cML archive. The list has a single filename on each line start-
the command <B>iconv</B> <B>-l</B>.
<B>--no-xml-declaration</B>
No output of the default XML declaration. Useful when the output
is to be placed inside another XML document.
<B>--no-namespace-decl</B>
No output of namespace declarations. Useful when the output is
to be placed inside another XML document.
<B>-z,</B> <B>--compress</B>
Output is in compressed gzip format. This format can be direct-
ly, and automatically, read by <B>srcml2src</B>.
<B>-c,</B> <B>--interactive</B>
Default is to use buffered output for speed. For interactive ap-
plications output is issued as soon as parsed.
For input from terminal, interactive is default.
<B>-g,</B> <B>--debug</B>
When translation errors occur src2srcml preserves all text, but
may issue incorrect markup. In debug mode the text with the
translation error is marked with a special set of tags with the
prefix err from the namespace http://www.sdml.info/srcML/srcerr.
Debug mode can also be indicated by defining a prefix for this
namespace URL, e.g., <B>--xmlns:err="http://www.sdml.info/sr-</B>
<B>cML/srcerr"</B>.
<B>-v,</B> <B>--verbose</B>
Conversion and status information to stderr, including encodings
used. Especially useful with for monitoring progress of the the
<B>--files-from</B> option, a directory, or source-code archive (e.g.
tar.gz). The signal SIGUSR1 can be used to toggle this option.
</PRE>
<H2>METADATA OPTIONS</H2><PRE>
This set of options allows control over various metadata stored in the
srcML document.
<B>-l,</B> <B>--language=</B><I>language</I>
The programming language of the source-code file. Allowable val-
ues are C, C++, C#, Java, or AspectJ. The language affects pars-
ing, the allowed markup, and what is considered a keyword. The
value is also stored individually as an attribute in each unit
element.
If not specified, the programming language is based on the file
extension. If the file extension is not available or not in the
standard list, then the program will terminate.
<B>--register-ext=</B><I>extention=language</I>
Sets the extensions to associate with a given language. Note:
<B>-d,</B> <B>--directory=</B><I>directory</I>
The value of the directory attribute is typically obtained from
the path of the input filename. This option allows you to speci-
fy a different directory for standard input or where the direc-
tory is not contained in the input path.
<B>-f,</B> <B>--filename=</B><I>filename</I>
The value of the filename attribute is typically obtained from
the input filename. This option allows you to specify a differ-
ent filename for standard input or where the filename is not
contained in the input path.
<B>-s,</B> <B>--src-version=</B><I>version</I>
Sets the value of the attribute version to <I>version</I>. This is a
purely-descriptive attribute, where the value has no interpreta-
tion by src2srcml.
</PRE>
<H2>MARKUP EXTENSIONS</H2><PRE>
Each extensions to the srcML markup has its own namespace. These are
indicated in the srcML document by the declaration of the specific ex-
tension namespace. These flags make it easier to declare.
<B>--literal</B>
Additional markup of literal values using the element literal
with the prefix "lit" in the namespace "http://www.sdml.info/sr-
cML/literal".
Can also be specified by declaring a prefix for literal names-
pace using the <B>--xmlns</B> option, e.g.,
<B>--xmlns:lit="http://www.sdml.info/srcML/literal"</B>
<B>--operator</B>
Additional markup of operators values using the element operator
with the prefix "op" in the namespace "http://www.sdml.info/sr-
cML/operator".
Can also be specified by declaring a prefix for operator names-
pace using the <B>--xmlns</B> option, e.g.,
<B>--xmlns:op="http://www.sdml.info/srcML/operator"</B>
<B>--modifier</B>
Additional markup of type modifiers using the element modifier
with the prefix "type" in the namespace "http://www.sdml.in-
fo/srcML/modifier".
Can also be specified by declaring a prefix for the modifier
namespace using the <B>--xmlns</B> option, e.g.,
<B>--xmlns:type="http://www.sdml.info/srcML/modifier"</B>
</PRE>
<H2>LINE/COLUMN POSITION</H2><PRE>
Optional line and column attributes are used to indicate the position
There is a set of standard URIs for the elements in srcML, each with a
predefined prefix. The predefined URIs and prefixes for them include
(given in xmlns notation):
PREFIX URI
(nil) http://www.sdml.info/sr-
cML/src
cpp http://www.sdml.info/sr-
cML/cpp
err http://www.sdml.info/sr-
cML/srcerr
lit http://www.sdml.info/sr-
cML/literal
op http://www.sdml.info/sr-
cML/operator
type http://www.sdml.info/sr-
cML/modifier
pos http://www.sdml.info/sr-
cML/position
The following options can be used to change the prefixes.
<B>--xmlns=</B><I>URI</I>
Sets the URI for the default namespace.
<B>--xmlns:</B><I>PREFIX</I><B>=</B><I>URI</I>
Sets the namespace prefix PREFIX for the namespace URI.
These options are an alternative way to turn on options by declaring
the URI for an option. See the MARKUP EXTENSIONS for examples.
</PRE>
<H2>CPP MARKUP OPTIONS</H2><PRE>
This set of options allows control over how preprocessing regions are
handled, i.e., whether parsing and markup occur. In all cases the text
is preserved.
<B>--cpp</B> Turns on parsing and markup of preprocessor statements in non-
C/C++ languages such as Java. Can also be enabled by defining a
prefix for this cpp namespace URL, e.g.,
<B>--xmlns:cpp="http://www.sdml.info/srcML/cpp"</B>.
<B>--cpp-markup-else</B>
Place markup in #else and #elif regions. Default.
<B>--cpp-text-else</B>
Only place text in #else and #elif regions leaving out markup.
<B>--cpp-markup-if0</B>
Place markup in #if 0 regions.
<B>--cpp-text-if0</B>
Only place text in #if 0 regions leaving out markup. Default.
files in srcML archives.
</PRE>
<H2>USAGE</H2><PRE>
To translate the C++ source-code file main.cpp into the srcML file
main.cpp.xml:
<B>src2srcml</B> main.cpp -o main.cpp.xml
To translate a C source-code file main.c into the srcML file
main.c.xml:
<B>src2srcml</B> --language=C main.c -o main.c.xml
To translate a Java source-code file main.java into the srcML file
main.java.xml:
<B>src2srcml</B> --language=Java main.java -o main.java.xml
To specify the directory, filename, and version for an input file from
standard input:
<B>src2srcml</B> --directory=src --filename=main.cpp --version=1 - -o
main.cpp.xml
To translate a source-code file in ISO-8859-1 encoding into a srcML
file with UTF-8 encoding:
<B>src2srcml</B> --src-encoding=ISO-8859-1 --encoding=UTF-8 main.cpp -o
main.cpp.xml
</PRE>
<H2>RETURN STATUS</H2><PRE>
0: Normal
1: Error
2: Problem with input file
3: Unknown option
4: Unknown encoding
6: Invalid language
7: Invalid combination of options
8: Incomplete output due to termination
</PRE>
<H2>CAVEATS</H2><PRE>
Translation is performed based on local information with no symbol ta-
ble. For non-CFG languages, i.e., C/C++, and with macros this may lead
to incorrect markup.
</PRE>
<H2>AUTHOR</H2><PRE>
Written by Michael L. Collard and Huzefa Kagdi
src2srcml 1.0 Wed May 21 10:36:09 EDT 2014 <B>src2srcml(1)</B>
</PRE>
<HR>
<ADDRESS>
Man(1) output converted with
<a href="http://www.oac.uci.edu/indiv/ehood/man2html.html">man2html</a>
</ADDRESS>
</BODY>
</HTML>