This document introduces Tai-e’s abstraction of the Java program being analyzed. You will likely need the classes introduced here when developing analyses on top of Tai-e. See Section 2 of Tai-e’s paper for further discussion.
- `JClass` (in `pascal.taie.language.classes`): represents classes in the program. Each instance contains various information about a class, such as its name, modifiers, and declared methods and fields.
- `JMethod` and `JField` (in `pascal.taie.language.classes`): represent class members, i.e., methods and fields in the program. Each `JMethod`/`JField` instance contains various information about a method/field, such as its declaring class and name.
- `ClassHierarchy` (in `pascal.taie.language.classes`): manages all the classes of the program. It offers APIs to query class hierarchy information, such as method dispatching and subclass checking.
- `Type` (in `pascal.taie.language.type`): represents types in the program. It has several subclasses, e.g., `PrimitiveType`, `ClassType`, and `ArrayType`, representing different kinds of Java types.
- `TypeSystem` (in `pascal.taie.language.type`): provides APIs for retrieving specific types and for subtype checking.
- `World` (in `pascal.taie`): manages the whole-program information. Through its getters, you can access this information, e.g., `ClassHierarchy` and `TypeSystem`. `World` is essentially a singleton class, and you can obtain the instance by calling `World.get()`.
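To make the two hierarchy queries mentioned above (subclass checking and method dispatching) more concrete, here is a minimal, self-contained sketch. It is a toy model, not Tai-e's implementation: `ToyHierarchy` and all of its methods are hypothetical names, classes and methods are plain strings, and interfaces, abstract methods, and method descriptors are ignored.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Toy model only: every name here is hypothetical and not part of Tai-e's API.
class ToyHierarchy {
    // Maps each class name to its direct superclass (null for the root class).
    private final Map<String, String> superOf = new HashMap<>();
    // Maps each class name to the names of the methods it declares.
    private final Map<String, Set<String>> declared = new HashMap<>();

    void addClass(String name, String superName, String... methods) {
        superOf.put(name, superName);
        declared.put(name, Set.of(methods));
    }

    /** Returns true if subName equals superName or transitively extends it. */
    boolean isSubclass(String superName, String subName) {
        for (String c = subName; c != null; c = superOf.get(c)) {
            if (c.equals(superName)) return true;
        }
        return false;
    }

    /** Walks up from receiverClass and returns the first class declaring method, or null. */
    String dispatch(String receiverClass, String method) {
        for (String c = receiverClass; c != null; c = superOf.get(c)) {
            if (declared.getOrDefault(c, Set.of()).contains(method)) return c;
        }
        return null;
    }

    static ToyHierarchy sample() {
        ToyHierarchy h = new ToyHierarchy();
        h.addClass("Object", null, "toString");
        h.addClass("Animal", "Object", "speak");
        h.addClass("Dog", "Animal", "speak");
        h.addClass("Cat", "Animal");
        return h;
    }

    public static void main(String[] args) {
        ToyHierarchy h = sample();
        System.out.println(h.dispatch("Dog", "speak"));    // Dog
        System.out.println(h.dispatch("Cat", "speak"));    // Animal
        System.out.println(h.isSubclass("Animal", "Cat")); // true
    }
}
```

In this sketch, `Cat` does not declare `speak`, so dispatching on a `Cat` receiver selects the declaration inherited from `Animal`.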
Tai-e IR is a typed, 3-address, statement- and expression-based representation of Java method bodies.
You can dump the IR for the classes of the input program to `.tir` files via the option `-a ir-dumper`. By default, Tai-e dumps IR to its default output directory, `output/`. To dump IR to a specific directory, use the option `-a ir-dumper=dump-dir:path/to/dir`. `ir-dumper` is implemented as a class analysis, so the set of classes it dumps is affected by the option `-scope`.
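If you assemble Tai-e's command-line arguments programmatically, the two forms of the `ir-dumper` option described above could be built as in the following sketch. `IrDumperArgs` and `build` are hypothetical helper names introduced only for illustration. The sketch produces just the `-a` part of the command line; the other options a real run needs (e.g., the input program and `-scope`) are deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Tai-e: it only assembles option strings.
class IrDumperArgs {
    /**
     * Builds the "-a" arguments for ir-dumper.
     * Pass null as dumpDir to keep Tai-e's default output directory.
     */
    static List<String> build(String dumpDir) {
        List<String> args = new ArrayList<>();
        args.add("-a");
        args.add(dumpDir == null ? "ir-dumper" : "ir-dumper=dump-dir:" + dumpDir);
        return args;
    }

    public static void main(String[] unused) {
        System.out.println(String.join(" ", build(null)));          // -a ir-dumper
        System.out.println(String.join(" ", build("path/to/dir"))); // -a ir-dumper=dump-dir:path/to/dir
    }
}
```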
The IR classes reside in package `pascal.taie.ir` and its sub-packages.
There are three core classes in Tai-e IR:
- `IR` is the central data structure of the intermediate representation in Tai-e. Each `IR` instance can be seen as a container of the information for the body of a particular method, such as variables, parameters, and statements. You can obtain the `IR` instance of a method via `JMethod.getIR()` (provided the method is not abstract).
- `Stmt` represents all statements in the program. This interface has a dozen subclasses, corresponding to various statements. `Stmt`s are stored in `IR`, and you can obtain them via `IR.getStmts()`.
- `Exp` represents all expressions in the program. This interface has dozens of subclasses, corresponding to various expressions. `Exp`s are associated with `Stmt`s, and you can obtain them via specific APIs of `Stmt`.
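The containment relationship described above (an `IR` holds `Stmt`s, and each `Stmt` exposes its `Exp`s) leads to a common traversal pattern: loop over the statements of a method body and inspect the expressions attached to each. The sketch below shows that pattern on a toy model. `ToyIR`, `ToyStmt`, and `ToyExp` are hypothetical stand-ins, not Tai-e's classes, and each toy statement lists only its top-level expressions.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: these records are hypothetical stand-ins for IR, Stmt, and Exp.
class ToyIrTraversal {
    /** An expression, reduced to its kind and its textual form. */
    record ToyExp(String kind, String text) {}

    /** A statement that owns the expressions appearing in it. */
    record ToyStmt(String text, List<ToyExp> exps) {}

    /** The per-method container of parameters and statements. */
    record ToyIR(List<String> params, List<ToyStmt> stmts) {
        List<ToyStmt> getStmts() { return stmts; }
    }

    /** Counts expressions by kind across all statements of one method body. */
    static Map<String, Integer> countExpKinds(ToyIR ir) {
        Map<String, Integer> counts = new TreeMap<>();
        for (ToyStmt stmt : ir.getStmts()) {
            for (ToyExp exp : stmt.exps()) {
                counts.merge(exp.kind(), 1, Integer::sum);
            }
        }
        return counts;
    }

    /** A three-statement body: c = a + b; d = c; return d; */
    static ToyIR sample() {
        return new ToyIR(List.of("a", "b"), List.of(
                new ToyStmt("c = a + b;", List.of(
                        new ToyExp("Var", "c"), new ToyExp("BinaryExp", "a + b"))),
                new ToyStmt("d = c;", List.of(
                        new ToyExp("Var", "d"), new ToyExp("Var", "c"))),
                new ToyStmt("return d;", List.of(
                        new ToyExp("Var", "d")))));
    }

    public static void main(String[] args) {
        System.out.println(countExpKinds(sample())); // {BinaryExp=1, Var=4}
    }
}
```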
We believe that the API of IR is self-documenting and easy to use. To make the IR more intelligible, we present a formal definition (i.e., a context-free grammar) below that illustrates all kinds of expressions and statements in the IR, and how `Stmt`s are formed from `Exp`s. Most non-terminals in the grammar correspond to classes in `pascal.taie.ir`.
Exp → Var | Literal | FieldAccess | ArrayAccess | NewExp | InvokeExp | UnaryExp | BinaryExp | InstanceOfExp | CastExp
- Var → Identifier
- Literal → IntLiteral | LongLiteral | FloatLiteral | DoubleLiteral | StringLiteral | ClassLiteral | NullLiteral | MethodHandle | MethodType
- FieldAccess → InstanceFieldAccess | StaticFieldAccess
  - InstanceFieldAccess → Var.FieldRef
  - StaticFieldAccess → FieldRef
  - FieldRef → <ClassType: Type FieldName>
  - FieldName → Identifier
- ArrayAccess → Var[Var]
- NewExp → NewInstance | NewArray | NewMultiArray
  - NewInstance → new ClassType
  - NewArray → new Type[Var]
  - NewMultiArray → new Type LengthList EmptyList
  - LengthList → [Var] | [Var]LengthList
  - EmptyList → ε | []EmptyList
- InvokeExp → InvokeVirtual | InvokeInterface | InvokeSpecial | InvokeStatic | InvokeDynamic
  - InvokeVirtual → invokevirtual Var.MethodRef(ArgList)
  - InvokeInterface → invokeinterface Var.MethodRef(ArgList)
  - InvokeSpecial → invokespecial Var.MethodRef(ArgList)
  - InvokeStatic → invokestatic MethodRef(ArgList)
  - InvokeDynamic → invokedynamic BootstrapMethodRef MethodName MethodType [BootstrapArgList] (ArgList)
  - MethodRef → <ClassType: Type MethodName(TypeList)>
  - MethodName → Identifier
  - TypeList → ε | Type TypeList'
  - TypeList' → ε | , Type TypeList'
  - ArgList → ε | Var ArgList'
  - ArgList' → ε | , Var ArgList'
  - BootstrapMethodRef → MethodRef
  - BootstrapArgList → ε | Literal BootstrapArgList'
  - BootstrapArgList' → ε | , Literal BootstrapArgList'
- UnaryExp → NegExp | ArrayLengthExp
  - NegExp → !Var
  - ArrayLengthExp → Var.length
- BinaryExp → ArithmeticExp | BitwiseExp | ComparisonExp | ConditionExp | ShiftExp
  - ArithmeticExp → Var ArithmeticOp Var
  - ArithmeticOp → + | - | * | / | %
  - BitwiseExp → Var BitwiseOp Var
  - BitwiseOp → "|" | & | ^
  - ComparisonExp → Var ComparisonOp Var
  - ComparisonOp → cmp | cmpl | cmpg
  - ConditionExp → Var ConditionOp Var
  - ConditionOp → == | != | < | > | <= | >=
  - ShiftExp → Var ShiftOp Var
  - ShiftOp → << | >> | >>>
- InstanceOfExp → Var instanceof Type
- CastExp → (Type) Var
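As a worked reading of a few expression productions, the following sketch builds strings that are derivable from `MethodRef`, `InvokeVirtual`, and `NewMultiArray`. It is only an illustration of the grammar: `ToyExpPrinter` and its methods are hypothetical names, types and variables are plain strings, and the spacing around commas is an assumption, since the grammar does not fix whitespace.

```java
import java.util.List;

// Hypothetical helper that spells out a few productions of the Exp grammar as strings.
class ToyExpPrinter {
    /** MethodRef → <ClassType: Type MethodName(TypeList)> */
    static String methodRef(String declaringClass, String returnType,
                            String name, List<String> paramTypes) {
        return "<" + declaringClass + ": " + returnType + " " + name
                + "(" + String.join(",", paramTypes) + ")>";
    }

    /** InvokeVirtual → invokevirtual Var.MethodRef(ArgList) */
    static String invokeVirtual(String base, String methodRef, List<String> args) {
        return "invokevirtual " + base + "." + methodRef
                + "(" + String.join(", ", args) + ")";
    }

    /** NewMultiArray → new Type LengthList EmptyList */
    static String newMultiArray(String elementType, List<String> lengths, int emptyDims) {
        if (lengths.isEmpty()) {
            // LengthList has no ε alternative, so at least one [Var] is required.
            throw new IllegalArgumentException("LengthList needs at least one [Var]");
        }
        StringBuilder sb = new StringBuilder("new ").append(elementType);
        for (String len : lengths) {
            sb.append('[').append(len).append(']');
        }
        sb.append("[]".repeat(emptyDims)); // EmptyList → ε | []EmptyList
        return sb.toString();
    }

    public static void main(String[] args) {
        String ref = methodRef("java.lang.String", "int", "indexOf", List.of("int", "int"));
        System.out.println(ref);
        System.out.println(invokeVirtual("s", ref, List.of("a", "b")));
        System.out.println(newMultiArray("int", List.of("n", "m"), 1)); // new int[n][m][]
    }
}
```

Note that `LengthList` requires at least one `[Var]`, while `EmptyList` may be empty, which is why the sketch rejects an empty `lengths` list but accepts `emptyDims == 0`.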
Stmt → AssignStmt | JumpStmt | Invoke | Return | Throw | Catch | Monitor | Nop
- AssignStmt → New | AssignLiteral | Copy | LoadArray | StoreArray | LoadField | StoreField | Unary | Binary | InstanceOf | Cast
  - New → Var = NewExp;
  - AssignLiteral → Var = Literal;
  - Copy → Var = Var;
  - LoadArray → Var = ArrayAccess;
  - StoreArray → ArrayAccess = Var;
  - LoadField → Var = FieldAccess;
  - StoreField → FieldAccess = Var;
  - Unary → Var = UnaryExp;
  - Binary → Var = BinaryExp;
  - InstanceOf → Var = InstanceOfExp;
  - Cast → Var = CastExp;
- JumpStmt → Goto | If | Switch
  - Goto → goto Label;
  - If → if ConditionExp goto Label;
  - Switch → TableSwitch | LookupSwitch
  - TableSwitch → tableswitch (Var) { CaseList default: goto Label; }
  - LookupSwitch → lookupswitch (Var) { CaseList default: goto Label; }
  - Label → IntLiteral
  - CaseList → ε | case IntLiteral: goto Label; CaseList
- Invoke → InvokeExp; | Var = InvokeExp;
- Return → return; | return Var;
- Throw → throw Var;
- Catch → catch Var;
- Monitor → monitorenter Var; | monitorexit Var;
- Nop → nop;
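To see how a handful of these statement forms fit together, here is a small sketch that models `AssignLiteral`, `Binary`, `If`, `Goto`, and `Return` and executes them. Everything in it is hypothetical and unrelated to Tai-e's classes. In particular, it assumes that a `Label` is the 0-based index of the target statement in the method body, and it supports only `+`, `-`, and `==` on `int` variables.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy interpreter for a few Stmt productions; all names are hypothetical.
class ToyStmtInterpreter {
    interface Stmt {}
    record AssignLiteral(String lhs, int value) implements Stmt {}             // Var = Literal;
    record Binary(String lhs, String a, char op, String b) implements Stmt {} // Var = Var op Var;
    record If(String a, String b, int target) implements Stmt {}              // if Var == Var goto Label;
    record Goto(int target) implements Stmt {}                                // goto Label;
    record Return(String var) implements Stmt {}                              // return Var;

    static int arith(char op, int l, int r) {
        switch (op) {
            case '+': return l + r;
            case '-': return l - r;
            default: throw new IllegalArgumentException("unsupported op: " + op);
        }
    }

    /** Executes the statements, treating each label as a statement index (an assumption). */
    static int run(List<Stmt> stmts) {
        Map<String, Integer> env = new HashMap<>();
        int pc = 0;
        while (true) {
            Stmt s = stmts.get(pc);
            if (s instanceof AssignLiteral lit) {
                env.put(lit.lhs(), lit.value());
                pc++;
            } else if (s instanceof Binary bin) {
                env.put(bin.lhs(), arith(bin.op(), env.get(bin.a()), env.get(bin.b())));
                pc++;
            } else if (s instanceof If cond) {
                boolean taken = env.get(cond.a()).equals(env.get(cond.b()));
                pc = taken ? cond.target() : pc + 1;
            } else if (s instanceof Goto jump) {
                pc = jump.target();
            } else if (s instanceof Return ret) {
                return env.get(ret.var());
            } else {
                throw new IllegalStateException("unsupported statement: " + s);
            }
        }
    }

    /** Body that sums n + (n-1) + ... + 1 with a loop, for n >= 0. */
    static List<Stmt> sumProgram(int n) {
        return List.of(
                new AssignLiteral("i", n),           // 0: i = n;
                new AssignLiteral("sum", 0),         // 1: sum = 0;
                new AssignLiteral("one", 1),         // 2: one = 1;
                new AssignLiteral("zero", 0),        // 3: zero = 0;
                new If("i", "zero", 8),              // 4: if i == zero goto 8;
                new Binary("sum", "sum", '+', "i"),  // 5: sum = sum + i;
                new Binary("i", "i", '-', "one"),    // 6: i = i - one;
                new Goto(4),                         // 7: goto 4;
                new Return("sum"));                  // 8: return sum;
    }

    public static void main(String[] args) {
        System.out.println(run(sumProgram(4))); // 10
    }
}
```

Because the IR is 3-address, the constants `1` and `0` are first assigned to variables (`one`, `zero`) so that `Binary` and `If` operate on variables only, as the grammar requires.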