Writing Scripts
- Overview
- Wildcard Support
- Using instruction generators
- Using placeholder functions
- Module Scripts
- Tips & Tricks
All the scripts in WARP are written using QJS - an extended version of standard JS (ES7 + some of the newer features like nullish coalescing).
Since there are a heap of tutorials and documents available on the internet for ES7, I will not be repeating those. (There are some Tips & Tricks listed at the end of the page though).
Instead, we will be focusing on the extra stuff provided by the tool in the form of Inbuilt & Scripted APIs.
Whenever we write scripts, be it for patches or extensions, we need to start from a specific address in the Exe.
This address is often found by looking for a specific pattern of code using one of the following functions:
Now let us consider the patterns themselves.
Sometimes it is as simple as:
' 8B C8' //mov ecx, eax
+ ' 6A 12' //push 12
;
But what if we need to generalize it a bit?
Say we don't know which register ECX gets its value from, and the value being PUSHed is a number < 16.
' 8B <modrm byte>' //mov ecx, <unknownReg>
+ ' 6A <num less than 16>' //push <unknownByte>
;
To express these, the aforementioned 4 functions support the use of wildcard characters (placeholders) in 2 forms:
- Nibble based (`?`)

  The question mark character `?` is used for simple partial matches against any nibble in a hex string. For e.g.
  - `?1` will match with `01`, `11`, ..., `E1` & `F1`
  - `5?` will match with `50`, `51`, ..., `5E` & `5F`

- Bit based (`[.]`)

  For more complex partial matches we make use of the combination of characters `[`, `.` and `]` where:
  - `[` serves as the starting delimiter
  - `.` is the wildcard character matching any bit
  - `]` serves as the ending delimiter

  For e.g.
  - `[...1...1]` will match all bytes with `1` at the 4th and 0th bit positions.
  - `[0.......]` will match all positive bytes (since the sign bit is 0)

  As you can see, one caveat is that each byte needs to be completely represented.

You can have a mix of these 2 wildcards within the same hex string but you cannot mix them up within the same byte.
Building on our example from earlier:
- for the unknown register, the last 3 bits of the mod r/m byte need to be wildcards.
- for the number < 16, the upper nibble is 0 and the lower nibble needs to be a wildcard.

Thus we get the following:
' 8B [11001...]' //mov ecx, <unknownReg>
+ ' 6A 0?' //push <unknownByte>
;
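To make the matching rules concrete, here is a minimal sketch of how such a pattern could be checked against raw bytes. It is not WARP's implementation; `compilePattern` and `matchesAt` are hypothetical names used only for illustration. Each byte of the pattern is turned into a mask (which bits are fixed) and a value (what those bits must be).

```javascript
// Hypothetical helpers, NOT part of WARP: they only illustrate the wildcard rules.
// Turns a wildcard hex string into one { mask, value } pair per byte.
function compilePattern(pattern) {
    // A byte is either a bit group like [11001...] or two nibbles like 0? / 8B
    const tokens = pattern.match(/\[[01.]{8}\]|[0-9A-Fa-f?]{2}/g) || [];
    return tokens.map((tok) => {
        let mask = 0;
        let value = 0;
        if (tok.startsWith('[')) {
            // Bit based: '.' leaves the bit free, '0'/'1' fixes it
            for (const bit of tok.slice(1, 9)) {
                mask = (mask << 1) | (bit === '.' ? 0 : 1);
                value = (value << 1) | (bit === '1' ? 1 : 0);
            }
        } else {
            // Nibble based: '?' leaves the whole nibble free
            for (const nib of tok) {
                mask = (mask << 4) | (nib === '?' ? 0x0 : 0xF);
                value = (value << 4) | (nib === '?' ? 0 : parseInt(nib, 16));
            }
        }
        return { mask, value };
    });
}

// True when every pattern byte matches the data starting at 'offset'.
function matchesAt(bytes, offset, compiled) {
    if (offset + compiled.length > bytes.length) {
        return false;
    }
    return compiled.every(({ mask, value }, i) => (bytes[offset + i] & mask) === value);
}

const pat = compilePattern(' 8B [11001...] 6A 0?');
console.log(matchesAt([0x8B, 0xC8, 0x6A, 0x0C], 0, pat)); // true  (mov ecx, eax ; push 0Ch)
console.log(matchesAt([0x8B, 0xC8, 0x6A, 0x12], 0, pat)); // false (12h is not < 16)
```

The mask/value split is just one convenient way to model the two wildcard forms with the same comparison.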
Writing the patterns as shown above is perfectly fine but it becomes cumbersome as the code gets bigger and we need to have more complex patterns.
For e.g. consider the following
//movzx eax, dword ptr [unknReg1 * 4 + unknReg2]
//jmp dword ptr [eax*4 + unknAddr]
//mov unknReg3, <positive number>
//push dword ptr [unknReg3]
To make the pattern for this we need to calculate the mod r/m & sib byte of each instruction.
For one of them even the opcode varies as shown below.
' 0F B7 04 [10......]' //movzx eax, dword ptr [unknReg1 * 4 + unknReg2]
+ ' FF 24 85 ?? ?? ?? 00' //jmp dword ptr [eax*4 + unknAddr]
+ ' [10111...] ?? ?? ?? 00' //mov unknReg3, <positive number>
+ ' FF [00110...]' //push dword ptr [unknReg3]
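The bit groups above follow from the 2-3-3 bit layout shared by the mod r/m byte (mod, reg, r/m) and the SIB byte (scale, index, base). As a rough sketch of the manual work involved, the hypothetical helper below (not a WARP function) builds such a bit group, using `null` for any field that is unknown.

```javascript
// Hypothetical helper, NOT part of WARP: formats one field as fixed bits or wildcards.
function bits(value, width) {
    return value === null
        ? '.'.repeat(width)
        : value.toString(2).padStart(width, '0');
}

// Builds a bit based wildcard for a mod r/m or SIB byte (2 + 3 + 3 bits).
function bytePattern(top2, mid3, low3) {
    return '[' + bits(top2, 2) + bits(mid3, 3) + bits(low3, 3) + ']';
}

console.log(bytePattern(3, 1, null));       // [11001...] : mod=11, reg=ECX(1), r/m unknown
console.log(bytePattern(0, 6, null));       // [00110...] : mod=00, /6 for PUSH, r/m unknown
console.log(bytePattern(2, null, null));    // [10......] : scale=4, index & base unknown
```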
To make the whole process easier it is recommended to use the Instruction generators to create these hex codes for you.
The same pattern can be written using these functions as
MOVZX(EAX, [4, R32, R32]) //movzx eax, dword ptr [unknReg1 * 4 + unknReg2]
+ JMP([4, EAX, POS3WC]) //jmp dword ptr [eax*4 + unknAddr]
+ MOV(R32, POS3WC) //mov unknReg3, <positive number>
+ PUSH([R32]) //push dword ptr [unknReg3]
`POS3WC` is a hex string defined by Support scripts and is equal to `?? ?? ?? 00`.
As you can see, the functions take a form that is as close to their ASM equivalent as possible, thereby making the code readable as well.
Now let's take a deeper dive into how to use them.
There are 3 types of arguments that a generator function accepts:
Argument Type | Description |
---|---|
Immediate value | A number or its equivalent hex string. Its use varies with the instruction, but it will never appear as the target in any instruction |
Register | An object representing 1 of the CPU registers. It can also be one of the placeholder Register objects. |
Memory Pointer | A location in memory. It is written enclosed within [ ] (mimicking the style of debuggers/disassemblers). |
- Instructions which take only 1 argument can use all 3 types (most of the time).
- Most of the CPU instructions (& consequently their generators) take 2 arguments.
  In this case the first argument serves as the target location and the second argument serves as the source.
  For e.g.

      ADD(EDX, ECX) //this is essentially doing EDX = EDX + ECX

- When a Register is the target location, all 3 types can serve as the source (provided the instruction allows it).
- When a Memory Pointer is the target, then the source needs to be either an Immediate value or a Register.
- Some instructions also take an additional 3rd argument and this will always be an immediate value.
- Among the Registers, the one with index 0 is special (for e.g. `EAX` is special and so is `ST0`).
  Some instructions take up different opcodes when these are involved, but only when the source location deals with numbers / hex strings alone,
  i.e. either an immediate value OR a direct memory location in the form `[displacement]`.

As mentioned in the Register class' page, all the known CPU registers already have their objects created and we should be using those.
Standalone, these registers don't have much use, but their power comes forth when used with the Instruction generators.
Let's look at an example of how they can be specified.
let code =
MOV(ECX, '18') //mov ecx, 18h
+ ADD(ECX, EDX) //add ecx, edx
+ PUSH(ECX) //push ecx
+ CALL(POS4WC) //call func#1
;
`POS4WC` is a hex string defined by Support scripts and is equal to `?? ?? ?? 0?`.
Most of the time we end up with a situation where we know the size of the register but not which one.
In these cases, we can make use of the Placeholder registers mentioned in the Register page.
For e.g. in our earlier code, changing ECX to <any 32 bit register> we end up with the following :
let code =
MOV(R32, '18') //mov regA, 18h
+ ADD(R32, EDX) //add regA, edx
+ PUSH(R32) //push regA
+ CALL(POS4WC) //call func#1
;
As specified earlier, a memory location is specified by enclosing it within `[ ]`.
It takes the general form `[scale, index, base, displacement]` to represent `[scale * index + base + displacement]`, where
- scale = One of `1`, `2`, `4` or `8`
- index = Secondary indexing Register
- base = Primary Register holding an address/offset
- displacement = An address/offset. It can be either a number or an equivalent hex string.

All the 4 parts are optional, but at least 1 needs to be specified obviously.
Also, the order matters (`[EDX, 4]` is completely different from `[4, EDX]`).
Let's see some examples:
MOV(EAX, [EDX, WCp]) //mov eax, dword ptr [edx + disp8A] ; 8 bit displacement
+ ADD(EAX, [R32, EBX]) //add eax, dword ptr [regA + ebx]
+ PUSH(EAX) //push eax
+ MOV(EAX, [POS3WC]) //mov eax, dword ptr [addr#1] ; displacement only
+ CALL([EAX]) //call dword ptr [eax]
+ MOV(EAX, [4, EAX, POS3WC]) //mov eax, dword ptr [eax*4 + addr#2]
;
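To make the `scale * index + base + displacement` form concrete, the sketch below works out the address the CPU would actually access for a given set of register values. `effectiveAddress` is a hypothetical helper written for this page, not a WARP function, and it takes named fields instead of WARP's positional array purely for readability.

```javascript
// Hypothetical helper, NOT part of WARP: computes scale * index + base + displacement.
// Missing parts default to values that leave the sum unchanged.
function effectiveAddress({ scale = 1, index = 0, base = 0, displacement = 0 }) {
    return (scale * index + base + displacement) >>> 0; // keep it an unsigned 32-bit value
}

// [4, EAX, addr] with EAX = 3 : fourth DWORD entry of a table at 0x401000
const entry = effectiveAddress({ scale: 4, index: 3, displacement: 0x401000 });
console.log(entry.toString(16)); // 40100c

// [EDX, disp8] with EDX = 0x500000 : field at offset 0x30 of an object
const field = effectiveAddress({ base: 0x500000, displacement: 0x30 });
console.log(field.toString(16)); // 500030
```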
As you can see the pointer size of the memory location is automatically detected in all of these instructions.
However, there will be scenarios where the size needs to be manually specified. In such cases we need to provide the corresponding pre-defined PtrSize object as an argument.
For e.g.
MOV(BYTE_PTR, [ECX, EAX], 6) //mov byte ptr [ecx+eax], 6
Without the `BYTE_PTR` argument the location would be treated as a `DWORD_PTR`.
Immediate values are simply integers or their equivalent hex strings. There is not much more to say about them.
Let's see an example.
MOV(ECX, 0x10001)
+ ADD(ECX, POS3WC)
;
In addition to the arguments discussed so far, there are also Instruction Prefix objects available to be used as arguments.
As shown in that page, the last 3 of them also have functions that wrap over the generators.
Most of the time the `ADCH` and `OP16` requirement can be inferred from the remaining arguments, so we do not need to specify them explicitly.
The others however need to be explicitly specified.
Let's see some examples
MOV(EAX, FS, [0]) //mov eax, dword ptr fs:[0]
MOV(ES, [4, ECX, 90], DX) //mov word ptr es:[ecx*4 + 90h], dx
LOCK(NEG(EAX)) //lock neg eax
As you can see, these prefixes can be provided in any position.
We have already seen many examples, but here are a couple more to get a feel for the whole thing.
Instruction in ASM | Equivalent in code | Comments |
---|---|---|
`mov esi, ecx` | `MOV(ESI, ECX)` | |
`mov edi, 1000h` | `MOV(EDI, 0x1000)` | |
`push ebx` | `PUSH(EBX)` | |
`push <16 bit reg>` | `PUSH(R16)` | |
`push <32 bit reg>` | `PUSH_R` | Here we are using a value from the list of pre-defined instructions |
`mov al, byte ptr ds:[bx + di + 30]` | `MOV(AL, [BX, DI, 0x30])` | Since `AL` is the target, the size is already known, so there is no need to explicitly specify `BYTE_PTR` |
`cmp byte ptr ds:[4*edx], 1` | `CMP(BYTE_PTR, [4, EDX], 1)` | Immediate value is treated as DWORD by default, so we need to specify `BYTE_PTR` here. |
`mov word ptr ds:[e010e0], ff00h` | `MOV(WORD_PTR, [0xE010E0], 0xFF00)` | Similar situation as above. The presence of `WORD_PTR` also adds the Instruction prefix required. |
`movzx ebx, al` | `MOVZX(EBX, AL)` | |
`lock inc dword ptr fs:[ecx + eax]` | `LOCK(INC(FS, [ECX, EAX]))` | Using the `FS` prefix to indicate the fs segment for the memory location, while locking the access to other operations. |

The necessity for placeholders is not only limited to searching patterns.
While constructing new code for replacement/insertion with Exe.SetHex/Exe.AddHex functions, we often come across a similar scenario.
For example, consider the following code:
let newFunc =
PUSH([<memAddr>]) //push dword ptr [memAddr]
+ CALL(<funcAddr>) //call func#1
+ RETN() //retn
;
let [free, freeVir] = Exe.FindSpace(newFunc.byteCount() + 4);
The code is going to be placed at the `free` address found.
Here we will need placeholders for the first two statements because:
- 'memAddr' is the address after the `RETN`, which is not known beforehand.
- 'funcAddr' is an existing address, however the place it is being called from is not known yet.
  Since a direct `CALL` is essentially `E8 <addr difference>`, we have an issue here as well.

The solution: Use the Placeholder functions
To generate a 'filler pattern', we need only call the Filler function. It creates a quasi-hex string using `_` for unknowns.
All of the Instruction generators accept these filler patterns.
You can also use the functions <string>.isHex & <string>.byteCount with these.
let newFunc =
PUSH([Filler(1)]) //push dword ptr [memAddr]
+ CALL(Filler(2)) //call func#1
+ RETN() //retn
;
let [free, freeVir] = Exe.FindSpace(newFunc.byteCount() + 4);
Since we do not need to manually handle the pattern generated by the Filler function, I am not going into detail about it.
Along with the index (for identification), we can also specify the byte size of the pattern required (as the second argument).
If no byte size is provided then we get a 4-byte filler pattern.
Now that the filler patterns have served their purpose it is time to substitute them with their actual values.
Based on how the pattern was used there are 2 functions that we can use.
- For a simple substitution we make use of the SwapFillers function. It basically swaps the filler patterns with the values provided.
  For e.g.

      let code =
          PUSH(Filler(1))
      +   PUSH(Filler(2,1))
      ;
      .... //some statements later
      code = SwapFillers(code, { 1: freeVir + 4, '2,1': 40 });

  As shown, if you need to specify both index & byte size then use a string.
- A more interesting usage of fillers is when you need to specify call or jump targets.
  For e.g.

      let code =
          PUSH(5)            //push 5
      +   CALL(Filler(1))    //call funcA
      +   MOV(EDX, EAX)      //mov edx, eax
      +   PUSH(10)           //push 0Ah
      +   CALL(Filler(1))    //call funcA
      ;

  Here we are using index `1` to represent `funcA`. But we can't substitute it directly with the address of `funcA`.
  Instead, we need the difference between the `funcA` address and the location where it is called from.
  Calculating this value manually is a hassle, which is where the SetFillTargets function comes into play.
  It will do the calculation for us and we simply need to specify the starting address of the code and the target address for each of the indices.
  For the above example it would look like
      code = SetFillTargets(code, {
          'start': refAddr, //address of push 5
          1: funcAddr
      });

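For reference, the manual calculation that SetFillTargets spares us can be sketched as follows. This is not the tool's implementation; `toLE32` and `relCall` are hypothetical helpers and the addresses are made up. A direct `CALL` is 5 bytes long (`E8` plus a 32-bit value) and the stored value is measured from the end of the instruction, so it is `target - (callAddr + 5)`, written in little-endian order.

```javascript
// Hypothetical helpers, NOT part of WARP: they show the arithmetic behind a direct CALL.
// Formats a 32-bit value as 4 little-endian hex bytes.
function toLE32(value) {
    const v = value >>> 0; // wrap negative differences into two's complement
    const out = [];
    for (let i = 0; i < 4; i++) {
        out.push(((v >>> (8 * i)) & 0xFF).toString(16).toUpperCase().padStart(2, '0'));
    }
    return out.join(' ');
}

// Encodes 'call targetAddr' for an instruction located at callAddr.
function relCall(callAddr, targetAddr) {
    return 'E8 ' + toLE32(targetAddr - (callAddr + 5));
}

// Made-up addresses for the example above
const refAddr = 0x401000;  // address of 'push 5'
const funcAddr = 0x402000; // address of funcA

// push 5 takes 2 bytes, so the first CALL sits at refAddr + 2
console.log(relCall(refAddr + 2, funcAddr));  // E8 F9 0F 00 00
// the second CALL sits 2 + 5 + 2 + 2 = 11 bytes into the code
console.log(relCall(refAddr + 11, funcAddr)); // E8 F0 0F 00 00
```

Note that the two calls to the same function end up with different bytes, which is why a plain substitution is not enough here.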
Another use case for SetFillTargets is when you have conditional short jumps.
For e.g.
let parts = [
    CMP(ECX, EAX)           //cmp ecx, eax
+   JNE(Filler(1,1))        //jne _next
+   PUSH(ECX)               //push ecx
+   MOV(ECX, refOffset)     //mov ecx, <refOffset>
+   CALL(Filler(2))         //call funcA
+   TEST(EAX, EAX)          //test eax, eax
+   JZ(Filler(1,1))         //jz _next
+   POP(EBP)                //pop ebp
+   RETN()                  //retn
,
    MOV(ECX, EAX)           //mov ecx, eax ; _next
    //... more code but you get the point
];
For the `JZ` & `JNE` the target is the same and it is within a few bytes (hence 1 byte fillers).
In this case we can use SetFillTargets as shown below.
let code = SetFillTargets(parts, { '1,1': parts.byteCount(0) })
Since the calculation is relative rather than absolute this simply works.
In case you are confused, think of it this way.
Assume that the code starts at the address `0`.
Since `_next` comes right after parts[0], its address is `parts.byteCount(0)`.
Now you know why we split up the code into parts.
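As a sanity check of the relative calculation, here is a small sketch that works out the two 1-byte jump values by hand. `shortJumpDisp` is a hypothetical helper, not part of WARP, and the instruction sizes in the comments are the usual 32-bit encodings, stated here as assumptions. A short conditional jump is 2 bytes long and its value is measured from the end of the jump.

```javascript
// Hypothetical helper, NOT part of WARP: value stored in a 2-byte short jump.
function shortJumpDisp(jumpOffset, targetOffset) {
    const disp = targetOffset - (jumpOffset + 2);
    if (disp < -128 || disp > 127) {
        throw new RangeError('target is too far for a 1 byte displacement');
    }
    return disp & 0xFF; // two's complement byte
}

// Assumed sizes of the instructions in parts[0], in order:
// cmp(2) jne(2) push(1) mov r32,imm32(5) call(5) test(2) jz(2) pop(1) retn(1)
const sizes = [2, 2, 1, 5, 5, 2, 2, 1, 1];
const nextOffset = sizes.reduce((sum, n) => sum + n, 0); // offset of _next, i.e. size of parts[0]

const jneOffset = sizes.slice(0, 1).reduce((sum, n) => sum + n, 0); // after the cmp
const jzOffset = sizes.slice(0, 6).reduce((sum, n) => sum + n, 0);  // after the test

console.log(nextOffset);                           // 21
console.log(shortJumpDisp(jneOffset, nextOffset)); // 17 (11h)
console.log(shortJumpDisp(jzOffset, nextOffset));  // 2  (skips pop + retn)
```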
Building on our initial example, we get :
let newFunc =
PUSH([Filler(1)]) //push dword ptr [memAddr]
+ CALL(Filler(2)) //call func#1
+ RETN() //retn
;
const csize = newFunc.byteCount();
let [free, freeVir] = Exe.FindSpace(csize + 4);
newFunc = SwapFillers(newFunc, {1: freeVir + csize}); //location after the RETN
newFunc = SetFillTargets(newFunc, {2: funcAddr}); //for the CALL
//the ALLZRO is not mandatory since we have already allocated the space.
Exe.AddHex(free, newFunc + ALLZRO);
Modules in WARP help to organize variables, constants and functions under one roof and provide a Singleton object as the root for all of them.
Some key points to note:
- To define modules you need to make use of the suffix `.mjs` for script files.
- The `.mjs` file needs to define the name of the module using a comment of the form

      // MODULE_NAME => <name>

  This is the Singleton Object that gets created after the file is loaded.
- MJS files are automatically loaded by WARP before any of the QJS & EJS files. This helps to access the singleton and its members anywhere freely in the QJS scripts.
- MJS files get loaded exactly once by the tool irrespective of whether the client or the file itself got changed.
- Any value defined as a `const` in a module remains a constant throughout the tool's execution time.
- All functions, variables, constants etc. that need to be accessed outside of the Module need to be `export`ed, just like in regular JS.
  Everything else is considered local to the module.
- All of the `export`ed members can be accessed only through the defined Singleton. There is no way to import them.

Array stuff
- Prefer to use the `for of` loop when you just need to perform some action for all the values in an array.

      for (const val of arr) { ... }

- If you also need the index then use the `forEach` member function.
  One more difference here is that instead of a `continue` statement you need to use `return`.

      arr.forEach( (val, index) => {
          if (<condition>) {
              return; //equivalent of 'continue'
          }
          ....
      });

- As you saw, one caveat with `forEach` is that you have no way to stop the iterations mid-way.
  If you need to break in between, then use the `find` member function instead.

      arr.find( (val, index) => {
          if (<some condition>) {
              return true; //equivalent of 'break'
          }
          if (<other condition>) {
              return false; //equivalent of 'continue'
          }
      });

  The `find` function returns the element for the stopped iteration, but saving it is up to you.
- If instead, you only want to keep track of the iterations which got completed successfully, you can use the `filter` function.

      let result = arr.filter( (val, index) => {
          if (<cannot use this>) {
              return false; //equivalent of 'continue'
          }
          return true; //indicate successful iteration
      });

  Conversely, you can swap the `true` and `false` to keep track of the iterations that failed.

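The snippet below puts the four loop styles side by side on a small made-up array, so the effect of each `return` is easy to see. Only standard array functions are used; the data and variable names are purely illustrative.

```javascript
const sizes = [4, 0, 7, 12, 3];

// for...of : plain walk over every value
let total = 0;
for (const val of sizes) {
    total += val;
}

// forEach : index is available, 'return' skips to the next element
const seen = [];
sizes.forEach((val, index) => {
    if (val === 0) {
        return; //equivalent of 'continue'
    }
    seen.push(`${index}:${val}`);
});

// find : returning true stops the iteration and yields that element
const visited = [];
const firstBig = sizes.find((val) => {
    visited.push(val);
    return val > 5; //true acts as 'break', false as 'continue'
});

// filter : collects the elements whose iteration returned true
const usable = sizes.filter((val) => val !== 0 && val < 10);

console.log(total);    // 26
console.log(seen);     // [ '0:4', '2:7', '3:12', '4:3' ]
console.log(firstBig); // 7
console.log(visited);  // [ 4, 0, 7 ]
console.log(usable);   // [ 4, 7, 3 ]
```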
.... to be continued