Permutation conditions
----------------------
Here i'm trying to define conditions, when it is possible to change order
of some consecutive x86 instructions and instruction blocks,
i.e. swap them, but keep program working the same.
Imagine that we have "ADD EAX, EBX" which is followed by "MOV ECX, EDX",
and we know that both instructions are not the destination of some JMP, CALL,
or any other code/data/algorithmic reference.
In such a case they can be swapped with each other.
Now lets examine conditions under which such a trick can be performed
for some arbitrary consecutive instructions.
1. ONE instruction
Let it consists of:
operation code, source object set, destination object set.
Object set is a bitset of properties, instruction deal with:
common registers: EAX,ECX,EDX,EBX,ESI,EDI,ESP,EBP (8 bits)
flags (1 bit)
memory reference (1 bit)
We will enclose sets into { } brackets,
as such full set is {EAX,EBX,ECX,EDX,ESI,EDI,ESP,EBP,flags,memory}
and empty set is {}.
In binary notation it may look just like a 10-bit number.
Memory reference means that at least any byte in memory can be read/written
by our instruction.
We dont distinguish memory addresses, because of the following:
if there are two instructions,
first one accessing memory at [eax],
second one accessing memory at [ebx],
and we dont know if eax/ebx are equal or not (or 4-byte ranges intersects);
then, it is just a memory reference for us,
and we only know that it could be the same variable.
Other properties, such as EIP, segment-, FPU-, MMX-, SSE-registers,
are not considered here, since limited application of our theory.
We just assume, that no one instruction (except maybe NOP) can be mixed with
INT, RET, CALL, JMP, JXX, or FPU/MMX instructions.
"Source object set" is a set of objects to be read.
"Destination object set" is a set of objects to be written/modified.
Any destination object supersedes source object.
I mean that in "ADD EAX, EBX" instruction, EBX is source object,
and EAX is both source and destination object,
but we say that EAX is destination object only,
and we never say that EAX is source;
however, we always keep in mind that any destination object
can be also (indirectly) source object.
But, if we have "ADD ECX, ECX" instruction, we MUST include 2nd (src) ECX
into source object set.
Lets consider some typical instructions:
INSTRUCTION DESTINATION OBJECT SET SOURCE OBJECT SET
nop
mov R1, R2 R1 R2
cmp R1, R2 flags R1, R2
add R1, R1 R1, flags R1
adc R1, R2 R1, flags R2, flags
mov R1, [R2+R3] R1 R2, R3, memory
add R1, [R2+R3] R1, flags R2, R3, memory
sub [R1+R2], R3 flags, memory R1, R2, R3
test [R1+R2], R3 flags R1, R2, R3, memory
inc R1 R1, flags
lea R1, [R2+R3] R1 R1, R2
xchg R1, [R2] R1, memory R2
push R1 ESP, memory R1
pop R1 ESP, R1 memory
cld flags
lods ESI, EAX flags, memory
rep lods ESI, EAX, ECX, flags flags, memory
stos EDI, memory EAX, flags
rep stos EDI, ECX, memory, flags EAX, flags
rep cmps ESI, EDI, ECX, memory, flags memory
2.1. TWO instructions
let they look as
CMD1 DST1, SRC1
CMD2 DST2, SRC2
where DST1, SRC1, DST2, SRC2 are source/destination object sets.
Any of these sets could be empty.
We also assume that both instructions are already filtered/checked,
i.e. no one of them is JMP, CALL, RET, etc.
So, we have 4 sets, and we have 6 relations between
each 2 of these sets. {C(4,2) = 4!/2!*(4-2)! = 6}
2.2. Simple conditions
Each of six relations is bitwise AND operation on the two bitsets, returning
logical ZERO or NON-ZERO. In other words, each two sets can intersect or not.
#define INTERSECT(bitset_X, bitset_Y) ((bitset_X & bitset_Y) != 0)
Here is 3 conditions, when it is possible to swap two instructions:
(DST1 & DST2) == 0 [1]
(DST1 & SRC2) == 0 [2]
(DST2 & SRC1) == 0 [3]
Other 3 (unused) relations looks as following:
(DST1 & SRC1) == whatever [4]
(DST2 & SRC2) == whatever [5]
(SRC1 & SRC2) == whatever [6]
Example:
add eax, ebx # cmd1=ADD dst1={EAX,flags} src1={EBX}
sub ebx, ecx # cmd2=SUB dst2={EBX,flags} src2={ECX}
[1] == FALSE {EAX,flags} intersects {EBX,flags}
[2] == TRUE {EAX,flags} doesnt intersects {ECX}
[3] == FALSE {EBX,flags} intersects {EBX}
as such, these two instructions can not be swapped.
2.3. Advanced conditions
There exists some advanced conditions, which requires more analysis.
In such a cases, some of [1], [2], [3] conditions can be ignored.
For example:
(CMD1 == CMD2) && (DST1 == DST2) && [2] && [3] [7]
or
#define GROUP_1(cmd) ((cmd == ADD) || (cmd == SUB))
GROUP_1(CMD1) && GROUP_2(CMD2) && (DST1 == DST2) && [2] && [3] [8]
For example (below), [1] is FALSE, but since [8], the following
instructions can be swapped:
add eax, ecx
sub eax, edx
Now, imagine, that we have the following commands:
mov [ebp+4], eax
mov [ebp+8], ecx
Such a situation should be handled in a special way;
otherwise our theory will grow to contain lists of variables in the
object sets; also this will require checking for intersection
of memory ranges & etc.
Except that all, we can ignore condition [1], in cases when instructions
are followed by code which doesn't uses DST1 and DST2 object sets.
But this will require much more analysis, and i think it is not optimal.
3. From single instruction to block of instructions
Imagine that we have these instructions (following each other):
[a] add eax, ecx
[b] adc eax, edx
[c] add ebx, ecx
[d] adc ebx, edx
[e] sub esi, edi
We can not swap [a] and [b], and we can not swap [c] and [d].
But we can swap [ab] and [cd].
As such, we need some entity to replace consecutive instructions with.
4. ONE block of N instructions
CMD1 DST1, SRC1
CMDi DSTi, SRCi
CMDN DSTN, SRCN
If we will consider this block as a single instruction,
how to calculate summary SUM_DST and SUM_SRC object sets?
I.e. sets of objects which are read and written by that code block.
Simplest way is to do BITWISE OR for all SRC/DST[i]; it is correct,
but then the we will lose some possiblities in the future,
as shown in the following example:
mov eax, ebx
add eax, ecx
adc edx, eax
here, SUM_SRC, if calculated as SRC1|SRC2|SRC3, is equal to
EBX|ECX|EAX, however it is not necessary to include EAX into that set,
since it is initialized inside of this block, but not passed into.
So, algorithm of calculating SUM_SRC and SUM_DST is the following:
SUM_SRC = 0
SUM_DST = 0
for(i = 0; i < N; i++) # in the example shown above (N=3):
{ # i = 0 1 2
SUM_SRC |= SRC[i] & (~SUM_DST) # SUM_SRC = {EBX} {EBX,ECX} {EBX,ECX}
SUM_DST |= DST[i] # SUM_DST = {EAX} {EAX} {EAX,EDX}
}
5. TWO BLOCKS
Working with blocks is the same as working with single instructions,
except that SUM_SRC and SUM_DST should be calculated, as it were
shown above.
You can swap two blocks, or block and single instruction, it doesn't matter.
6. How to find out object sets for some given instruction?
Probably, using something like ADE32 and some additional analysis.