|Informative Information for the Uninformed|
Next: Problems with new approach Up: Loop Detection Previous: Problems with Natural Loop   Contents
The reader has seen how to detect dominators within a CFG and how to use that as a component to find natural loops. The previous chapter described why natural loop detection was flawed when trying to detect irreducible loops. For binary auditing, the tool will need to be able to pick up all loops and then let the user deduce whether or not the loops are interesting. This chapter will introduce the loop algorithm used in the IDA plug-in to detect loops.
To come up with an algorithm that was robust enough to detect both loops in the irreducible and reducible loop categories, the author decided to modify the previous definition of a natural loop. The new definition reads "a loop can have multiple entry points and at least one link that creates a cycle." This definition avoids the use of dominators to detect loops in the CFG.
The way this alternative algorithm works is by first making a call
to the is_reference_to(ea_t to, ea_t ref) function. The
function is_reference_to will determine if there is a
reference from the ea_t specified by ref to the parameter
to. This check within the loop detection algorithm
determines if there is a backedge or link that would complete a
loop. The reason this check is done first is for speed. If there is
no reference that would complete a loop then there is no reason to
call is_path_to, thus preventing unnecessary
calculations. However, if there is a link or backedge, a call to the
overloaded function is_path_to(ea_t from, ea_t to) is
used to determine if the nodes that are being examined can even
reach each other. The is_path_to function simulates all
possible code execution conditions by following all possible edges
to determine if the flow of execution could ever reach parameter to
when starting at parameter from. The function
is_path_to(ea_t from, ea_t to) returns one
(true) if there is indeed a path going from from
to to. With both of these functions returning one, it can be
deduced that these nodes are involved in the loop.